Paul Wang
Solution-Soft
2345 North First Street, Suite 210
San Jose, CA 95131
pwang@solution-soft.com
408.346.1400
You know your company's critical assets are the files on its systems: customer profiles, e-mails, payrolls, order entries, inventories, accounts receivable, and so on. Without having all your information readily available, you won't stay competitive for long. So one thing is certain: you can't let your systems run out of disk space!
On average, 30 to 60 percent of the disk space on a typical system is wasted: some free space is too small to be useful, some disk space is allocated but never utilized, and some disk space is allocated and utilized but will never be accessed again.
This paper explains how disk space is wasted and how wasted disk space can be reclaimed automatically and systematically. Finally, it discusses the correct way to add new disks when the time comes.
Companies worldwide rely more and more on information systems to run their operations. This has led to an explosion of the data stored on computer systems. There is no avoiding it: the supply of disk space rarely keeps up with the demand for it.
On one hand, new technologies such as Data Warehousing, Workflow Automation, Document Imaging, graphics and multimedia all use large amounts of disk space and ensure that storage requirements will continue to grow at unprecedented rates. On the other hand, the price-per-megabyte of disk space continues to drop, providing more disk space for the same price and leading applications to use more data and keep it around longer. Finally, government and agency auditing requirements often mandate that 5, 7, or 15 years of operational data be kept available. This requirement further magnifies the data explosion problem.
Adding more disk space is just a temporary solution to a permanent problem. Disk space usage will continue to grow, consuming hardware budgets and becoming more difficult to manage and more time-consuming to back up adequately.
A big part of the solution is to manage your disk space better, and automatically. Once those automated procedures are set up, you can be sure your MPE/iX disk space will continue to be utilized efficiently.
MPE/iX systems allocate disk space in extents. An extent is a chunk of contiguous disk space. The size of an extent is always in multiples of 4 kbytes and ranges from 4 kbytes to 4 gigabytes. A file may consist of zero, one, or an unlimited number of extents.
Due to the file system allocation algorithm and the way that some subsystems allocate disk space, some free space is too small to be useful and some disk space is allocated but never utilized. In addition, just by the nature of the data, some disk space is allocated and utilized but will never be accessed again. All these types of disk space are wasted, and they add up quickly. It is intriguing to note that, on average, 30 to 60 percent of the disk space on an MPE/iX system is wasted.
The good news is that most of the wasted disk space can be reclaimed automatically. Automatically means you can "set and forget": spend some time setting it up once, and you can forget about it and count on the disk space savings continuing.
Let’s examine how disk space is wasted and how certain techniques can reclaim space.
Since extents are of different sizes, as extents are allocated and deallocated, the average size of free space tends to become smaller and very tiny fragments of free space may be created. This is called disk fragmentation (see Figure 1). Severe disk fragmentation not only decreases performance, but also wastes disk space. Any free fragment less than 64 kbytes is wasted, since the file system tries to allocate disk space in at least 64 kbyte chunks. It is common to see tens of megabytes of such free space wasted on user systems.
Figure 1. Disk defragmentation creates bigger and more useful free fragments
Free fragments between 64 and 512 kbytes can be utilized by the file system, but they are less useful. When an allocation request is bigger than 16 megabytes, the file system makes sure each extent allocated is at least 512 kbytes. On a severely fragmented system, this can cause premature out-of-disk-space errors, where the allocation fails even though the system has far more total free space than the request requires. This is one of the most common mysteries in the user community. In such cases, the system is not out of free space. Rather, it is out of free fragments bigger than 512 kbytes.
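To make the arithmetic concrete, the following Python sketch classifies a list of free fragments by the size thresholds described above and shows how a large allocation can fail even when plenty of total free space exists. The fragment sizes are made-up sample values for illustration, not output from any MPE/iX utility.

# Classify free-space fragments by how useful they are to the allocator,
# using the size thresholds described above (all sizes in kbytes).
# The fragment list is made-up sample data for illustration only.
free_fragments_kb = [12, 48, 60] + [300] * 40 + [450] * 30 + [768, 1024]

wasted       = sum(f for f in free_fragments_kb if f < 64)          # too small to be used
limited_use  = sum(f for f in free_fragments_kb if 64 <= f < 512)   # usable, but not for large requests
fully_usable = sum(f for f in free_fragments_kb if f >= 512)        # usable for any request

total = sum(free_fragments_kb)
print(f"total free space:      {total:6d} kbytes")
print(f"wasted (< 64 kbytes):  {wasted:6d} kbytes")
print(f"64-512 kbytes only:    {limited_use:6d} kbytes")
print(f">= 512 kbytes:         {fully_usable:6d} kbytes")

# A request larger than 16 megabytes is built only from extents of at
# least 512 kbytes, so it can fail even though 'total' exceeds the
# request size (a simplified model of the premature failure described above).
request_kb = 16 * 1024
print("large request can be satisfied:", fully_usable >= request_kb)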
The solution is to proactively defragment the disks. Rather than waiting for the fragmentation to accumulate and waste disk space, it is better to keep the disks clean at all times. A general recommendation is to defragment all disks at least once a week. This can be easily automated into daily or weekly batch processing.
It is common to have wasted disk space beyond the file EOF due to file system dynamic allocation (extent fault) and EOF cutback. Since the file system has no idea where the EOF will end up, it just allocates disk space as it goes. The EOF usually ends up somewhere in the middle of the last extent, so the potential wastage per file is 4 to 512 kbytes. Users can also cut back the EOF, so that any disk space allocated between the old and the new EOF is wasted (see Figure 2). For example, files opened with write access will reset the EOF to zero. If a huge work file or scratch file (with a million or so records) is reused this way with only a tiny amount of data (e.g., ten records), lots of valuable disk space is wasted; the potential wastage per file depends on the file limit. Wasted disk space adds up really fast. In my experience, it is not uncommon to see thousands or even millions of sectors wasted this way on a production system!
Figure 2. Truncation reclaims wasted disk space beyond EOF
File truncation is probably the easiest and most effective way to reclaim wasted disk space. However, it does have the side effect of increased disk fragmentation. It is recommended that users truncate all files periodically. The best way to achieve this is to incorporate file truncation and disk defragmentation (in this order) into nightly or weekly batch jobs. Thus, disks will be cleaned automatically.
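As an illustration of how quickly this adds up, the small Python sketch below totals the space allocated beyond EOF for a handful of files. The file names and sector counts are made-up examples, not real LISTF output; MPE/iX reports disk space in 256-byte sectors.

# Conceptual sketch: total up disk space allocated beyond EOF.
# Each tuple is (file name, sectors allocated, sectors in use up to EOF);
# the figures are made-up examples, not real LISTF output.
SECTOR_BYTES = 256  # MPE/iX disk space is reported in 256-byte sectors

files = [
    ("WORKFILE", 1_000_000,      40),   # huge scratch file reused with a few records
    ("ORDERS",     120_000, 118_500),
    ("TEMPRPT",     25_000,   3_200),
]

total_wasted = 0
for name, allocated, used in files:
    wasted = allocated - used
    total_wasted += wasted
    print(f"{name:10s} wasted {wasted:9,d} sectors beyond EOF")

mbytes = total_wasted * SECTOR_BYTES / 2**20
print(f"total: {total_wasted:,d} sectors (about {mbytes:.0f} megabytes) reclaimable by truncation")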
Regular truncation beyond EOF has no effect on IMAGE/SQL files, since the IMAGE/SQL subsystem never allocates any disk space beyond the EOF. However, any disk space allocated beyond the high-water mark is wasted, and this can add up really quickly.
If Dynamic Detail dataset eXpansion (DDX) is not enabled, all disk space allocated between the high-water mark and the file limit (or full capacity) will be wasted. Since users typically keep free space at 10 to 35 percent of capacity, the amount of wasted disk space for IMAGE/SQL datasets could be tremendous.
Enabling DDX with a moderate incremental size will dramatically reduce the amount of wasted space. Even though disk space allocated beyond the high-water mark for a DDX-enabled file is still wasted, it is limited to the designated incremental size (see Figure 3). Please note that Jumbo datasets do not support DDX. As a result, gigabytes worth of disk space may be wasted by just one Jumbo dataset.
Figure 3. Truncation of IMAGE/SQL datasets reclaims wasted disk space beyond the high-water mark
If you can tolerate a DBPUT occasionally producing an out of disk space error (the trade-off for not preallocating all disk space), then truncating all IMAGE/SQL files periodically to the high-water mark will reclaim the wasted disk space. This can be easily automated into weekly or monthly batch processing.
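The following Python sketch compares the space wasted beyond the high-water mark for a detail dataset with and without DDX. The capacity, high-water mark, increment, and media record size are assumed figures for illustration only.

# Illustrative comparison of space allocated beyond the high-water mark
# for a detail dataset with and without DDX. All figures are made-up.
MEDIA_RECORD_BYTES = 512          # assumed media record size for this example

capacity      = 2_000_000         # entries the dataset can hold (full capacity)
high_water    = 1_300_000         # highest entry number ever used
ddx_increment =   100_000         # entries added per dynamic expansion

# Without DDX the full capacity is preallocated, so everything between
# the high-water mark and capacity is wasted.
waste_no_ddx = (capacity - high_water) * MEDIA_RECORD_BYTES

# With DDX the dataset grows in increments, so the waste beyond the
# high-water mark is at most one increment.
waste_ddx = min(capacity - high_water, ddx_increment) * MEDIA_RECORD_BYTES

print(f"wasted without DDX: {waste_no_ddx / 2**20:7.1f} MB")
print(f"wasted with DDX:    {waste_ddx / 2**20:7.1f} MB")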
The amount of disk space allocated for infrequently used files is enormous. Just take a look at how many files have not been accessed for a month to verify this. The 20/80 rule is at work here: 80 percent of the processing is done on only 20 percent of the data. This means that 80 percent of your data is used rarely, if at all (see Figure 4).
Figure 4. Compress infrequently used data by rules to save disk space
Half, or possibly more, of this wasted disk space could be reclaimed if those infrequently used files were compressed. The actual compression ratio depends on the data. In general, MPE/iX files have a very high compression ratio, because the file system and database subsystems tend to pad blanks or zeros into big fixed-length records. In addition, many files are not filled to 100 percent of capacity. It is very common to see a 50 to 95 percent compression ratio for ASCII, KSAM, IMAGE/SQL, ALLBASE/SQL and Oracle files. On the other hand, program and library files are much harder to compress. Object code has fewer repeated patterns and less dead space; the compression ratio for those files is approximately 10 to 50 percent.
The best way to set up automatic compression is to establish compression policies for various accounts (e.g., based on last-accessed date), then compress files periodically and automatically in batch according to those policies. Compression not only saves disk space, but also speeds up backups, since there is less data to store. If your data is 50 percent compressed, then a full backup will be 50 percent faster! The same benefits apply to restore, reload and disaster recovery. If your data is 50 percent compressed, then a reload after a disk crash is 50 percent faster. That corresponds directly to less system downtime.
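As a sketch of what such a policy might look like, the Python fragment below selects files that have been idle longer than a per-account threshold and estimates the saving from an assumed compression ratio. The accounts, file records, thresholds, and 70 percent ratio are all illustrative assumptions, not measurements.

# Sketch of a simple age-based compression policy. File records, thresholds
# and ratios are illustrative; a real implementation would read them from
# the file system and an archiving tool's catalogue.
from datetime import date

POLICY_DAYS = {"SALES": 30, "FINANCE": 90}   # per-account idle threshold (assumed)
ASSUMED_RATIO = 0.70                         # assume 70 percent size reduction for data files

files = [  # (account, file, sectors, last accessed) -- made-up sample data
    ("SALES",   "ORDHIST",  800_000, date(1998, 1, 12)),
    ("SALES",   "CURRORD",  150_000, date(1998, 6, 1)),
    ("FINANCE", "GL1995",   600_000, date(1997, 11, 3)),
]

today = date(1998, 6, 15)
savings = 0
for account, name, sectors, last_access in files:
    idle_days = (today - last_access).days
    if idle_days > POLICY_DAYS.get(account, 60):
        savings += int(sectors * ASSUMED_RATIO)
        print(f"compress {name}.{account}: idle {idle_days} days")

print(f"estimated saving: {savings:,d} sectors "
      f"-- backups of this data shrink by roughly the same fraction")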
Of course, when users need to access those compressed files, they must be decompressed first. This can become a big management headache if it occurs frequently and a system manager, operator, or help desk person must be involved each time. The best solution is to utilize an online archiving tool that supports auto-decompression, so that compressed files are automatically decompressed as soon as they are accessed. Decompression is very fast (typically 1 to 2 megabytes per second); in most cases, users won't even notice they are accessing a compressed file.
The amount of disk space allocated for unused files is also quite large. There are probably many files that have not been accessed for more than a year.
Any disk space allocated for unused files is wasted. It is unrealistic to expect that all users will review and clean up unused files all the time. Unless global clean-up procedures are in place, those unused files will never go away. Global clean-up procedures may involve defining clean-up policies for various accounts or file types. After a defined period of idle time, owners of the files may be notified (preferably by e-mail, automatically). Finally, after a second period of idle time, those files may be cleaned up by purging and/or archiving to tape.
Certain types of files can be safely purged, such as temporary work files. For other files, it is best to archive (i.e., purge and store) them on tapes. If they are needed later, they can be restored easily.
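A minimal Python sketch of such a two-stage policy follows; the idle-time thresholds and the decision to purge temporary work files outright are illustrative assumptions.

# Sketch of the two-stage clean-up policy described above. Thresholds
# and the sample inputs are illustrative placeholders.
NOTIFY_AFTER_DAYS  = 180   # first idle period: warn the owner
ARCHIVE_AFTER_DAYS = 365   # second idle period: archive to tape and purge

def cleanup_action(idle_days, is_temp_work_file):
    """Return the action the periodic clean-up job should take."""
    if idle_days < NOTIFY_AFTER_DAYS:
        return "keep"
    if is_temp_work_file:
        return "purge"                        # safe to delete outright
    if idle_days < ARCHIVE_AFTER_DAYS:
        return "notify owner (e-mail)"
    return "archive to tape, then purge"

for idle, temp in [(90, False), (200, True), (200, False), (500, False)]:
    print(f"idle {idle:3d} days, temp={temp}: {cleanup_action(idle, temp)}")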
Most users just add a new disk whenever the amount of free disk space is low. With the automated techniques discussed in this paper, you will be able to save valuable disk space and delay new disk purchases. However, you will probably still have to add a new disk at some point. It is important to manage that process in order to avoid an I/O bottleneck!
The MPE/iX extent placement algorithm places new extents on the disk with the most free space. When a new disk is added, it will be THE place where new extents are allocated. Due to data locality, this new data tends to be the most active data. Worse yet, if any performance-critical databases are restored or reorganized, they will be allocated solely onto the new disk (see Figure 5). Clearly, the I/O demand on the new disk may easily exceed its bandwidth. As a result, system throughput and response time may suffer.
Figure 5. Adding a new disk can easily become an I/O bottleneck
It is important to populate the volume set in order to rebalance the disk allocation immediately after adding a new disk. This will not only prevent the new disk from becoming an I/O bottleneck, but also improve system performance by increasing I/O concurrency across all disks in the volume set.
Another potential performance problem is the I/O channel. When adding a new disk, make sure that the combined throughput of the disks and other devices on the channel does not exceed the channel bandwidth.
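A quick back-of-the-envelope check, sketched in Python below, is enough to catch an oversubscribed channel. The channel and per-disk throughput figures are assumptions for illustration only; substitute the numbers for your own hardware.

# Back-of-the-envelope check that a channel is not oversubscribed when a
# new disk is added. Throughput figures are assumptions for illustration.
CHANNEL_MB_PER_SEC = 20      # assumed usable channel bandwidth
DISK_MB_PER_SEC    = 5       # assumed sustained throughput per disk

def channel_ok(disks_on_channel):
    demand = disks_on_channel * DISK_MB_PER_SEC
    return demand <= CHANNEL_MB_PER_SEC, demand

for disks in (3, 4, 6):
    ok, demand = channel_ok(disks)
    print(f"{disks} disks -> {demand} MB/s demand, "
          f"{'fits' if ok else 'exceeds'} a {CHANNEL_MB_PER_SEC} MB/s channel")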
The data explosion will continue to put more pressure on storage hardware capacity. As a result, disk space usage will continue to grow, consuming hardware budgets, and becoming more difficult to manage and more time-consuming to back up.
A big part of the solution is to manage your disk space better, and automatically. By truncating files and defragmenting disks regularly, you can proactively keep your disks clean. The unused space allocated for IMAGE/SQL datasets can be huge; an automated truncation process will keep them in check. Finally, compressing and archiving files according to established archiving policies is the best way to save disk space and speed up backup and restore.
From my experience, saving 50 percent of disk space is very achievable if the automated techniques discussed here are utilized. I think you will be pleasantly surprised to find how easily disk space can be saved on a regular basis. Thank you for your interest in this topic, and please e-mail me with any interesting stories or experiences related to it.
Paul Wang, President and Founder of SolutionSoft Systems, Inc., is a recognized speaker on Year 2000 compliance and data storage management. Mr. Wang developed Time Machine, a Year 2000 testing tool, at the request of Hewlett-Packard. Solution-Soft's customers include Fortune 500 companies such as AT&T, Lucent, Ford, Boeing, and 3M. Mr. Wang is a renowned consultant on data storage management and has more than 12 years of experience with system utilities. Prior to Solution-Soft, Mr. Wang was a Software Architect at Hewlett-Packard, where two of his software designs were patented. He also represented HP at various industry standards organizations, such as OSF, SPIRIT and TPC. Mr. Wang holds a Master's degree in Computer Science from Stanford University.