        All Broken Up
        NTFS is designed to be efficient, but it isn’t foolproof. To avoid seriously fragmented disks, you’ll need careful planning and regular maintenance.
        
        
        By Michael Chacon
        July 01, 1999
We spend unconscionable amounts of money making sure 
        that Windows NT has all the resources it needs—and it 
        needs plenty. Tons of RAM, multiple fast processors, striped 
        RAID sets with fast, fat, and wide SCSI drives are all 
        important components of a speed machine. But there are other 
        things you can do to keep performance up to snuff without 
        blowing your budget. For example, one of the most effective 
        yet unexciting tasks you can regularly perform on your 
        NT machines is to defragment the disk drives.
      In many cases you probably already have a disk defragmentation 
        program installed. If you don’t, you need to correct that 
        mistake immediately. Regardless of your current situation, 
        I’m going to discuss some of the low-level details of 
        why defragmentation is important and how that process 
        is accomplished. 
      Inside NTFS
      Before I discuss fragmentation solutions, let’s review 
        how data is allocated to disks in NT. I hope all your 
        volumes are formatted with NTFS rather than FAT, for reasons 
        I’ve outlined in these pages many times. As with most 
        file systems, NTFS is contained in a volume, which is 
        a logical partition on a physical disk—and, of course, 
        there can be multiple partitions on one disk. Unlike FAT, 
        which contains areas specifically formatted for use by 
        the various components of the file system, NTFS stores 
        all system files, including the Master File Table (MFT) 
        and the bootstrap file, as ordinary files.
      As with the FAT file system, NTFS uses clusters to allocate 
        disk space. The size of the cluster is determined during 
        the format process and can range from 512 bytes up to 
        64K. The default cluster size for most disks today is 
        4K to support large partitions, avoid wasting disk space, 
        and minimize disk fragmentation (see Table 1). Also, keep 
        in mind that NTFS file compression isn’t supported on 
        any partition with a cluster size greater than 4K.
      
         
          Table 1. Default cluster sizes in NTFS

            NTFS volume size                 Default cluster size
            Up to 512M                       512 bytes (or the sector size, if greater than 512 bytes)
            Larger than 512M and up to 1G    1K
            Larger than 1G and up to 2G      2K
            Larger than 2G                   4K
      
      Here’s an extreme example to illustrate the point. Let’s 
        say you have 5,000 files that are each 2K in size. On 
        a partition with 2K clusters, they’d consume 10M of disk 
        space (5,000 x 2K = 10M), with each file fitting neatly 
        into a single cluster. Theoretically, 
        there wouldn’t be any wasted space or fragmentation. If 
        you copied those same files to a partition with 64K clusters, 
        they’d be allocated 320M (5,000 x 64K = 320M) of 
        disk space with no cluster fragmentation but with massive 
        internal fragmentation, otherwise known as wasted space. 
        NTFS doesn’t concern itself with sector sizes and uses 
        a minimum of one complete cluster for each file, hence 
        the wasted space in the example. The sector size for hard 
        drives is determined when the drive is originally low-level 
        formatted and the tracks on the disk are broken up into 
        sectors. 
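      The arithmetic behind that example is easy to check. Here’s a minimal 
        Python sketch (purely illustrative arithmetic, not anything NTFS itself 
        exposes) that rounds each file up to whole clusters and totals the space 
        actually allocated:

            def allocated_bytes(file_size, cluster_size):
                """Space a file actually occupies: whole clusters only."""
                clusters = -(-file_size // cluster_size)   # ceiling division
                return clusters * cluster_size

            files = [2 * 1024] * 5000                      # 5,000 files of 2K each

            for cluster_k in (2, 64):                      # 2K vs. 64K clusters
                cluster = cluster_k * 1024
                total = sum(allocated_bytes(f, cluster) for f in files)
                wasted = total - sum(files)
                print(f"{cluster_k}K clusters: {total // 1024}K allocated, "
                      f"{wasted // 1024}K wasted")

      On the 2K-cluster partition this reports 10,000K allocated with nothing 
        wasted; on 64K clusters, 320,000K allocated, with 310,000K of it wasted 
        as internal fragmentation.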
      Avoiding Disk Fragmentation
      On the other hand, if you use the same two partitions 
        and one 10M file, you’ll have something else to consider: 
        fragmentation. With the 64K cluster size, the 10M file 
        will be allocated just under 160 clusters, while the 2K 
        cluster partition would allocate a whopping 5,000 clusters. 
        The more clusters needed to store a file, the more likely 
        the clusters won’t be located contiguously on the partition. 
        This lack of contiguity means that the read/write head 
        of the physical disk has to move more often to access 
        any given file.
      Because the read/write operation of a disk drive is the 
        slowest point in the disk access process, keeping file 
        fragmentation to a minimum can play a significant role 
        in system performance. When reading a sequential file 
        in one physical read operation, the system can use read-ahead 
        to extract more of the file’s data and keep it in cache 
        for later retrieval. Extracting this data from cache the 
        next time it’s needed is much faster than performing another 
        physical read. Obviously, in the real world, systems don’t 
        have uniformly sized files, but you get the point. That 
        lack of uniformity in file size is why choosing a cluster 
        size is, on its own, a poor strategy for avoiding fragmentation.
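      To see how quickly cluster counts grow for a single large file, the same 
        back-of-the-envelope arithmetic applies. This small sketch (again, 
        illustrative only) computes how many clusters a 10M file needs at each 
        cluster size:

            file_size = 10_000 * 1024                  # one 10M file (10,000K)

            for cluster_k in (2, 64):
                cluster = cluster_k * 1024
                clusters = -(-file_size // cluster)    # ceiling division
                print(f"{cluster_k}K clusters: {clusters} clusters needed")

      That comes out to 5,000 clusters at 2K and 157 (just under 160) at 64K, 
        which is where the opportunity for scattering comes from.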
      Making things more complex for us but more flexible for 
        the file system, there are two ways of numbering clusters within 
        NTFS: Logical Cluster Numbers (LCNs) and Virtual Cluster 
        Numbers (VCNs). LCNs map directly to a physical disk address: 
        multiply the cluster size of the partition by a given sequential 
        LCN and you get an offset, measured in bytes, that the disk 
        driver uses to read and write data—very low-level stuff. VCNs 
        map individual files to LCNs using a series of sequential numbers, 
        incremented for as many clusters as needed to contain the file. 
        NTFS tracks a file’s contents by VCN and then maps each VCN to 
        an LCN to place the data on the disk. 
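      A rough model in Python shows the idea (the mapping table below is made 
        up for illustration; the real mapping lives in the file’s MFT record, 
        not a Python dictionary): each virtual cluster of a file points at a 
        logical cluster, and the logical cluster number times the cluster size 
        gives the byte offset the driver actually reads or writes.

            CLUSTER_SIZE = 4 * 1024                    # 4K clusters, the common default

            # Hypothetical VCN-to-LCN map for one file stored in two runs.
            vcn_to_lcn = {0: 1204, 1: 1205, 2: 5602, 3: 5603}

            def physical_offset(vcn):
                """Byte offset on the volume for one of the file's virtual clusters."""
                return vcn_to_lcn[vcn] * CLUSTER_SIZE

            for vcn in sorted(vcn_to_lcn):
                print(f"VCN {vcn} -> LCN {vcn_to_lcn[vcn]} -> "
                      f"byte offset {physical_offset(vcn):,}")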
      Consistency is Key
      The core of any NTFS volume is the Master File Table 
        (MFT), which is implemented as a file containing an array 
        of 1K records, regardless of sector size; each record 
        represents a file within the partition. Each 1K 
        segment of the array contains attributes for the file, 
        such as the security descriptor, filenames, timestamp, 
        and interestingly enough, the data. I call this interesting 
        because storing the data as just another file attribute 
        helps give NTFS a consistent architecture. If the data 
        fits within the 1K record, it’s stored in the MFT and 
        referred to as a resident attribute. Obviously very few 
        files are this small, so there’s also a nonresident attribute, 
        otherwise referred to as a “run,” that’s stored in the 
        next available clusters. As a file grows in size, more 
        runs are allocated to contain the additional data. Although 
        this process is usually associated with data files, any 
        attribute that can grow is handled in the same manner. 
        For example, if many users have permissions to files individually 
        rather than through group membership, the Discretionary 
        Access Control Lists (DACLs) can grow too large to remain 
        resident, in which case they’ll be allocated in a run.
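      As a mental model only (the record-space figure and function below are 
        simplified stand-ins, not the actual on-disk layout), the decision works 
        like this: an attribute stays resident while it fits in the MFT record 
        and spills out into a run of clusters when it doesn’t.

            RECORD_SPACE = 700          # rough free space in a 1K MFT record (illustrative)
            CLUSTER_SIZE = 4 * 1024

            def store_attribute(data):
                """Simplified description of how an attribute would be stored."""
                if len(data) <= RECORD_SPACE:
                    return {"resident": True, "bytes": len(data)}
                clusters = -(-len(data) // CLUSTER_SIZE)   # ceiling division
                return {"resident": False, "run_clusters": clusters}

            print(store_attribute(b"x" * 300))      # small: lives inside the MFT record
            print(store_attribute(b"x" * 50_000))   # large: allocated as a run of clusters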
      Another example of non-resident file attributes being 
        stored in runs is a directory with a large number of files. 
        Directories are listed in the MFT like other files except 
        that they have an index root attribute containing a list 
        of the files associated with the directory. If the index 
        of files can’t be contained in the MFT record, a run is 
        created to allocate the overflowing information in as 
        many clusters as necessary to contain the filenames and 
        their associated VCN-to-LCN mappings. Such a consistent 
        approach to treating all information as attributes and 
        any increasing information as runs helps NTFS remain flexible 
        as different data types are created for future applications. 
        Regardless of its source or destination, data is simply 
        stored in attribute streams. NTFS doesn’t need to be concerned 
        with data types—it leaves that issue to higher-level application 
        processes. 
      Metadata Files
      Along with the MFT is another set of files that complete 
        the NTFS structure: metadata files. These files use a 
        $filename naming convention, and each has a particular 
        function in the file system. During the NT boot process, 
        the kernel loads all the device drivers, including the 
        NTFS file system driver. During the volume mounting process, 
        the NTFS system driver looks for the $Boot file, which 
        contains the bootstrap code. The $Boot file is created 
        during the formatting process and is located at a specific 
        disk address. From $Boot, NTFS locates the physical disk address 
        of $MFT and uses its VCN-to-LCN information to find all of the 
        MFT’s file attributes and runs. 
        The first record in the MFT contains the attributes of 
        the MFT. In this manner the MFT first references itself, 
        then all other files in the partition. The second record 
        in the MFT contains the attributes of a partial copy of 
        the MFT, called $MFTMirr, which is a file placed in the 
        middle of the partition away from the MFT for redundancy 
        purposes. Because these are normal files, you can see 
        them with the DIR command (see Figure 1). 
      
         
          Figure 1. Metadata files use a $filename naming convention and can be 
            viewed using the DIR command.
      
      You can use the $MFTMirr file to locate the metadata 
        files if the MFT is somehow corrupt or missing. By implementing 
        the MFT as a normal file that references itself, NTFS 
        eliminates the need to locate it in any particular area 
        of the partition. This means that NTFS can relocate the 
        MFT file if it encounters a bad cluster or other disk 
        error. Two other interesting files are the $BadClus file, 
        which keeps a record of bad clusters on the disk; and 
        $Volume, which records the volume name, the NTFS version number, 
        and a corrupted-disk bit that, when set, means the volume requires 
        CHKDSK to be run against it. 
      
         
          Additional Information
          A great reference that delves into the NTFS internals even further is 
            David A. Solomon’s Inside Windows NT, Second Edition (Microsoft Press, 
            ISBN 1-57231-677-2). Chapter 9 covers NTFS.
      
      Protecting System Files
      One of the most compelling architectural benefits of 
        NTFS is its ability to provide transaction-based recovery. 
        This doesn’t extend to the user’s data files, but it does 
        protect the NTFS system files. This means that if the 
        system has a power failure or otherwise comes crashing 
        down, the partition will always be in a consistent state 
        and ready to offer a useful file system to the operating 
        system. Applications can also work to protect user data 
        by periodically flushing the cache to the same log file 
        the system uses.
      The transaction-based recovery process is managed by 
        the Log File Service (LFS), part of the NTFS device driver. 
        Each time an NTFS volume is mounted and then accessed 
        by an application, the partition goes through a recovery 
        process where unresolved I/O transactions are either completed 
        or rolled back to the last known consistent state, based 
        on information contained in a transaction log.
      To accomplish this, every five seconds the NTFS driver 
        writes a checkpoint record into a metadata file called 
        $Logfile, marking the entry of update records that are 
        copies of two tables of transaction information (see Figure 
        2). One is the dirty page table, which contains changes 
        to the file structure that haven’t been written to the 
        disk. The other is the transaction table, which is a record 
        of all disk transactions that are underway but haven’t 
        been completed.
      
         
          Figure 2. The $Logfile metadata file contains update records that are 
            copies of two tables of transaction information: file structure changes 
            that haven't been written to the disk, and disk transactions that are 
            underway but not complete.
      
      During the recovery process the LFS can either redo the 
        steps that make up a complete transaction or undo a partial 
        set of steps of an uncompleted transaction. The LFS knows 
        whether to redo or undo the transaction based on the existence 
        of a record that declares a transaction complete or, in 
        database terminology, “committed.” If there’s no record 
        declaring a transaction committed, the LFS will undo each 
        step recorded in the $Logfile in the reverse order of 
        operation to roll back the transaction. In either case, 
        the file structure will be in a consistent, usable state. 
        Transaction records are written to the log file whenever 
        the file system performs operations such as creating, deleting, 
        or renaming a file, setting security permissions, or making 
        any other change to file system attributes. 
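      The redo-or-undo decision is easier to picture with a toy write-ahead log. 
        The following is a bare-bones Python sketch, not the actual LFS record 
        format; the record fields and the “commit” marker are simplifications of 
        what’s described above.

            # Each log record: (transaction id, operation, undo information).
            log = [
                (1, "allocate cluster 5602 to file A", "free cluster 5602"),
                (1, "commit", None),                   # transaction 1 completed
                (2, "extend MFT record for file B", "shrink MFT record for file B"),
                # crash here: transaction 2 never committed
            ]

            committed = {txn for txn, op, _ in log if op == "commit"}

            # Redo every step of committed transactions, in forward order.
            for txn, op, _ in log:
                if txn in committed and op != "commit":
                    print("redo:", op)

            # Undo every step of uncommitted transactions, in reverse order.
            for txn, op, undo in reversed(log):
                if txn not in committed:
                    print("undo:", undo)

      Either path leaves the metadata in a state reached only through whole 
        transactions, which is exactly the consistency guarantee described above.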
      
      Next Month: Tools of the Trade
      As you can see, the NTFS environment is a busy and complex 
        place. Although the architecture of the file system is 
        designed to be efficient, the basis of the allocation 
        of disk space is still fundamentally at the cluster level. 
        Because of this, as files are deleted, expanded, and otherwise 
        altered, the MFT runs that keep track of the data attributes 
        can be scattered all over the disk in fragments that can 
        have a decided impact on I/O performance. Based on this 
        understanding of how NTFS functions, next month I’ll 
        discuss specific tools that help manage this problem, 
        and I’ll show how they actually work with NTFS.