Monday, 18 May 2015 00:00

NTFS Data Structure and Recovery Internals


The following are features of NTFS (New Technology File System):

  • Data Storage
    • NTFS uses a transaction-processing model for storing and accessing data.
    • All write operations are treated as atomic.
    • The operations have ACID properties: Atomicity, Consistency, Isolation, Durability.
  • Security
    • Based on the Windows NT security model.
    • Each open file is treated as an object with a security descriptor, which is stored on disk as part of the file.
    • Windows NT checks whether the process has access to the file before opening it.
  • Data Redundancy and Fault Tolerance
    • Fault tolerance is provided by the layered driver model of Windows NT (HAL, Hardware Abstraction Layer).
    • The NTFS driver communicates directly with the fault-tolerant hard disk driver, which also implements mirroring (RAID 1) and striping with parity (RAID 5).
  • Large Drives, Large Files
    • NTFS uses 64-bit cluster addresses, which allows it to address up to 2^64 clusters, each up to 64 KB in size.
    • Each file can contain up to 2^64 bytes of data.
  • Additional Features Include
    • Multiple data streams per file
    • Unicode file names
    • General indexing of file attributes
    • Dynamic remapping of data from defective clusters (bad-cluster remapping, hot fix)
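
The limits listed under "Large Drives, Large Files" can be checked with a few lines of arithmetic. This is only a sketch of the theoretical figures implied by 64-bit cluster addresses and 64 KB clusters; the constant names are made up for the illustration, and real Windows releases impose lower practical limits.

```python
# Theoretical limits implied by 64-bit cluster addresses.
# All names here are illustrative, not part of any NTFS API.
MAX_CLUSTERS = 2**64              # distinct 64-bit cluster addresses
MAX_CLUSTER_SIZE = 64 * 1024      # 64 KB per cluster, in bytes
MAX_FILE_SIZE = 2**64             # bytes per file

max_volume_bytes = MAX_CLUSTERS * MAX_CLUSTER_SIZE


def as_power_of_two(n: int) -> str:
    """Write n as 2**k; n must be an exact power of two."""
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError("not a power of two")
    return f"2**{n.bit_length() - 1}"


print(as_power_of_two(max_volume_bytes))  # 2**80
print(as_power_of_two(MAX_FILE_SIZE))     # 2**64
```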

The internal structure

NTFS (like other file systems) is implemented as a device driver (a set of routines) running in kernel mode.

Internal structure - diagram

NTFS Driver

It works with three other components of Windows NT:

  • Log file service (LFS) - maintains a log of disk writes
  • Cache manager - handles caching for the entire file system
  • Virtual memory manager - involved, for example, when the requested file data is not in the cache and must be read from disk through the NTFS driver

NTFS and components

NTFS file

  • Using the Windows NT object model, NTFS treats each file as an object
  • This allows files to be managed by the object manager

The structure of NTFS

Disk structure

  • Volume - a logical disk partition, created when you format a disk or part of a disk with NTFS
  • A disk can contain one or more volumes, and a volume can span multiple disks
  • The volume contains all file system data (files, directories, ...)

Disk structure - an example


Cluster

  • The basic unit of disk space allocated by the file system
  • Default size of 0.5-4 KB, depending on the volume size
  • The cluster size is an integer multiple (always a power of 2) of the sector size

Cluster and sector

The numbering of the clusters

  • NTFS uses logical cluster numbers (LCN) and virtual cluster numbers (VCN)
  • LCNs number the clusters of the volume from beginning to end (0 to m)
  • VCNs number the clusters that belong to a particular file
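
The relationship between the two numbering schemes can be sketched as a lookup table of runs. The `Run` structure, the sample run list and the 8 sectors of 512 bytes per cluster are assumptions made for this illustration, not values taken from a real volume.

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    """A contiguous group of clusters belonging to one file (hypothetical structure)."""
    start_vcn: int   # first VCN covered by this run
    start_lcn: int   # LCN at which the run begins on the volume
    length: int      # number of clusters in the run


def vcn_to_lcn(runs: List[Run], vcn: int) -> Optional[int]:
    """Translate a file-relative VCN into a volume-relative LCN."""
    for run in runs:
        if run.start_vcn <= vcn < run.start_vcn + run.length:
            return run.start_lcn + (vcn - run.start_vcn)
    return None  # the VCN is not mapped by any run


def lcn_to_byte_offset(lcn: int, sectors_per_cluster: int = 8,
                       bytes_per_sector: int = 512) -> int:
    """Byte offset of a cluster from the start of the volume."""
    return lcn * sectors_per_cluster * bytes_per_sector


# A file stored in two runs of four clusters each.
runs = [Run(0, 1355, 4), Run(4, 1588, 4)]
print(vcn_to_lcn(runs, 5))        # 1589
print(lcn_to_byte_offset(1589))   # 6508544
```

VCNs stay contiguous from 0 upward even when the file is fragmented, which is why the file needs such a mapping while the volume itself does not.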

Master File Table (MFT)

  • In NTFS all data, including metadata, is stored in files
  • The MFT is implemented as an array of records. Each record describes exactly one file, including the MFT itself
  • Metadata files contain the information used to implement the file system structure
  • When mounting a volume, NTFS reads the physical address of the MFT from the boot file
  • The records describing the metadata files are then read from the MFT

NTFS metadata files

  • MFT
  • A copy of the MFT
  • The log file (recovery data)
  • The root directory
  • The bitmap file (allocation state of the volume)
  • The boot file (bootstrap code)
  • The bad cluster file
  • The volume file (volume name, NTFS version)
  • The attribute definition table

File references

  • NTFS identifies each file by a 64-bit file reference
  • The file reference consists of a sequence number (16 bits) and a file number (48 bits)
  • The file number gives the position of the record describing the file in the MFT
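
A 64-bit file reference can be packed and unpacked with plain bit operations. The sketch below assumes that the sequence number occupies the upper 16 bits and the file number the lower 48 bits; the function names are invented for the example.

```python
FILE_NUMBER_BITS = 48
FILE_NUMBER_MASK = (1 << FILE_NUMBER_BITS) - 1
SEQUENCE_MASK = 0xFFFF


def make_file_reference(file_number: int, sequence_number: int) -> int:
    """Combine an MFT record index and a sequence number into one 64-bit value."""
    if not 0 <= file_number <= FILE_NUMBER_MASK:
        raise ValueError("file number does not fit in 48 bits")
    if not 0 <= sequence_number <= SEQUENCE_MASK:
        raise ValueError("sequence number does not fit in 16 bits")
    return (sequence_number << FILE_NUMBER_BITS) | file_number


def split_file_reference(reference: int) -> tuple:
    """Return (file_number, sequence_number) for a 64-bit file reference."""
    return reference & FILE_NUMBER_MASK, reference >> FILE_NUMBER_BITS


ref = make_file_reference(file_number=5, sequence_number=5)
print(hex(ref))                    # 0x5000000000005
print(split_file_reference(ref))   # (5, 5)
```

Because the sequence number changes each time an MFT record is reused, a stale reference to a deleted file no longer matches the record that now occupies the same position.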

MFT Record

  • An NTFS file is a set of attribute/value pairs
  • Each file attribute is stored as a separate stream of bytes within the file
  • NTFS provides basic operations on attribute streams (create, delete, read, write)

Resident and nonresident attributes

  • Resident attribute values are stored in the MFT record
  • Nonresident attribute values are stored in runs outside the MFT record

Nonresident attributes

  • Run (extent) - an area of disk space (2 or 4 KB) allocated by NTFS to store attributes whose values are too large to fit in an MFT record
  • Only attributes that can grow can be nonresident (security descriptor, attribute list, ...)

Attribute header

  • Each attribute starts with a header containing information about it
  • The header begins the same way for both types of attributes
  • The header is always resident
  • A resident attribute header stores the offset from the header to the attribute value and the length of the value
  • A nonresident attribute header contains information about the location of the attribute value on disk
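
The difference between the two header types can be illustrated by parsing a header from raw bytes. The field layout below follows commonly published descriptions of the NTFS on-disk format and is an assumption of this sketch (the article gives no byte offsets); the sample buffer is hand-built, not read from a real volume.

```python
import struct

# Assumed layout: type, total length, non-resident flag, name length,
# name offset, flags, attribute id (16 bytes, little-endian).
COMMON = struct.Struct("<IIBBHHH")
# Resident part: value length, value offset (relative to the header start).
RESIDENT = struct.Struct("<IH")
# Nonresident part: first VCN, last VCN, offset of the run list.
NONRESIDENT = struct.Struct("<QQH")


def parse_attribute_header(buf: bytes) -> dict:
    """Decode one attribute header; resident values are returned inline."""
    type_code, length, non_resident, _, _, _, _ = COMMON.unpack_from(buf, 0)
    info = {"type": type_code, "length": length, "resident": not non_resident}
    if non_resident:
        start_vcn, last_vcn, runs_offset = NONRESIDENT.unpack_from(buf, COMMON.size)
        info.update(start_vcn=start_vcn, last_vcn=last_vcn,
                    run_list_offset=runs_offset)
    else:
        value_length, value_offset = RESIDENT.unpack_from(buf, COMMON.size)
        info.update(value_offset=value_offset,
                    value=buf[value_offset:value_offset + value_length])
    return info


# A hand-built resident attribute: 24-byte header, 5-byte value, 3 bytes of padding.
sample = (COMMON.pack(0x80, 32, 0, 0, 0, 0, 0)
          + RESIDENT.pack(5, 24) + b"\x00\x00"
          + b"hello" + b"\x00" * 3)
print(parse_attribute_header(sample)["value"])  # b'hello'
```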

The resident attribute header

The nonresident attribute header

Indexing of file names

  • NTFS indexes file names using a B-tree
  • In a directory, file attributes are indexed in a B-tree to reduce the number of disk accesses
  • In large directories file names are stored in fixed-size index buffers (4 KB)

Directory attributes

  • Index root attribute - holds the first level of the B-tree and points to the index buffers containing the next level
  • Index allocation attribute - contains the VCN-to-LCN mappings for the index buffers
  • Bitmap attribute - records which VCNs in the index buffers are free and which are in use

Root attributes

Data compression

NTFS can compress data at the level of a:

  • File
  • Directory
  • Volume

"Compression" sparse files

  • Sparse files contain only a small number of non-zero bytes relative to their size (like sparse matrices)
  • Compression consists of allocating disk space only for the areas that contain non-zero bytes
  • When reading, NTFS searches the MFT record for the contiguous runs that store the non-zero data
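
The idea of allocating space only for non-zero areas can be sketched by scanning a buffer cluster by cluster and recording runs of clusters that actually hold data. The 4 KB cluster size and the function name are assumptions of this example; it models the bookkeeping only, not the on-disk format.

```python
CLUSTER_SIZE = 4096  # assumed cluster size in bytes


def allocated_runs(data: bytes, cluster_size: int = CLUSTER_SIZE) -> list:
    """Return (start_vcn, length) runs for clusters containing non-zero bytes."""
    runs = []
    cluster_count = (len(data) + cluster_size - 1) // cluster_size
    for vcn in range(cluster_count):
        chunk = data[vcn * cluster_size:(vcn + 1) * cluster_size]
        if not any(chunk):
            continue  # all-zero cluster: nothing is allocated for it
        if runs and runs[-1][0] + runs[-1][1] == vcn:
            # Extends the previous run by one cluster.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((vcn, 1))
    return runs


# A 16-cluster file with data only in cluster 0 and clusters 5-6.
data = bytearray(16 * CLUSTER_SIZE)
data[0] = 1
data[5 * CLUSTER_SIZE:7 * CLUSTER_SIZE] = b"\xff" * (2 * CLUSTER_SIZE)
print(allocated_runs(bytes(data)))  # [(0, 1), (5, 2)]
```

Only 3 of the 16 clusters need disk space here; the remaining VCNs are simply absent from the run list and read back as zeros.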

Areas of a sparse file

MFT record of a sparse file

Standard file compression

  • Performed algorithmically
  • NTFS divides the file into compression units of 16 clusters each and compresses them
  • For each compression unit NTFS allocates the required number of clusters (no more than 16) and writes the compressed data to them
  • The compression unit size is a compromise between compression efficiency and the speed of data access in the compressed file
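
The per-unit allocation can be sketched as follows. Note that `zlib` is used here purely as a stand-in compressor (it is not the algorithm NTFS uses), and the 4 KB cluster size is an assumption; the point is only how many clusters each 16-cluster unit ends up needing.

```python
import os
import zlib

CLUSTER_SIZE = 4096                       # assumed cluster size in bytes
UNIT_CLUSTERS = 16                        # clusters per compression unit
UNIT_BYTES = CLUSTER_SIZE * UNIT_CLUSTERS


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def clusters_per_unit(data: bytes) -> list:
    """Clusters needed for each compression unit of the file."""
    result = []
    for offset in range(0, len(data), UNIT_BYTES):
        unit = data[offset:offset + UNIT_BYTES]
        compressed = zlib.compress(unit)  # stand-in for the real algorithm
        needed = ceil_div(len(compressed), CLUSTER_SIZE)
        uncompressed = ceil_div(len(unit), CLUSTER_SIZE)
        # Never use more clusters than the uncompressed unit would take.
        result.append(min(needed, uncompressed))
    return result


# One highly repetitive unit followed by one unit of random bytes.
sample = b"A" * UNIT_BYTES + os.urandom(UNIT_BYTES)
print(clusters_per_unit(sample))
```

The repetitive unit shrinks to a single cluster, while random data does not compress and keeps all 16. Reading any byte requires decompressing only the unit that contains it, which is the access-speed side of the compromise mentioned above.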

Areas of a normal file

MFT record of a normal file

Recovery and data consistency

  • Data integrity should be preserved without the need for additional software tools
  • NTFS uses a method based on journaling (logging) operations
  • Recovery procedures are limited to file system (metadata) files
Journaling: the log file service (LFS)

  • A set of kernel-mode routines that the NTFS driver calls to work with the log file
  • NTFS does not read from or write to the log directly; this is done through the LFS and the cache manager

Log file service

The structure of the log file

  • Restart area - contains context information (the state of operations in progress)
  • Logging area - contains the records describing operations

Log file structure

Log file operations

  • Opening the log file
  • Writing records to the log
  • Reading records in either direction (forwards, backwards) from the log file
  • Removing records from the log file
  • Setting the start of the log file to a record with a higher log sequence number (LSN)

Log records

  • Update records
  • Checkpoint records
  • Records are identified by a log sequence number (LSN)

Update records

  • They contain two types of information
  • Redo - how to reapply a part of an operation that was started
  • Undo - how to return to the state before that part of the operation began
Undo / redo records

Checkpoint records

  • Periodically written to the log file by NTFS
  • Clearing the contents of the log file:
    • The cache manager flushes all unsaved data and the log file to disk
    • NTFS resets the start of the log file to the record for the current operation
  • Checkpoint records are used to determine how far back recovery has to go

  • Recovery relies on two tables kept in memory
    • Transaction table - tracks operations that have not completed, storing for each one the LSN of the last record written for that operation
    • Dirty page table - tracks the pages modified in the cache but not yet written to disk, storing the LSNs of the records whose changes have not been saved
  • Shortly before writing a checkpoint record, the LFS writes a copy of both tables to the log file as records and stores their LSNs in the checkpoint record

Phases of recovery

  • analysis
  • redo
  • undo

The analysis phase

Finds the record in the log file from which the recovery process should begin

  • At the start, the LFS finds the most recent checkpoint record in the log file and the last copies of the transaction table and the dirty page table
  • NTFS first scans the update records written after the checkpoint record to bring both tables up to date
  • Based on the contents of the tables, NTFS determines the LSN of the oldest update record whose operation was not carried out on disk or not completed
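
The last step of the analysis phase amounts to taking a minimum over the LSNs held in the two in-memory tables (called here the transaction table and the dirty page table). The sketch below also considers the checkpoint record's own LSN, which is an assumption of this example; the table contents are invented.

```python
def redo_start_lsn(transaction_table: dict, dirty_page_table: dict,
                   checkpoint_lsn: int) -> int:
    """Oldest LSN from which the log has to be replayed."""
    candidates = [checkpoint_lsn]
    candidates.extend(transaction_table.values())   # unfinished operations
    candidates.extend(dirty_page_table.values())    # pages not yet on disk
    return min(candidates)


# Hypothetical table contents after the tables have been brought up to date.
transaction_table = {"tx1": 120, "tx2": 133}
dirty_page_table = {"page7": 95, "page9": 128}

print(redo_start_lsn(transaction_table, dirty_page_table, checkpoint_lsn=110))  # 95
```

Here page7 was modified by record 95 and never flushed, so replay has to begin before the checkpoint itself.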

The redo phase

Updates the cache with the changes that had not been saved to disk before the failure

  • NTFS scans the log file from the record found in the analysis phase towards the end of the file
  • The scan looks for "page update" records (modifications of the volume)
  • Based on the information they contain, the data in the cache buffers is updated

The undo phase

Performed to roll back operations that were interrupted by the failure

  • Finding in the transaction table the LSN of the last logged record of each operation that did not complete
  • Rolling back (undoing) the records of those operations, working backwards through the log file from the record found
  • Synchronizing the cache with the data stored on disk
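
The redo and undo phases can be put together in a toy simulation. It is a sketch under several assumptions: the log is a Python list, each update carries plain before/after values, and unfinished operations are found by looking for a missing commit record rather than through a transaction table. All names are hypothetical.

```python
from typing import NamedTuple, Optional


class LogRecord(NamedTuple):
    lsn: int
    tx: str                       # operation the record belongs to
    kind: str                     # "update" or "commit"
    page: Optional[str] = None
    redo: Optional[bytes] = None  # value after the change
    undo: Optional[bytes] = None  # value before the change


def recover(log: list, disk: dict, start_lsn: int) -> dict:
    """Return the page contents after the redo and undo passes."""
    cache = dict(disk)
    committed = {r.tx for r in log if r.kind == "commit"}

    # Redo pass: scan forward and reapply every logged update.
    for record in log:
        if record.lsn >= start_lsn and record.kind == "update":
            cache[record.page] = record.redo

    # Undo pass: scan backward and roll back unfinished operations.
    for record in reversed(log):
        if record.kind == "update" and record.tx not in committed:
            cache[record.page] = record.undo

    return cache


log = [
    LogRecord(1, "t1", "update", "A", b"a1", b"a0"),
    LogRecord(2, "t2", "update", "B", b"b1", b"b0"),
    LogRecord(3, "t1", "commit"),
    LogRecord(4, "t2", "update", "C", b"c1", b"c0"),
]
# State of the disk at the crash: B was flushed, A and C were not.
disk = {"A": b"a0", "B": b"b1", "C": b"c0"}
print(recover(log, disk, start_lsn=1))
# {'A': b'a1', 'B': b'b0', 'C': b'c0'}
```

The completed operation t1 is reapplied even though its change never reached the disk, while the half-finished t2 is removed even though part of it did.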

Fault tolerance

Provided by the FtDisk driver (fault tolerant disk driver), which runs between the file system drivers and the disk driver. It allows:

  • Volume management (an LVM equivalent)
  • Redundancy of stored data
  • Dynamic recovery of data from damaged sectors of SCSI disks

Volume sets

  • A single logical volume can consist of up to 32 areas of free space on one or more disks
  • Used to build one high-capacity volume from many small areas
  • The FtDisk driver hides the physical disk configuration from the file system
  • NTFS allows the size of the volume to be increased safely

Volume set - an example

Striping

A sequence of partitions (one per disk) forming one volume

  • The FtDisk driver distributes data across the physical disks
  • Consecutive stripes are placed on the physical sectors of successive disks
  • Data on different disks is written simultaneously
  • On a busy system this reduces I/O latency
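
How striping spreads consecutive data across disks can be sketched as an address calculation. The stripe size of 128 sectors and the three-disk layout are assumptions chosen for the example, not values stated in the article.

```python
def locate(logical_sector: int, n_disks: int, sectors_per_stripe: int) -> tuple:
    """Map a volume-relative sector to (disk index, sector on that disk)."""
    stripe_index, offset = divmod(logical_sector, sectors_per_stripe)
    row, disk = divmod(stripe_index, n_disks)
    return disk, row * sectors_per_stripe + offset


N_DISKS = 3
SECTORS_PER_STRIPE = 128

print(locate(0, N_DISKS, SECTORS_PER_STRIPE))     # (0, 0)
print(locate(128, N_DISKS, SECTORS_PER_STRIPE))   # (1, 0)
print(locate(400, N_DISKS, SECTORS_PER_STRIPE))   # (0, 144)
```

A large sequential request therefore touches several disks at once, which is where the latency gain on a busy system comes from.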

Striping - an example

Mirroring (RAID 1)

  • Data from a partition on one disk is duplicated on a partition of the same size on a second disk
  • If data on the primary partition is lost, the FtDisk driver automatically switches to the backup copy

Load balancing for read operations

For read operations the FtDisk driver tries to load both partitions evenly to achieve higher data throughput

Striping with parity on a single disk (RAID 4)

  • Like striping, but with an extra disk that stores the checksum (most often the XOR, i.e. symmetric difference) of the corresponding clusters
  • Serialized writes of the checksum to the single parity disk seriously reduce system performance
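
The XOR checksum described above is enough to rebuild any single lost block from the surviving ones. The sketch uses three tiny made-up data blocks; real stripes are much larger, but the arithmetic is the same.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"\x0f\xf0\xaa", b"\x33\x55\x01", b"\xff\x00\x10"]
parity = xor_blocks(data)
print(parity.hex())  # c3a5bb

# Simulate losing the second disk and rebuilding it from the others.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```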

Striping with parity on multiple disks (RAID 5)

Like level 4, but the checksums are distributed across all disks
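
The difference from level 4 can be shown by printing which disk holds the checksum in each stripe row. The simple round-robin rotation used here is an assumption made for illustration; the article does not say which rotation the FtDisk driver uses.

```python
def parity_disk(stripe_row: int, n_disks: int) -> int:
    """Disk holding the checksum for a given stripe row (assumed rotation)."""
    return stripe_row % n_disks


def layout(rows: int, n_disks: int) -> list:
    """One list per stripe row: 'P' marks parity, 'D' marks data."""
    table = []
    for row in range(rows):
        p = parity_disk(row, n_disks)
        table.append(["P" if disk == p else "D" for disk in range(n_disks)])
    return table


for row in layout(3, 3):
    print(" ".join(row))
# P D D
# D P D
# D D P
```

Because the parity position moves from row to row, checksum writes are spread over all disks instead of queuing on a single one.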

Sector management

  • Recovers data from sectors that show signs of damage
  • It reserves part of the disk capacity as a pool of spare sectors to replace damaged ones (SCSI drives)
  • When the FtDisk driver is notified of a damaged sector, it copies the data to a spare sector and adds the damaged sector to the list of bad sectors
  • In this respect the FtDisk driver works autonomously, without involving the file system driver

Bad-cluster remapping

  • Performed when the FtDisk driver cannot recover the data from a damaged cluster and make a copy, or when the FtDisk driver is not installed
  • NTFS automatically allocates a new cluster and copies the data from the cluster containing the bad sector
  • NTFS recovers the data from the FtDisk driver or from a redundant volume

Faulty cluster

Mapping update

Types of access (file, directory)

  • No Access - can be assigned to any group. Its members cannot even see the name of the file or directory.
  • List - allows you to view the directory and its files and to make it the current directory.
  • Read - intended for directories that contain executable files. For a directory it works like List; for files, it lets you run and view them but not modify them.
  • Add - allows you to add files to a directory, but not to modify it.
  • Add and Read - allows you to add files to a directory, execute files and read their contents.
  • Change - the level assigned to every user by default. Allows everything that Add and Read allows, plus modifying and deleting files in the directory.
  • Full Control - gives Administrators and Power Users complete control over files and directories on a local disk, including changing permissions, deleting the whole directory, granting permissions to it and taking ownership of it.
  • Special Directory Access - allows you to define a custom set of permissions on a directory and grant them to users.
  • Special File Access - allows you to define a custom set of permissions on a file and grant them to users.
  • Read (R) - the user can read the contents of a file or directory, without modifying it.
  • Write (W) - the user can modify the contents of a file or directory, but cannot read it. Allows creating files and directories.
  • Execute (X) - the user can run files (if they are executable). It is also needed to open a folder and view its contents.
  • Delete (D) - the user can delete a file or directory.
  • Change Permissions (P) - a right usually held by the administrator; allows changing other users' permissions to the file or directory.
  • Take Ownership (O) - the user can take ownership of the file or directory.
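
The six individual rights and the named access levels can be modelled as bit flags. The mapping of each named level to a combination of rights below is an assumption inferred from the descriptions in the list, not an official table, and the helper names are invented for the example.

```python
from enum import Flag, auto


class Right(Flag):
    R = auto()   # Read
    W = auto()   # Write
    X = auto()   # Execute
    D = auto()   # Delete
    P = auto()   # Change Permissions
    O = auto()   # Take Ownership


# Assumed composition of the named levels (inferred, not authoritative).
NO_ACCESS = Right(0)
LIST = Right.R | Right.X
READ = Right.R | Right.X
ADD = Right.W | Right.X
ADD_AND_READ = Right.R | Right.W | Right.X
CHANGE = Right.R | Right.W | Right.X | Right.D
FULL_CONTROL = CHANGE | Right.P | Right.O


def allowed(granted: Right, needed: Right) -> bool:
    """True if every needed right is present in the granted set."""
    return (granted & needed) == needed


print(allowed(CHANGE, Right.D))                  # True
print(allowed(READ, Right.W))                    # False
print(allowed(FULL_CONTROL, Right.P | Right.O))  # True
```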

Last modified on Monday, 18 May 2015 18:50
Data Recovery Expert

Viktor S., Ph.D. (Electrical/Computer Engineering), was hired by DataRecoup, the international data recovery corporation, in 2012. Promoted to Engineering Senior Manager in 2010 and then to his current position, as C.I.O. of DataRecoup, in 2014. Responsible for the management of critical, high-priority RAID data recovery cases and the application of his expert, comprehensive knowledge in database data retrieval. He is also responsible for planning and implementing SEO/SEM and other internet-based marketing strategies. Currently, Viktor S., Ph.D., is focusing on the further development and expansion of DataRecoup’s major internet marketing campaign for their already successful proprietary software application “Data Recovery for Windows” (an application which he developed).
