Content Analysis Data Recovery

Tuesday, 19 May 2015 00:00




Content analysis of the disk is one of the cornerstones of the modern data recovery industry. It makes it possible to recover even files that no longer appear in the file system: for example, after the disk was formatted, when the file system is damaged or destroyed, or when the files were deleted a long time ago.

File search algorithm with content analysis

Let's review how a content analysis algorithm works, using as an example a program that recovers files from disks and flash cards formatted with FAT. Some FAT recovery programs can find several hundred file types. The steps below describe what the content analysis algorithm does during a disk scan.

Detection. In detection mode the algorithm scans the disk for file signatures known to the program. For example, photos in the popular JPEG format start with a fixed marker (bytes FF D8), and many also carry a JFIF identifier in the header, which reveals the presence of a file on the disk.
Identification. A detected signature is far from the whole story. Some signatures are so short that scanning produces false detections. Some signatures are shared by different file types, and some occur several times within the same file. Additional checks are performed to determine precisely the type of the file found, for example by cross-checking values taken from the file's header against the data actually read from the disk.
Analysis. To determine the exact size of the file, the program reads and analyzes its header. The result is the file size in bytes.
Determining the location of the file on disk. The signature marks the beginning of the file, and the header data gives its exact size. Based on these results the program works out which disk sectors are presumably occupied by the file. Note that this relies on assumptions that are not always true.
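
The detection and identification steps can be illustrated with a minimal sketch. It is not the code of any particular recovery program: the function names are hypothetical, the disk image is assumed to be already loaded into memory as bytes, and only the JPEG case is covered (the start marker FF D8 FF, then a "JFIF" or "Exif" identifier at offset 6 from the start of the file).

```python
# Assumption: the raw disk image is small enough to hold in memory as bytes.
JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker plus first byte of the next marker


def find_jpeg_candidates(image: bytes) -> list[int]:
    """Detection: return every offset where the JPEG start marker appears."""
    offsets = []
    pos = image.find(JPEG_SOI)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(JPEG_SOI, pos + 1)
    return offsets


def looks_like_jpeg(image: bytes, offset: int) -> bool:
    """Identification: cross-check the identifier that follows the signature."""
    ident = image[offset + 6 : offset + 10]
    return ident in (b"JFIF", b"Exif")
```

A short signature such as FF D8 FF will match random data from time to time, which is why the second check is needed before a candidate is treated as a real file.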

In particular, it is assumed that the entire file is stored as a single contiguous fragment, which is not always true because of disk fragmentation. In addition, some of the sectors may belong to other files. This is easy to verify when the file system is present, but if the file system is damaged or missing, you are left with the assumption that all the data belongs to the file being recovered.
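
The analysis and location steps, together with the contiguity assumption, can be sketched as follows. BMP is used here only as a convenient illustration, because its header stores the file size as a little-endian 32-bit value at offset 2; the function names and the 512-byte sector size are assumptions, not details taken from a specific program.

```python
import struct

SECTOR_SIZE = 512  # assumed sector size; real media may use other sizes such as 4096


def bmp_declared_size(image: bytes, offset: int):
    """Analysis: read the file size declared in a BMP header, or None if absent."""
    header = image[offset : offset + 6]
    if len(header) < 6 or header[:2] != b"BM":
        return None
    (size,) = struct.unpack_from("<I", header, 2)
    return size


def assumed_sectors(offset: int, size: int, sector_size: int = SECTOR_SIZE) -> range:
    """Location: sectors the file would occupy IF it were stored contiguously."""
    first = offset // sector_size
    last = (offset + size - 1) // sector_size
    return range(first, last + 1)
```

The result of assumed_sectors is only a guess: if the file is fragmented, or part of the range belongs to another file, the recovered data will be partly wrong.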

Content analysis limitations

Unfortunately, content analysis is not a panacea; it is rather a tool of last resort. When the file system is seriously damaged, it is the only way to recover at least some of the files.

Content analysis cannot recover all data, only the file types that are in the database of a particular program. For example, data recovery programs usually know over 250 file formats, including the most popular ones such as DOC/DOCX, XLS/XLSX, JPEG/JPG, RAW and many others.

Note that some file types cannot be recovered with content analysis in principle. In particular, encrypted files are deliberately created without recognizable signatures. Many log files, binary formats and some databases also have no signatures, which makes it impossible to detect them on the disk.

Another limitation is disk fragmentation. As shown above, content analysis can recover only files that are stored as a single contiguous fragment. A fragmented file can be fully recovered only if its record in the file system is not damaged.

Last modified on Tuesday, 19 May 2015 16:44

Data Recovery Expert

Viktor S., Ph.D. (Electrical/Computer Engineering), was hired by DataRecoup, the international data recovery corporation, in 2012. Promoted to Engineering Senior Manager in 2010 and then to his current position, as C.I.O. of DataRecoup, in 2014. Responsible for the management of critical, high-priority RAID data recovery cases and the application of his expert, comprehensive knowledge in database data retrieval. He is also responsible for planning and implementing SEO/SEM and other internet-based marketing strategies. Currently, Viktor S., Ph.D., is focusing on the further development and expansion of DataRecoup’s major internet marketing campaign for their already successful proprietary software application “Data Recovery for Windows” (an application which he developed).