Tuesday, May 8, 2012

The Problem(s) with File Systems

File systems have been the backbone of data storage since the early days of computing. I use the plural because there isn't just one file system; there are many. FAT, FAT32, NTFS, Ext2, Ext3, Ext4, HFS+, and ZFS are all examples, and every few years one of the operating system vendors or someone in academia comes up with a new one.

Over the years, more than 100 different file systems have joined the ranks. Some serve very niche applications, others have gained moderate market acceptance, and still others run on more than 100 million computers. Each new file system offers at least a few unique features (e.g. long file names, access control lists, extended attributes, journaling, or hard links) that set it apart from the others in the field, but all of them are constrained by the general file system architecture.

Backward compatibility issues and the desire of application designers to write to a single, unified file API make it extremely difficult for new file systems to introduce compelling, original features without breaking the mold. Over the years, numerous data storage problems have surfaced. Some have been solved, or at least mitigated, by the introduction of newer file systems. Others continue to plague data managers and require a radically new approach to solve.

In spite of these problems, file systems work reasonably well, and their endurance is a testament to their designers. However, I believe the time has come to replace file systems with something better. By that, I don't mean we should just build another file system that does a few things differently from the others. I mean we need a radically new general-purpose data management system, one that is not limited by the conventional file system architecture.

So, what's wrong with today's file systems? Let me count the ways...

The biggest problem is that file systems don't actually "manage" files. Sure, they enable hundreds or thousands of different applications to create lots of files, but they don't actually help manage them. File systems only do what applications tell them to do and nothing more. A file system won't create, copy, move, or delete a file without an explicit command to do so from the operating system or an application.

Any application with access can create one or more files within the file system hierarchy and fill them with structured or unstructured data. With today's cheap, high-capacity storage devices, file system volumes can be created that will hold many millions of files. Some file systems are capable of storing several billion individual files.

While a file system will make sure that every file's data stream is properly stored and kept intact, and that every file's metadata fields maintain the last values set by applications, the file system itself knows almost nothing about the files it stores. Every file system treats each of its files like a "black box". A file system can't tell a photo from a document or a database from a video. It doesn't make sure that the file's unique identifier (its full path, including the file name) is in any way related to the data it contains. It doesn't care a whit if an application creates a file containing a résumé and names it C:\photos\vacations\MyGame.exe. It will also let a user store music files in their /bin directory or put critical operating system files in a folder called /downloads/tmp.

What this means is that if a user wants to find, say, all photo files created in 2011, the file system will do little to help. An application must examine each and every file within the system, compare its data type with known photo formats, and then check its date and time stamps to see if it was created in that year. Unlike databases, which have sophisticated query languages for lightning-fast searches, file systems offer primitives like findFirst and findNext.
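Here's a rough sketch, in Python, of the brute-force scan this forces on an application. The root folder and extension list are hypothetical, and since true creation time isn't portably exposed, st_mtime stands in for it:

    import os
    import datetime

    PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tiff"}

    def find_photos_from(root, year):
        # No index, no query language: visit every file, guess its type
        # from its name, and inspect its timestamp one file at a time.
        matches = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if os.path.splitext(name)[1].lower() not in PHOTO_EXTENSIONS:
                    continue
                path = os.path.join(dirpath, name)
                try:
                    stamp = os.stat(path).st_mtime  # stand-in for creation time
                except OSError:
                    continue  # file vanished or is unreadable mid-scan
                if datetime.datetime.fromtimestamp(stamp).year == year:
                    matches.append(path)
        return matches

    print(len(find_photos_from("/home/user", 2011)))

Every run repeats the full traversal from scratch; a database would answer the same question from an index in milliseconds.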

When the number of files within a file system grows beyond several thousand, it becomes increasingly difficult for the average user to manage them using a file browser and a well-defined folder structure. Once the number exceeds a million, the user is generally lost without special file management applications to help organize them all. Basic searches for a single file or for groups of files can take a very long time, since directory tree traversals using string comparison functions are inherently slow. As the number of files grows, the queries take longer and longer.

To combat this problem, users are turning to special-purpose data management applications to help them manage certain subsets of their files. To manage their music files, they get iTunes. To manage their photos, they try Picasa or Photoshop. To manage their video streams, they install iMovie. Each of these applications offers ways to organize and keep track of its respective data set. They often allow the user to tag or otherwise add special metadata to every file they manage, to help classify files or put them into playlists, favorites, or albums. This extra metadata is often stored in a proprietary format or in a special database managed exclusively by the application.

This solution to managing data generally results in a collection of separate "silos of information" that do not interoperate with each other very well. Other applications are not able to easily take advantage of the extra metadata generated by the various data management applications. Many files within a given volume are not part of one of these silos and must be managed independently. Movement of data from one system to another often requires special import and export functions that don't always work. Finally, the management applications often just maintain references to the files they manage. If another application moves, renames, or deletes the underlying files, the management application often runs into problems as it tries to resolve the inconsistencies.

Operating systems like Windows 7 and Mac OS X include special file indexing services (Windows Search and Spotlight) to help the user find files. The indexer combs through some or all of the files in a volume and indexes whatever metadata and file content it can identify. It stores the index information in special database files so that the user or applications can quickly find files by keyword. Unfortunately, these indexers are not tied directly to the file systems they index. It is often the case (especially with portable storage devices) that changes are made to files while the indexer is not running or monitoring the volume. This can happen if the user boots another operating system or plugs a portable drive into another computer. Once the indexer resumes operation, it must go through an extensive operation to figure out what changed. In some cases, it simply deletes its index and starts over. For volumes with millions of files, re-indexing can take many hours.

Some file systems allow extended attributes to be created by applications and maintained by that file system. Unfortunately, extended attributes are not universally supported and each file system's implementation is different. Copying or moving files with extended attributes between file systems can result in the loss of information or its unexpected alteration. Even those file systems that allow extended attributes do not provide any fast way for applications to search for files based on them. Other than through the indexing services mentioned earlier, it is nearly impossible to find a set of files based on a common extended attribute value.
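To make this concrete, here's a small sketch using the xattr support in Python's os module (Linux-only, Python 3.3+; the attribute name and paths are made up for illustration). Tagging a file is one call, but finding files by tag means another exhaustive walk:

    import os

    # Tag a single file with a custom attribute (Linux "user." namespace).
    os.setxattr("report.doc", "user.project", b"apollo")
    print(os.getxattr("report.doc", "user.project"))  # b'apollo'

    def files_tagged(root, attr, value):
        # There is no query API for extended attributes, so the only
        # option is a full tree traversal, testing one file at a time.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    if os.getxattr(path, attr) == value:
                        yield path
                except OSError:
                    pass  # attribute missing or unsupported on this file

And if report.doc is copied to a FAT32 thumb drive, the tag silently disappears, since that file system has nowhere to put it.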

Another persistent problem is that every file within the system is subject to changes initiated by any application with access. The file API is very open and allows almost every piece of metadata or byte stream to be modified at will. Malicious or inadvertent changes can wreak havoc on a system. A virus that manages to run under the logged-on user can modify any file that user has rights to. Such a program could, for example, make random alterations to file names and folder names, invalidating any stored path names. A program can change date and time stamps, file attributes, access permissions, and file locations. The "Read-only" file attribute is just a suggestion that applications leave the data stream alone. Any program with rights can simply change the attribute to "Read-Write", modify the file contents at will, and change the attribute back once it is finished. This means that no file metadata or data stream can be trusted to be accurate or to reflect its original state. Only a bit-by-bit comparison with a pristine copy of the original data can assure that a file has not been altered.
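As a quick illustration of how advisory that flag really is, here's a Python sketch using POSIX permission bits (the Windows read-only attribute is just as easy to toggle; the file name is hypothetical):

    import os
    import stat

    path = "notes.txt"
    mode = os.stat(path).st_mode         # remember the original permissions

    os.chmod(path, mode | stat.S_IWUSR)  # quietly make the file writable
    with open(path, "a") as f:
        f.write("tampered\n")            # modify the "protected" contents
    os.chmod(path, mode)                 # restore read-only; the flag tells no tale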

For many operations, such as file backup or synchronization, knowledge of the order of operations against a particular data set is crucial. File systems use date and time stamps to keep track of when files are created, accessed, or modified. As was previously pointed out, each of these time stamps can be altered at will, so they may not be accurate. Even if no application alters them, the values they contain may not reflect the proper order of operations. When it records date and time stamps, the file system simply queries the system clock controlled by the running operating system. The clock may be off by minutes, hours, days, or even longer. It can be reset by the user or by synchronization with another computer. A portable drive that is plugged into two different computers during the course of a day, each with a different clock, may not record the proper sequence of file operations.
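Timestamps are just as easy to rewrite directly. In this sketch (path and date hypothetical), a file is modified today and then backdated with a single call, fooling any backup or sync tool that trusts mtime:

    import os
    import datetime

    path = "ledger.csv"
    jan_2011 = datetime.datetime(2011, 1, 1).timestamp()

    with open(path, "a") as f:
        f.write("new row\n")              # a genuine modification made today
    os.utime(path, (jan_2011, jan_2011))  # ...now stamped as if from 2011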

Lastly, one of the biggest weaknesses of file systems is the unique identifier used for files. The file name, together with the folder names in its hierarchy, makes up each file's unique identifier. Every file must have one and only one full path name, and it must be unique. Some file names are human-readable; others are generated by software and may look like "RcXz12p20.rxz". The human-readable names are generally in the language of the creator and cannot be translated without altering the file's unique identifier. File organizers, and any other application that wants to keep track of one or more files, often store the full paths to those files, either in a database or within another file's data stream. If the original file name is altered, any folder in its path is renamed, or the file is moved to a new folder, the stored path becomes invalid. "File Not Found" is among the most common error conditions encountered by users and applications.
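Here's the failure mode in miniature (all paths hypothetical): an organizer application records a full path, another program renames a parent folder, and the stored identifier is dead:

    import shutil

    stored_path = "/data/photos/vacations/beach.jpg"  # saved in some app's database

    # Another application reorganizes the folder tree...
    shutil.move("/data/photos/vacations", "/data/photos/2011-trips")

    try:
        with open(stored_path, "rb"):
            pass
    except FileNotFoundError:
        print("File Not Found")  # the stored identifier no longer resolves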

Computers are much faster at crunching numbers than at string comparisons. It will always be much faster for a file system to find a million files given their inode numbers than to find them based on a million different full path names.
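Continuing the sketch above, the inode number is exactly such a numeric identifier: it survives a rename that kills the path string (POSIX semantics, hypothetical paths):

    import os

    before = os.stat("/data/photos/2011-trips/beach.jpg").st_ino
    os.rename("/data/photos/2011-trips/beach.jpg",
              "/data/photos/2011-trips/shore.jpg")
    after = os.stat("/data/photos/2011-trips/shore.jpg").st_ino
    assert before == after  # same file, same number, different name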

As block storage devices like hard disk drives and flash memory drives continue to expand in capacity, the number of files within any given file system volume will increase dramatically. As the average number of files a user or business owns grows, the issues identified here will only become more problematic.
