File system


In computing, a file system or filesystem governs file organization and access. A local file system is a capability of an operating system that services the applications running on the same computer. A distributed file system is a protocol that provides file access between networked computers.
A file system provides a data storage service that allows applications to share mass storage. Without a file system, applications could access the storage in incompatible ways that lead to resource contention, data corruption and data loss.
There are many file system designs and implementations. They differ in structure and features, and therefore in characteristics such as speed, flexibility, security and size.
File systems have been developed for many types of storage devices, including hard disk drives, solid-state drives, magnetic tapes and optical discs.
A portion of the computer main memory can be set up as a RAM disk that serves as a storage device for a file system. File systems such as tmpfs can store files in virtual memory.
A file system may also provide access to files that are either computed on request, called virtual files, or mapped onto another, backing storage.

Etymology

Before the advent of computers, the terms file system, filing system and system for filing were used to describe methods of organizing, storing and retrieving paper documents. By 1961, the term file system was being applied to computerized filing alongside the original meaning. By 1964, it was in general use.

Architecture

A local file system's architecture can be described as layers of abstraction even though a particular file system design may not actually separate the concepts.
The logical file system layer provides relatively high-level access via an application programming interface for file operations including open, close, read and write, and delegates those operations to lower layers. This layer manages open file table entries and per-process file descriptors. It provides file access, directory operations, security and protection.
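As an illustration of the open, write, read and close operations and of per-process file descriptors, the following minimal sketch uses Python's os module on a POSIX-like system. The function name roundtrip and the file name example.bin are hypothetical and chosen only for this example.

```python
import os
import tempfile


def roundtrip(payload: bytes) -> bytes:
    """Write payload to a new file, then read it back via file descriptors."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "example.bin")  # hypothetical file name

    # open: the operating system returns a small integer, the file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # write: a single call may accept fewer bytes than requested, so loop.
        view = memoryview(payload)
        while view:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)  # close: releases the descriptor

    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)  # read: an empty result means end of file
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd)

    os.remove(path)
    os.rmdir(directory)
    return b"".join(chunks)
```

The application never addresses blocks on the storage device; it only names a file and passes a descriptor, and the lower layers decide where the bytes are placed.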
The virtual file system, an optional layer, supports multiple concurrent instances of physical file systems, each of which is called a file system implementation.
The physical file system layer provides relatively low-level access to a storage device. It reads and writes data blocks, provides buffering and other memory management and controls placement of blocks in specific locations on the storage medium. This layer uses device drivers or channel I/O to drive the storage device.
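The role of the virtual file system layer can be sketched as a dispatcher that routes each path to one of several file system implementations. The classes MemoryFS and VirtualFileSystem below, and the idea of a mount table keyed by path prefix, are assumptions made for illustration and do not describe any particular operating system.

```python
class MemoryFS:
    """Hypothetical file system implementation that keeps files in a dict."""

    def __init__(self):
        self._files = {}

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = bytes(data)

    def read(self, path: str) -> bytes:
        return self._files[path]


class VirtualFileSystem:
    """Routes each operation to the implementation mounted closest to the path."""

    def __init__(self):
        self._mounts = {}  # mount point -> file system implementation

    def mount(self, mount_point: str, fs) -> None:
        self._mounts[mount_point.rstrip("/") or "/"] = fs

    def _route(self, path: str):
        best = None
        for point in self._mounts:
            prefix = point if point.endswith("/") else point + "/"
            if path == point or path.startswith(prefix):
                # Prefer the longest matching mount point.
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            raise FileNotFoundError(path)
        relative = path if best == "/" else path[len(best):]
        return self._mounts[best], relative or "/"

    def write(self, path: str, data: bytes) -> None:
        fs, relative = self._route(path)
        fs.write(relative, data)

    def read(self, path: str) -> bytes:
        fs, relative = self._route(path)
        return fs.read(relative)
```

With one implementation mounted at / and another at /tmp, a write to /tmp/a reaches the second implementation as /a, while callers use the same interface for both.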

Attributes

File names

A file name, or filename, identifies a file to consuming applications and in some cases users.
A file name is unique so that an application can refer to exactly one file for a particular name. If the file system supports directories, then file name uniqueness is generally enforced within the context of each directory. In other words, a storage device can contain multiple files with the same name, but not in the same directory.
Most file systems restrict the length of a file name.
Some file systems match file names case-sensitively and others case-insensitively. For example, the names MYFILE and myfile refer to the same file on a case-insensitive file system, but to different files on a case-sensitive one.
Most modern file systems allow a file name to contain a wide range of characters from the Unicode character set. Some restrict characters that have a special meaning, such as those used to indicate a device, device type, directory prefix, file path separator, or file type.
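The difference between the two matching rules, and the uniqueness of names within one directory, can be sketched with a toy directory table. The Directory class is hypothetical, and using str.casefold for case-insensitive comparison is an assumption; real file systems apply their own folding rules.

```python
class Directory:
    """Toy directory table mapping file names to content."""

    def __init__(self, case_sensitive: bool):
        self.case_sensitive = case_sensitive
        self._entries = {}  # lookup key -> (name as stored, content)

    def _key(self, name: str) -> str:
        # Case-insensitive directories compare a folded form of the name.
        return name if self.case_sensitive else name.casefold()

    def create(self, name: str, content: bytes) -> None:
        key = self._key(name)
        if key in self._entries:
            raise FileExistsError(name)  # names are unique per directory
        self._entries[key] = (name, content)

    def read(self, name: str) -> bytes:
        return self._entries[self._key(name)][1]

    def names(self) -> list:
        return sorted(stored for stored, _ in self._entries.values())
```

In a case-sensitive Directory, MYFILE and myfile can coexist. In a case-insensitive one, creating myfile after MYFILE raises FileExistsError, and reading either spelling returns the same content.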

Directories

File systems typically support organizing files into directories, also called folders, which segregate files into groups.
This may be implemented by associating the file name with an index in a table of contents or an inode in a Unix-like file system.
Directory structures may be flat, or allow hierarchies by allowing a directory to contain directories, called subdirectories.
The first file system to support arbitrary hierarchies of directories was used in the Multics operating system. The native file systems of Unix-like systems also support arbitrary directory hierarchies, as do Apple's Hierarchical File System and its successor HFS+ in classic Mac OS, the FAT file system in MS-DOS 2.0 and later versions of MS-DOS and in Microsoft Windows, the NTFS file system in the Windows NT family of operating systems, and the ODS-2 and higher levels of the Files-11 file system in OpenVMS.
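How a hierarchy of directories maps names to inodes can be sketched as a walk over a small table. The inode numbers, file names and contents below are invented for illustration, and the layout is a simplification of what a Unix-like file system stores.

```python
# Toy inode table: inode number -> ("dir", {name: inode}) or ("file", bytes).
# Every number and name here is made up for illustration.
INODES = {
    1: ("dir", {"home": 2, "etc": 3}),
    2: ("dir", {"alice": 4}),
    3: ("dir", {"hosts": 5}),
    4: ("dir", {"notes.txt": 6}),
    5: ("file", b"127.0.0.1 localhost\n"),
    6: ("file", b"remember the milk\n"),
}
ROOT_INODE = 1


def resolve(path: str) -> int:
    """Walk an absolute, '/'-separated path and return its inode number."""
    current = ROOT_INODE
    for component in path.split("/"):
        if not component:
            continue  # skip empty parts from leading or repeated slashes
        kind, payload = INODES[current]
        if kind != "dir":
            raise NotADirectoryError(component)
        if component not in payload:
            raise FileNotFoundError(path)
        current = payload[component]
    return current
```

Each directory is itself an entry in the table whose content is a mapping from names to inode numbers, which is what allows directories to nest to arbitrary depth.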

Metadata

In addition to data, the file content, a file system also manages associated metadata.
A file system stores this metadata separately from the content of the file.
Most file systems store the names of all the files in one directory in one place—the directory table for that directory—which is often stored like any other file.
Many file systems put only some of the metadata for a file in the directory table, and the rest of the metadata for that file in a completely separate structure, such as the inode.
Most file systems also store metadata not associated with any one particular file.
Such metadata includes information about unused regions—free space bitmap, block availability map—and information about bad sectors.
Often such information about an allocation group is stored inside the allocation group itself.
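The per-file metadata described above can be inspected without reading the file content. The sketch below uses Python's os.stat; the helper name describe and the choice of fields are assumptions for illustration, and which fields are meaningful varies by platform and file system.

```python
import os
import tempfile


def describe(path: str) -> dict:
    """Return a few metadata fields that the file system keeps for path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                 # length of the content
        "mode_bits": oct(info.st_mode & 0o777),     # permission bits
        "modified": info.st_mtime,                  # last modification time
        "inode": info.st_ino,                       # identifier of the metadata record
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"twelve bytes")
    sample_path = handle.name

metadata = describe(sample_path)
os.remove(sample_path)
```

The file name does not appear in the result of os.stat, which reflects the point that the name lives in the directory while the remaining metadata lives in a separate structure.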
Some file systems, such as NTFS, XFS, ext2, ext3, some versions of UFS, and HFS+, can associate additional attributes with a file using extended file attributes. Some file systems provide for user-defined attributes such as the author of the document, the character encoding of a document or the size of an image.
Some file systems allow different collections of data to be associated with one file name. These separate collections may be referred to as streams or forks. Apple has long used a forked file system on the Macintosh, and Microsoft supports streams in NTFS. Some file systems maintain multiple past revisions of a file under a single file name; the file name by itself retrieves the most recent version, while prior saved versions can be accessed using a special naming convention such as "filename;4" to access the version four saves ago.

Storage space organization

A local file system tracks which areas of storage belong to which file and which are not being used.
When a file system creates a file, it allocates space for data. Some file systems permit or require specifying an initial space allocation and subsequent incremental allocations as the file grows.
To delete a file, the file system records that the file's space is free.
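Tracking which areas belong to which file, and which are free, can be sketched with a free-space bitmap. The BlockBitmap class is a hypothetical model that assumes fixed-size blocks and a lowest-free-block allocation policy; real file systems use more elaborate structures and policies.

```python
class BlockBitmap:
    """Toy free-space bitmap: one flag per block, True means in use."""

    def __init__(self, block_count: int):
        self.used = [False] * block_count
        self.files = {}  # file name -> list of block numbers

    def create(self, name: str, blocks_needed: int) -> list:
        free = [i for i, in_use in enumerate(self.used) if not in_use]
        if len(free) < blocks_needed:
            raise OSError("no space left on device")
        chosen = free[:blocks_needed]  # lowest-numbered free blocks
        for block in chosen:
            self.used[block] = True
        self.files[name] = chosen
        return chosen

    def delete(self, name: str) -> None:
        # Deleting only records the blocks as free; old content is not erased.
        for block in self.files.pop(name):
            self.used[block] = False

    def free_blocks(self) -> int:
        return self.used.count(False)
```

In this model, deletion touches only the bookkeeping, which is why it is fast and why deleted data may remain on the medium until the blocks are reused.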
A local file system manages storage space to provide a level of reliability and efficiency. Generally, it allocates storage device space in a granular manner, usually in multiples of a physical unit. For example, Apple DOS of the early 1980s used a track/sector map to track 256-byte sectors on a 140-kilobyte floppy disk.
Granular allocation results in unused space, sometimes called slack space, for every file whose size is not an exact multiple of the allocation unit. For a 512-byte allocation unit, the average unused space per file is about 256 bytes. For 64 KB clusters, it is about 32 KB. Both figures assume that file sizes are spread evenly relative to the allocation unit.
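The slack-space figures can be checked with a short calculation. The function names are hypothetical, and the average assumes that every remainder of file size modulo the allocation unit is equally likely, which real workloads only approximate.

```python
def slack_bytes(file_size: int, allocation_unit: int) -> int:
    """Bytes allocated to a file but not occupied by its content."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no allocation unit
    units = -(-file_size // allocation_unit)  # ceiling division
    return units * allocation_unit - file_size


def average_slack(allocation_unit: int) -> float:
    """Average slack over file sizes 1..allocation_unit, each equally likely."""
    total = sum(slack_bytes(size, allocation_unit)
                for size in range(1, allocation_unit + 1))
    return total / allocation_unit
```

For example, a 1000-byte file stored in 512-byte units occupies two units, leaving 24 bytes unused. Under the stated assumption, average_slack evaluates to just under half the allocation unit, in line with the approximate figures of 256 bytes and 32 KB.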
Generally, the allocation unit size is set when the storage is configured.
Choosing a size that is small relative to the files stored results in excessive access overhead.
Choosing a relatively large size results in excessive unused space.
Choosing an allocation size based on the average size of files expected to be in the storage tends to minimize unusable space.

Fragmentation

As a file system creates, modifies and deletes files, the underlying storage representation may become fragmented. Files and the unused space between files will occupy allocation blocks that are not contiguous.
A file becomes fragmented if space needed to store its content cannot be allocated in contiguous blocks. Free space becomes fragmented when files are deleted.
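A small simulation shows how deleting and creating files leads to fragmentation. The helper functions and the policy of always taking the lowest-numbered free blocks are assumptions made for illustration; real allocators try harder to keep files contiguous.

```python
def allocate(disk: list, name: str, blocks_needed: int) -> None:
    """Give name the lowest-numbered free blocks, contiguous or not."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < blocks_needed:
        raise OSError("no space left on device")
    for i in free[:blocks_needed]:
        disk[i] = name


def release(disk: list, name: str) -> None:
    """Mark every block owned by name as free."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def extents(disk: list, name: str) -> int:
    """Count the contiguous runs of blocks that belong to name."""
    runs = 0
    previous = None
    for owner in disk:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


disk = [None] * 10
allocate(disk, "a", 2)   # blocks 0-1
allocate(disk, "b", 2)   # blocks 2-3
allocate(disk, "c", 2)   # blocks 4-5
release(disk, "a")       # leaves a two-block hole at the start
allocate(disk, "d", 4)   # does not fit in the hole, so it is split
layout = "".join(owner or "." for owner in disk)
```

After these steps, file d occupies blocks 0-1 and 6-7, so it is stored in two separate runs even though the application sees one continuous file.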
Fragmentation is invisible to the end user and the system still works correctly. However, it can degrade performance on storage hardware that works better with contiguous blocks, such as hard disk drives. Other hardware, such as solid-state drives, is not affected by fragmentation.

Access control

A file system often supports access control of data that it manages.
The intent of access control is often to prevent certain users from reading or modifying certain files.
Access control can also restrict access by program in order to ensure that data is modified in a controlled way. Examples include passwords stored in the metadata of the file or elsewhere and file permissions in the form of permission bits, access control lists, or capabilities. The need for file system utilities to be able to access the data at the media level to reorganize the structures and provide efficient backup usually means that these are only effective for polite users but are not effective against intruders.
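Permission bits can be illustrated with a simplified check in the style of Unix-like systems. The function may_access is hypothetical; it ignores the superuser, access control lists and capabilities, and it assumes that the owner, group and other classes are tested in that order with only the first matching class applied.

```python
import stat


def may_access(mode: int, file_uid: int, file_gid: int,
               uid: int, gids: set, want: str) -> bool:
    """Check want ('r', 'w' or 'x') against Unix-style permission bits."""
    owner_bit, group_bit, other_bit = {
        "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
        "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
        "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
    }[want]
    if uid == file_uid:
        return bool(mode & owner_bit)   # owner class applies exclusively
    if file_gid in gids:
        return bool(mode & group_bit)   # then the group class
    return bool(mode & other_bit)       # everyone else
```

With mode 0o640, the owner may read and write, members of the file's group may only read, and all other users are denied. The check is performed by the file system code itself, so it does not constrain software that reads the storage medium directly.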
Methods for encrypting file data are sometimes included in the file system. This is very effective since there is no need for file system utilities to know the encryption seed to effectively manage the data. The risks of relying on encryption include the fact that an attacker can copy the data and use brute force to decrypt the data. Additionally, losing the seed means losing the data.

Storage quota

Some operating systems allow a system administrator to enable disk quotas to limit a user's use of storage space.
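A disk quota can be sketched as a per-user counter that is checked before space is allocated. The QuotaTable class and its byte-based accounting are assumptions for illustration; actual quota systems differ in what they count and how limits are enforced.

```python
class QuotaTable:
    """Toy per-user quota: refuse allocations that would exceed the limit."""

    def __init__(self, limits: dict):
        self.limits = dict(limits)              # user -> maximum bytes
        self.usage = {user: 0 for user in limits}

    def charge(self, user: str, nbytes: int) -> None:
        # Check before allocating so usage never exceeds the limit.
        if self.usage[user] + nbytes > self.limits[user]:
            raise OSError("disk quota exceeded")
        self.usage[user] += nbytes

    def refund(self, user: str, nbytes: int) -> None:
        # Called when the user's files shrink or are deleted.
        self.usage[user] = max(0, self.usage[user] - nbytes)
```

With a limit of 100 bytes, a user who has stored 60 bytes is refused a further 50 bytes but can store them after deleting 30 bytes.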