Ext4


ext4 is a journaling file system for Linux, developed as the successor to ext3.
ext4 was initially a series of backward-compatible extensions to ext3, many of them originally developed by Cluster File Systems for the Lustre file system between 2003 and 2006, meant to extend storage limits and add other performance improvements. However, other Linux kernel developers opposed accepting extensions to ext3 for stability reasons, and proposed to fork the source code of ext3, rename it as ext4, and perform all the development there, without affecting existing ext3 users. This proposal was accepted, and on 28 June 2006, Theodore Ts'o, the ext3 maintainer, announced the new plan of development for ext4.
A preliminary development version of ext4 was included in version 2.6.19 of the Linux kernel. On 11 October 2008, the patches that mark ext4 as stable code were merged in the Linux 2.6.28 source code repositories, denoting the end of the development phase and recommending ext4 adoption. Kernel 2.6.28, containing the ext4 filesystem, was finally released on 25 December 2008. On 15 January 2010, Google announced that it would upgrade its storage infrastructure from ext2 to ext4. On 14 December 2010, Google also announced it would use ext4, instead of YAFFS, on Android 2.3.
Its improvements over ext3 include a date range that ends in the year 2446 instead of 2038, a timestamp accuracy of a nanosecond instead of one second, and higher size limits.
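The two end years can be reproduced with a little arithmetic. The sketch below assumes that ext3 stores seconds as a signed 32-bit count from the Unix epoch, and that ext4 widens this with two extra epoch bits (leaving 30 bits for nanoseconds). The helper name `last_representable` is purely illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(max_seconds: int) -> datetime:
    """Return the UTC instant that lies max_seconds after the Unix epoch."""
    return EPOCH + timedelta(seconds=max_seconds)


# ext3 (assumed): signed 32-bit seconds counter.
EXT3_MAX_SECONDS = 2**31 - 1

# ext4 (assumed layout): two extra epoch bits add up to 3 * 2**32 seconds
# on top of the signed 32-bit value.
EXT4_MAX_SECONDS = (2**31 - 1) + 3 * 2**32

# 30 bits are enough to hold every nanosecond value below one second.
assert 2**30 > 999_999_999

ext3_end = last_representable(EXT3_MAX_SECONDS)
ext4_end = last_representable(EXT4_MAX_SECONDS)
print(ext3_end.year, ext4_end.year)
```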

Adoption

ext4 is the default file system for many Linux distributions including Debian and Ubuntu.

Features

; Large file system
; Extents
; Backward compatibility
; Persistent pre-allocation
; Delayed allocation
; Unlimited number of subdirectories
; Journal checksums
; Metadata checksumming
; Faster file-system checking
; Multiblock allocator
; Improved timestamps
; Project quotas
; Transparent encryption
; Lazy initialization
; Write barriers

Limitations

In 2008, the principal developer of the ext3 and ext4 file systems, Theodore Ts'o, stated that although ext4 has improved features, it is not a major advancement, it uses old technology, and is a stop-gap. Ts'o believes that Btrfs is the better direction because "it offers improvements in scalability, reliability, and ease of management". Btrfs also has "a number of the same design ideas that reiser3/4 had". However, ext4 has continued to gain new features such as file encryption and metadata checksums.
The ext4 file system does not honor the "secure deletion" file attribute, which is supposed to cause overwriting of files upon deletion. A patch to implement secure deletion was proposed in 2011, but did not solve the problem of sensitive data ending up in the file-system journal.

Delayed allocation and potential data loss

Because delayed allocation changes the behavior that programmers have been relying on with ext3, the feature poses some additional risk of data loss in cases where the system crashes or loses power before all of the data has been written to disk. Due to this, ext4 in kernel versions 2.6.30 and later automatically handles these cases as ext3 does.
The typical scenario in which this might occur is a program replacing the contents of a file without forcing a write to the disk with fsync. There are two common ways of replacing the contents of a file on Unix systems: fd=open; write; close; (rewriting the existing file) and fd=open; write; close; rename; (writing a new file and renaming it over the old one).
Using fsync more often to reduce the risk for ext4 could lead to performance penalties on ext3 filesystems mounted with the data=ordered flag. Given that both file systems will be in use for some time, this complicates matters for end-user application developers. In response, ext4 in Linux kernels 2.6.30 and newer detects the occurrence of these common cases and forces the files to be allocated immediately. For a small cost in performance, this provides semantics similar to ext3 ordered mode and increases the chance that either version of the file will survive the crash. This new behavior is enabled by default, but can be disabled with the "noauto_da_alloc" mount option.
The new patches have become part of the mainline kernel 2.6.30, but various distributions chose to backport them to 2.6.28 or 2.6.29.
These patches neither completely prevent potential data loss nor help at all with new files. The only way to be safe is to write and use software that calls fsync when it needs to. Performance problems can be minimized by making the crucial disk writes that need fsync less frequent.
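A minimal sketch of the write-then-rename pattern with explicit fsync calls follows. It assumes a POSIX system (opening a directory to fsync it does not work on every platform). The function name `replace_file_durably` and the `.new` suffix are illustrative choices, not part of any ext4 interface.

```python
import os


def replace_file_durably(path: str, data: bytes) -> None:
    """Replace the contents of path so that a crash leaves the old or new version."""
    tmp_path = path + ".new"  # hypothetical temporary name next to the target
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # write() may be partial
            view = view[written:]
        os.fsync(fd)  # force the new data to disk before the rename
    finally:
        os.close(fd)

    os.replace(tmp_path, path)  # atomic rename over the old file

    # Also flush the directory entry so the rename itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling fsync before the rename is what removes the dependence on the file system's allocation timing. The cost is one synchronous flush per replacement, which is why such writes are best kept infrequent.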

Implementation

The Linux kernel Virtual File System (VFS) is a subsystem or layer inside the Linux kernel. It is the result of an attempt to integrate multiple file systems into a single orderly structure. The key idea, which dates back to the pioneering work done by Sun Microsystems employees in 1986, is to abstract out the part of the file system that is common to all file systems and put that code in a separate layer that calls the underlying concrete file systems to actually manage the data.
All system calls related to files are directed to the Linux kernel Virtual File System for initial processing. These calls, coming from user processes, are the standard POSIX calls, such as open, read, write, lseek, etc.
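As an illustration, the following sketch issues open, write, lseek, and read through Python's thin `os` wrappers. The same code runs unchanged whether the temporary directory lives on ext4 or on another file system, because the caller only sees the common interface. The function name `roundtrip` and the file name `demo.bin` are illustrative.

```python
import os
import tempfile


def roundtrip(payload: bytes) -> bytes:
    """Write payload to a scratch file and read it back using POSIX-style calls."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "demo.bin")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # open
        try:
            os.write(fd, payload)                # write
            os.lseek(fd, 0, os.SEEK_SET)         # lseek back to the start
            return os.read(fd, len(payload))     # read
        finally:
            os.close(fd)
```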

Interoperability

Although designed for and primarily used with Linux, an ext4 file system can be accessed from other operating systems via interoperability tools.
Windows provides access via its Windows Subsystem for Linux technology. Specifically, the second major version, WSL 2, is the first version with ext4 support. It was first released in Windows 10 Insider Preview Build 20211. WSL 2 requires Windows 10 version 1903 or higher, with build 18362 or higher, for x64 systems, and version 2004 or higher, with build 19041 or higher, for ARM64 systems.
Paragon Software offers commercial products that provide full read/write access to ext2/3/4: Linux File Systems for Windows and extFS for Mac.
The free software ext4fuse provides limited support.

General Architecture

The ext4 filesystem divides the partition it resides on into smaller chunks called blocks. By default, the block size is the same as the page size, but it can be configured during filesystem creation. Blocks are grouped into larger chunks called block groups.

Superblock

The superblock is the heart of the filesystem; it resides in only one block of the disk. It is usually the first item in a block group, except for group 0, where the first few bytes are reserved for the boot sector. Because the superblock is vital for the filesystem, backup copies are written to other block groups at filesystem creation time, so it can be recovered in case of corruption.
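A hedged sketch of reading a few superblock fields from a raw image is shown below. It assumes the primary superblock starts 1024 bytes into the filesystem, that fields are little-endian, and that the magic number 0xEF53 sits at offset 0x38 with the block-size exponent at offset 0x18. The function name `parse_superblock` and the returned dictionary keys are illustrative.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # assumed: space before it is left for boot code
SUPERBLOCK_SIZE = 1024
EXT4_MAGIC = 0xEF53


def parse_superblock(image: bytes) -> dict:
    """Extract a handful of fields from the primary superblock of an image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("image too small to contain a superblock")

    inodes_count, blocks_count_lo = struct.unpack_from("<II", sb, 0x00)
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)
    (magic,) = struct.unpack_from("<H", sb, 0x38)

    if magic != EXT4_MAGIC:
        raise ValueError(f"bad magic number: {magic:#06x}")

    return {
        "inodes_count": inodes_count,
        "blocks_count_lo": blocks_count_lo,    # low 32 bits only
        "block_size": 1024 << log_block_size,  # stored as a power-of-two exponent
    }
```

On a real system the input could be the first few kilobytes of a filesystem image file; reading a block device directly normally requires elevated privileges.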

Group Descriptor Table

The Group Descriptor Table (GDT) comes immediately after the superblock. It stores the block group descriptor of each block group on the filesystem and usually spans more than one block on disk. Each descriptor is 64 bytes in size when the 64-bit feature is enabled (32 bytes otherwise). This structure is also vital for the filesystem; as such, redundant backups are stored across the filesystem.

Block Bitmap

The block bitmap tracks the usage status of all blocks in a block group. Each bit in the bitmap represents a block. If a block is in use, its corresponding bit is set; otherwise, it is unset. The location of the block bitmap is not fixed, so its position is stored in the respective block group descriptor.
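The bit-per-block idea can be sketched as follows. The example assumes the bitmap occupies exactly one block, so it can track 8 × block size blocks, and that bit 0 of byte 0 corresponds to the first block of the group. The class `BlockBitmap` and its method names are illustrative, not kernel interfaces.

```python
from typing import Optional


class BlockBitmap:
    """Toy in-memory model of one block group's block bitmap."""

    def __init__(self, block_size: int = 4096) -> None:
        self.bits = bytearray(block_size)  # all blocks start out free

    @property
    def blocks_tracked(self) -> int:
        return len(self.bits) * 8  # one bit per block

    def set_used(self, block: int) -> None:
        self.bits[block // 8] |= 1 << (block % 8)

    def set_free(self, block: int) -> None:
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def first_free(self) -> Optional[int]:
        """Return the lowest free block number, or None if the group is full."""
        for block in range(self.blocks_tracked):
            if not self.is_used(block):
                return block
        return None
```

Under these assumptions, a 4096-byte block size gives 32768 blocks per group, which is 128 MiB of data per block group.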

Inode Bitmap

As with the block bitmap, the location of the inode bitmap is not fixed; the group descriptor therefore points to it. The inode bitmap tracks the usage of inodes. Each bit in the bitmap represents an inode. If an inode is in use, its corresponding bit in the inode bitmap is set; otherwise, it is unset.

Block Group Descriptors

Each block group is represented by its block group descriptor. It holds vital information about the block group, such as the number of free inodes and free blocks and the locations of the inode bitmap, the block bitmap, and the inode table of that particular block group.
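A sketch of decoding those fields from the start of a raw descriptor is given below. It assumes a little-endian layout in which the first 16 bytes hold the low 32 bits of the three locations followed by 16-bit free-block and free-inode counts; the high halves used on 64-bit filesystems are ignored. The names `GroupDescriptor` and `parse_group_descriptor` are illustrative.

```python
import struct
from typing import NamedTuple


class GroupDescriptor(NamedTuple):
    block_bitmap: int   # block number of the block bitmap (low 32 bits)
    inode_bitmap: int   # block number of the inode bitmap (low 32 bits)
    inode_table: int    # first block of the inode table (low 32 bits)
    free_blocks: int    # free block count (low 16 bits)
    free_inodes: int    # free inode count (low 16 bits)


def parse_group_descriptor(raw: bytes) -> GroupDescriptor:
    """Decode the leading fields of one block group descriptor."""
    if len(raw) < 16:
        raise ValueError("descriptor too short")
    return GroupDescriptor(*struct.unpack_from("<IIIHH", raw, 0))
```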

Flexible block groups

Ext4 introduced flexible block groups. In a flexible block group, several block groups are grouped into one logical block group. The block bitmap and inode bitmap of the first block group are expanded to include the bitmaps and the inode tables of the other block groups.