Monday, September 13, 2010

All you wanted to know about the ext2 filesystem and its troubleshooting:

Evolution of ext2 Filesystem:
The standard on-disk filesystem used by Linux is called the ext2 filesystem, or ext2fs, for historical reasons. Linux was originally programmed with a Minix-compatible filesystem to ease exchanging data with the Minix development system, but that filesystem was severely restricted by its 14-character filename limit and its maximum filesystem size of 64 MB. The Minix filesystem was superseded by a new filesystem christened the extended filesystem (extfs). It was designed by Rémy Card and implemented in April 1992 as the first filesystem created specifically for the Linux operating system. Its metadata structure was inspired by the traditional Unix File System (UFS), and it was designed to overcome certain limitations of the Minix filesystem.
Rémy Card later redesigned this filesystem to improve performance and scalability and to add a few missing features, which led to the creation of the second extended filesystem (ext2fs), introduced in January 1993. A contender to ext2 was Xiafs, created by Frank Xia and based on the Minix filesystem. Initially, Xiafs was more powerful and more stable than ext2, but being a fairly minimalistic modification of the Minix filesystem, it was not very well suited for future extension. The end result was that Xiafs changed very little while ext2 evolved considerably, rapidly improving in stability and performance and gaining extensions. After some shakedown time, ext2 quickly became the standard filesystem of Linux. Since then, ext2 has developed into a very mature and robust filesystem.
ext2 was the default filesystem in several Linux distributions, including Debian and Red Hat Linux, until supplanted more recently by ext3, which is almost completely compatible with ext2 and is a journaling file system. ext2 is still the filesystem of choice for flash-based storage media (such as SD cards, and USB flash drives) since its lack of a journal minimizes the number of writes and flash devices have only a limited number of write cycles.

ext2 Disk Data Structures:
A filesystem is usually made up of blocks of data. We have two kinds of blocks: physical and logical. Physical blocks are those that reside on the actual storage media, where the data is kept, and have a fixed size. Logical blocks, on the other hand, are those whose size is specified when the filesystem is created. Logical blocks are further divided into smaller logical units called fragments; a logical block consists of an integral number of fragments. The logical block size need not be the same as the physical block size, and it is the job of the filesystem driver to provide the mapping from logical blocks to physical blocks. A single logical block corresponds to an integral number of physical blocks. ext2 supports logical block sizes of 1 KB, 2 KB and 4 KB, with 4 KB being the usual default.


The filesystem volume starts with block 0, the ‘boot’ block, which is present on an ext2 filesystem too. This first block in each ext2 partition is never managed by the ext2 filesystem, because it is reserved for the partition’s boot sector, which might be used to store the boot loader of the Linux operating system. This boot loader can either be loaded by a small program in the MBR or be selected by the user from the list of installed operating systems presented by the boot loader of another operating system.
The blocks on disk are divided into groups. Each of these groups duplicates critical information of the file system. Moreover, the presence of block groups on disk allows the use of efficient disk allocation algorithms.

Each group contains the following:

The Super Block:
In the ext2 filesystem, the superblock is the area that stores information about the number of free blocks, free inodes, logical block size, the number of times the volume has been mounted, and other accounting information about the filesystem. Since it lives on the raw device, it can normally be read directly only by the superuser.
The Superblock contains a description of the basic size and shape of this file system. The information within it allows the file system manager to use and maintain the file system. Usually only the Superblock in Block Group 0 is read when the file system is mounted but each Block Group contains a duplicate copy in case of file system corruption. Amongst other information it holds the:
Magic Number: This allows the mounting software to check that this is indeed the Superblock for an EXT2 file system. For the current version of EXT2 this is 0xEF53.
Revision Level: The major and minor revision levels allow the mounting code to determine whether or not this file system supports features that are only available in particular revisions of the file system. There are also feature compatibility fields which help the mounting code to determine which new features can safely be used on this file system,
Mount Count and Maximum Mount Count: Together these allow the system to determine if the file system should be fully checked. The mount count is incremented each time the file system is mounted and when it equals the maximum mount count the warning message "maximal mount count reached, running e2fsck is recommended" is displayed,
Block Group Number: The Block Group number that holds this copy of the Superblock,
Block Size: The size of the block for this file system in bytes, for example 1024 bytes,
Blocks per Group: The number of blocks in a group. Like the block size this is fixed when the file system is created,
Free Blocks: The number of free blocks in the file system,
Free Inodes: The number of free Inodes in the file system,
First Inode: This is the inode number of the first inode available for ordinary files; the inodes below it are reserved. The root ('/') directory of an EXT2 file system always uses the reserved inode number 2.
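
To make the field list above a little more concrete, here is a minimal Python sketch that pulls a few of these values out of a raw superblock. It assumes the usual ext2 on-disk layout (little-endian fields, primary superblock starting 1024 bytes into the volume, magic number at byte offset 56 of the superblock). The function names parse_superblock and read_superblock are made up for this illustration, and the sketch is no substitute for dumpe2fs.

```python
import struct

EXT2_MAGIC = 0xEF53
SUPERBLOCK_OFFSET = 1024  # assumed: primary superblock starts 1024 bytes into the volume
SUPERBLOCK_SIZE = 1024


def parse_superblock(sb: bytes) -> dict:
    """Decode a handful of superblock fields (hypothetical helper)."""
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock (magic=0x%04X)" % magic)

    free_blocks, free_inodes = struct.unpack_from("<2I", sb, 12)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    # The maximum mount count is signed; a value <= 0 disables the check.
    mnt_count, max_mnt_count = struct.unpack_from("<Hh", sb, 52)
    (block_group_nr,) = struct.unpack_from("<H", sb, 90)

    return {
        "block_size": 1024 << log_block_size,  # stored as log2(size) - 10
        "blocks_per_group": blocks_per_group,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "mount_count": mnt_count,
        "max_mount_count": max_mnt_count,
        "block_group_nr": block_group_nr,
        # Mirrors the "maximal mount count reached" condition described above.
        "needs_check": max_mnt_count > 0 and mnt_count >= max_mnt_count,
    }


def read_superblock(device_path: str) -> dict:
    """Read the primary superblock from a device or image file."""
    with open(device_path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(dev.read(SUPERBLOCK_SIZE))
```

On a real system, read_superblock would be pointed at an unmounted partition or an image file, and reading a device node needs root privileges.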

Group Descriptors:
Immediately following the Super block on the disk are the group descriptors. They hold critical information about the groups in the filesystem. Each group descriptor describes one group.
Each Block Group has a data structure describing it. Like the Superblock, all the group descriptors for all of the Block Groups are duplicated in each Block Group in case of file system corruption.
Each Group Descriptor contains the following information:
Blocks Bitmap: The block number of the block allocation bitmap for this Block Group. This is used during block allocation and deallocation,
Inode Bitmap: The block number of the inode allocation bitmap for this Block Group. This is used during inode allocation and deallocation,
Inode Table: The block number of the starting block for the inode table for this Block Group. Each inode is represented by the EXT2 inode data structure described below,
Free blocks count, Free Inodes count, Used directory count: The number of free blocks, free inodes and directories in this Block Group.
The group descriptors are placed one after another and together they make up the group descriptor table. Each Block Group contains the entire table of group descriptors after its copy of the Superblock. Only the first copy (in Block Group 0) is actually used by the EXT2 file system. The other copies are there, like the copies of the Superblock, in case the main copy is corrupted.
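
As a rough illustration of the table just described, the following Python sketch decodes a group descriptor table from raw bytes. It assumes the classic 32-byte ext2 descriptor: three 32-bit block numbers followed by three 16-bit counters, all little-endian, with the rest being padding. The names GroupDescriptor, count_groups and parse_group_descriptors are invented for this example.

```python
import struct
from typing import List, NamedTuple

GROUP_DESC_SIZE = 32  # assumed size of one classic ext2 group descriptor


class GroupDescriptor(NamedTuple):
    block_bitmap: int       # block number of the block allocation bitmap
    inode_bitmap: int       # block number of the inode allocation bitmap
    inode_table: int        # first block of the inode table
    free_blocks_count: int
    free_inodes_count: int
    used_dirs_count: int


def count_groups(blocks_count: int, first_data_block: int,
                 blocks_per_group: int) -> int:
    """Number of block groups, rounding the last partial group up."""
    return -(-(blocks_count - first_data_block) // blocks_per_group)


def parse_group_descriptors(table: bytes, group_count: int) -> List[GroupDescriptor]:
    """Decode group_count descriptors laid out one after another."""
    descriptors = []
    for i in range(group_count):
        fields = struct.unpack_from("<3I3H", table, i * GROUP_DESC_SIZE)
        descriptors.append(GroupDescriptor(*fields))
    return descriptors
```

The three arguments of count_groups correspond to values stored in the Superblock, which is why the Superblock has to be read before the descriptor table can be interpreted.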

Block Bitmap of the Group: In order to account for the usage of the blocks on the filesystem, each group in an ext2 filesystem has a block bitmap. This keeps track of blocks that have been used and those that are free. Each bit in the Block Bitmap denotes an integral number of fragments. So if a bit is allocated to a file and marked as used, then an entire set of fragments is allocated to it.
The Block Bitmap is a clever way to keep track of which blocks are free and which are in use. To look for a block, one first determines the group to which the file belongs; the Block Bitmap of that group is then selected and searched for the required block.
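
The bitmap lookup can be sketched in a few lines of Python. This assumes the usual ext2 convention that bit 0 of byte 0 stands for the first block of the group and that a set bit means "in use"; the helper names is_block_used and find_first_free are made up for this example, and a real allocator is considerably smarter about where it starts searching.

```python
from typing import Optional


def is_block_used(bitmap: bytes, index: int) -> bool:
    """True if the bit for block `index` (relative to the group) is set."""
    return bool(bitmap[index // 8] & (1 << (index % 8)))


def find_first_free(bitmap: bytes, blocks_in_group: int) -> Optional[int]:
    """Return the first free block in the group, or None if the group is full."""
    for index in range(blocks_in_group):
        if not is_block_used(bitmap, index):
            return index
    return None


# Example: blocks 0..10 are in use, so block 11 is the first free one.
example_bitmap = bytes([0xFF, 0x07])
first_free = find_first_free(example_bitmap, 16)
```

The same two helpers work unchanged for an inode bitmap, since it uses one bit per inode in the same way.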
Inode Bitmap of the Group: Similar to the blocks, the inodes assigned to the various files need to be taken into account. The inode stores all information about a file except its name. A fixed number of blocks are usually allocated for storing the Inode Table, which stores all the file inodes. The inodes also contain pointers to the disk blocks where the actual files are stored. The Inode Bitmap is provided to keep track of allocated inodes and to free them when files are deleted from the system.
Inode Table of the group: The Inode Table contains an array of records (inodes), each of which is a structure containing meta information about a file. This information includes the file size, pointers to the disk blocks containing the file, the inode change, access and modification times, the number of links to the file, as well as the user and group IDs of the file. Note that the filename is not stored in the inode; it lives in the directory entry that refers to the inode. The pointers to disk blocks are arranged in such a manner that they can be easily accessed.
The inode contains 15 pointers to blocks. Of these pointers, the first 12 are direct pointers to data. The 13th is a single indirect pointer: it points to a ‘block of pointers’ to ‘data’. The 14th points to ‘a block of pointers’ to ‘blocks of pointers’ to ‘data’ (double indirect). And the 15th points to ‘a block of pointers’ to ‘blocks of pointers’ to ‘blocks of pointers’ to ‘data’ (triple indirect). After these data structures follow the disk blocks on which the actual blocks of data are stored.
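
The pointer scheme is easier to follow with numbers. The Python sketch below works out which of the 15 pointers covers a given block of a file, and which slots are followed inside each indirect block. It assumes 32-bit (4-byte) block pointers, so a 1 KB block holds 256 of them; the function name locate_block is hypothetical.

```python
from typing import List, Tuple

DIRECT_POINTERS = 12
POINTER_SIZE = 4  # assumed: block numbers are stored as 32-bit values


def locate_block(file_block: int, block_size: int = 1024) -> Tuple[str, List[int]]:
    """Map a file-relative block index to (level, path).

    path[0] is the index into the inode's 15 pointers; any further
    entries are the slots to follow inside each indirect block.
    """
    if file_block < 0:
        raise ValueError("block index must be non-negative")
    ptrs = block_size // POINTER_SIZE  # pointers held by one indirect block

    if file_block < DIRECT_POINTERS:
        return "direct", [file_block]

    n = file_block - DIRECT_POINTERS
    if n < ptrs:
        return "single indirect", [12, n]

    n -= ptrs
    if n < ptrs ** 2:
        return "double indirect", [13, n // ptrs, n % ptrs]

    n -= ptrs ** 2
    if n < ptrs ** 3:
        return "triple indirect", [14, n // ptrs ** 2, (n // ptrs) % ptrs, n % ptrs]

    raise ValueError("block index is beyond what the inode can address")
```

With 1 KB blocks, file blocks 0 to 11 are direct, 12 to 267 go through the single indirect block, and the double indirect range starts at block 268, which shows why small files are the cheapest to read.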



In the EXT2 file system, the inode is the basic building block; every file and directory in the file system is described by one and only one inode. The EXT2 inodes for each Block Group are kept in the inode table together with a bitmap that allows the system to keep track of allocated and unallocated inodes. Figure 9.2 shows the format of an EXT2 inode; amongst other information, it contains the following fields:
Mode: This holds two pieces of information; what this inode describes and the permissions that users have to it. For EXT2, an inode can describe one of file, directory, symbolic link, block device, character device or FIFO.
Owner Information: The user and group identifiers of the owners of this file or directory. This allows the file system to correctly allow the right sort of accesses,
Size: The size of the file in bytes,
Timestamps: The time that the inode was created and the last time that it was modified,
Datablocks: Pointers to the blocks that contain the data that this inode is describing. The first twelve are pointers to the physical blocks containing the data described by this inode and the last three pointers contain more and more levels of indirection. For example, the double indirect blocks pointer points at a block of pointers to blocks of pointers to data blocks. This means that files less than or equal to twelve data blocks in length are more quickly accessed than larger files.
It should be noted that EXT2 inodes can describe special device files. These are not real files but handles that programs can use to access devices. All of the device files in /dev are there to allow programs to access Linux's devices. For example, the mount program takes as an argument the device file that it wishes to mount.
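
The Mode field packs the file type and the permission bits into one number, usually printed in octal (the 100644 that appears in the lsdel listing later in this post is such a value). As a small sketch, Python's standard stat module can take it apart; describe_mode is just a name chosen for this example.

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the file type and an ls-style permission string for a mode."""
    if stat.S_ISREG(mode):
        kind = "regular file"
    elif stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISLNK(mode):
        kind = "symbolic link"
    elif stat.S_ISBLK(mode):
        kind = "block device"
    elif stat.S_ISCHR(mode):
        kind = "character device"
    elif stat.S_ISFIFO(mode):
        kind = "FIFO"
    else:
        kind = "other"
    return "%s %s" % (kind, stat.filemode(mode))


# Mode values are conventionally written in octal.
print(describe_mode(0o100644))  # regular file -rw-r--r--
print(describe_mode(0o040755))  # directory drwxr-xr-x
```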

Directories:
A directory is used for the logical organization and security of files on a file system. It is a standard feature of most file systems. On an ext2-based system, a directory has a specific structure. For each entry, this structure contains the inode number associated with it, the length of the record, the length of the filename and the name itself. The first two entries in every directory are ‘.’ and ‘..’ respectively.
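
Here is a minimal Python sketch of walking the entries in one directory block. It assumes the common on-disk entry layout: a 32-bit inode number, a 16-bit record length, an 8-bit name length and one further byte (a file type on newer filesystems), followed by the name itself, all little-endian. The function name parse_dir_entries is invented for this example.

```python
import struct
from typing import List, Tuple

DIRENT_HEADER_SIZE = 8  # inode (4) + rec_len (2) + name_len (1) + 1 more byte


def parse_dir_entries(block: bytes) -> List[Tuple[int, str]]:
    """Return (inode number, name) for every live entry in a directory block."""
    entries = []
    offset = 0
    while offset + DIRENT_HEADER_SIZE <= len(block):
        inode, rec_len, name_len = struct.unpack_from("<IHB", block, offset)
        if rec_len < DIRENT_HEADER_SIZE:
            break  # corrupt record length; stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused entry
            start = offset + DIRENT_HEADER_SIZE
            name = block[start:start + name_len].decode("utf-8", "replace")
            entries.append((inode, name))
        offset += rec_len  # the record length leads to the next entry
    return entries
```

Because each entry carries its own record length, an entry can be removed simply by letting the previous entry's record length swallow it, which is why the parser follows rec_len instead of assuming fixed-size records.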

Easy operations:
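It is interesting to note how some of these features make common operations on a filesystem a lot easier. Take the example of moving a file or a directory to another directory on the same filesystem. No data blocks are copied: the directory entry is removed from the old parent directory and an entry with the same inode number is added to the new one. When the item being moved is itself a directory, the inode number stored in its ‘..’ entry is also changed to point to the new parent. Creating a symlink simply requires a new inode of type symbolic link whose data is the path of the target; no change is made to the target file itself.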

Access Control Lists:
The ext2 filesystem provides support for Access Control Lists (ACLs), although they are not fully used. An ACL is a list, attached to a resource, of the users who may request access to that resource and the kind of access each of them is allowed. Accesses are of three types: read, write and execute. The list stores the permissions on the resource for each of its users, and any access to the resource is granted only after checking with the Access Control List. Once access is permitted, the ACL provides mutual exclusion for resources, although it may allow read/share access depending on the policies.

Troubleshooting the ext2 Filesystem:

A disk may develop bad blocks. When one or more sectors become irrevocably corrupted, e.g. for physical reasons such as decay of the disk surface or mechanical shock, the affected sectors are marked as ‘bad’ and remapped onto spare sectors provided by the disk itself. This is very common, and utilities such as smartd can report the number of bad sectors present on a disk. As a general rule, the disk itself takes care of these by remapping them to free spare sectors.
In order to force check for Bad blocks, we can use the badblocks command:

# badblocks -b 4096 -svn -o badblocks.out /dev/sda1

This command checks the device for bad sectors/blocks and, if it finds any, stores their block numbers in the file badblocks.out. Note that it is necessary to specify the block size of the filesystem in order to get block numbers that match the ones the filesystem uses. We can use this file later to perform a filesystem check:

# e2fsck -l badblocks.out /dev/sda1

In this way e2fsck adds the blocks on the list to the filesystem's bad block list, so that they are never allocated to a file. However, it is much more convenient and safer to run e2fsck with the -cc option (a single -c performs a read-only scan; giving -c twice performs a safe, non-destructive read-write test for bad blocks).

# e2fsck -cc /dev/sda1

Sometimes it may be necessary to perform very delicate operations on files or directories. Examples include salvaging what is still recoverable from a corrupted filesystem or working at a lower level with blocks and inodes. The debugfs utility is used for this purpose.
The debugfs utility may be used for undeleting accidentally removed files. However, this will work only on an ext2 filesystem, and not on ext3. In fact, when a file is deleted from an ext3 filesystem, not only does the inode get unlinked from the associated filename, but its block pointers are also zeroed. As a result, the data will still be on the disk but without anything pointing to it. Without using a third-party application such as foremost, which scans the entire block space in search of certain patterns, it is unlikely that you can recover any data once you remove a file on an ext3 filesystem.
It is often said that an ext3 filesystem is simply an ext2 filesystem with a journal; we now know that this is not the only significant difference.

When attempting to recover a deleted file from an ext2 filesystem, follow these steps:
1. Unmount the filesystem:

#umount /dev/sda1

2. Open the filesystem in read/write mode for debugging purposes:

# debugfs -w /dev/sda1

This will return the debugfs command prompt.

3. Use the lsdel command to determine if any deleted inodes exist. Save the data into a file so that we can recover the associated data, and then exit out of debugfs:

debugfs: lsdel

 Inode  Owner    Mode   Size  Blocks  Time deleted
  6044      0  100644      5     1/1  Mon Feb 18 16:30:00 2010

debugfs: dump <6044> /root/output.file

debugfs: quit

4. Analyse the resulting file:

#file /root/output.file

/root/output.file: OpenDocument Spreadsheet

The debugfs command has many uses. For example, you can determine which inode is associated with a filesystem block:

debugfs: icheck 48091
Block Inode number
48091 10085

Once you know the inode number, you can determine the filename associated with the inode:

debugfs: ncheck 10085
Inode Pathname
10085 /vmlinuz-2.6.18-53.1.6.el5
