Understanding Unix Filesystems
02/28/2001
In last week's article, we viewed a PC's BIOS partition table and its Unix partition table using the disklabel utilities. Let's continue this week by looking at the newfs utility and inode tables. The newfs utility actually formats your slice with the filesystems you previously specified with the disklabel utility. Let's start by taking a closer look at formatting and filesystems in general so we can gain a better appreciation of what newfs does.
There are two types of hard-drive formatting. When you purchased your hard drive, it most likely was already "low level" formatted for you by the manufacturer. Low-level formatting creates the tracks and sectors on the drive; the intersection of these tracks and sectors creates the units of storage known as physical blocks, which are 512 bytes in size.
The second type of formatting is called "high-level" formatting. This type of formatting installs a particular file system onto a slice of your physical drive using a utility such as DOS's format or FreeBSD's newfs. Some examples of file systems are FAT16, FAT32, NTFS, and FFS. Different file systems may vary in performance, but they usually have two features in common:
- They require some type of table to map block addresses to the files contained within the blocks.
- They may also use a "logical" block addressing scheme to try to optimize read/write performance.
Let's pretend you're a file system for a moment. Your goal is to quickly store (write) and find (read) data given the following physical limitations:
- You've been assigned an area called a "cylinder."
- Your cylinder has 255 horizontal lines running through it (tracks) and also 63 vertical lines running through it (sectors).
- Where these lines intersect, a storage unit (block) has been created for you to place files into; every block is the same size (512 bytes).
So, how many storage blocks do you have on your cylinder?
255 * 63 = 16,065
If you put one file in each storage block, you can save up to 16,065 files. If you create a table for yourself and number it from 1 to 16,065, you can simply record the name of each file next to a free number as you save the file to the block represented by that number. If you delete a file, you have to remember to remove its name from your table. If you want to move a file, you can look it up in your table, erase it from the old location, and write its name next to its new block number. You would also quickly learn that you didn't have to go to all the trouble of physically moving a file from one storage block to another storage block; it is much easier to simply change the entry for that file in your table.
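The bookkeeping described above can be sketched in a few lines of Python. This is only a toy model under the article's simplifying assumption of one file per block; the BlockTable class and its method names are hypothetical and do not correspond to any real filesystem interface.

```python
# Toy model of the table described above: block number -> file name.
# BlockTable and its method names are hypothetical, chosen for illustration.

TRACKS = 255
SECTORS = 63
TOTAL_BLOCKS = TRACKS * SECTORS  # 16,065 blocks


class BlockTable:
    def __init__(self, total_blocks=TOTAL_BLOCKS):
        self.total_blocks = total_blocks
        self.entries = {}  # block number (starting at 1) -> file name

    def save(self, name):
        """Write the name next to the lowest free block number."""
        for block in range(1, self.total_blocks + 1):
            if block not in self.entries:
                self.entries[block] = name
                return block
        raise OSError("no free blocks left")

    def find(self, name):
        """Return the block number recorded for this file name."""
        for block, owner in self.entries.items():
            if owner == name:
                return block
        raise FileNotFoundError(name)

    def delete(self, name):
        """Remove the name from the table, which frees its block."""
        del self.entries[self.find(name)]

    def rename(self, old_name, new_name):
        """Change only the table entry; the data stays in its block."""
        self.entries[self.find(old_name)] = new_name


table = BlockTable()
print(table.save("notes.txt"))   # 1
print(table.save("todo.txt"))    # 2
table.delete("notes.txt")
print(table.save("letter.txt"))  # 1, reusing the freed block
```

Note that deleting and renaming never touch the stored data; only the table changes, which is the shortcut the paragraph above points out.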
In its simplest form, this is how all file systems keep track of your files. If this were a Unix file system, that table would be called the inode table.
Unfortunately, simplicity results in lousy hard disk usage. The ability to save files in 512-byte storage units would be great if every file created by users was 512 bytes in size. But, as you know, files vary greatly in size, from just a few bytes to several kilobytes.
Still thinking as a file system, how would you save a file that was 10 bytes in size? If you simply place this file by itself into a storage block, you've wasted 502 bytes of hard disk space. Save enough small files, and you end up wasting a lot of your disk space. What would happen if you saved 16,065 one-byte files? You would use up all your blocks with 16,065 bytes worth of data. Even though there may be several MB of disk space on your cylinder, you have run out of the blocks to place files into. This is called running out of inodes (or inode table entries), and it is not a good thing.
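The arithmetic behind this waste is easy to check. The following sketch assumes the article's model of 512-byte blocks with at most one file per block; the function name wasted_bytes is made up for illustration.

```python
BLOCK_SIZE = 512          # bytes per physical block
TOTAL_BLOCKS = 255 * 63   # 16,065 blocks on the example cylinder


def wasted_bytes(file_size, block_size=BLOCK_SIZE):
    """Unused bytes in the block(s) holding a file of file_size bytes."""
    blocks_used = -(-file_size // block_size)  # round up to whole blocks
    return blocks_used * block_size - file_size


# A 10-byte file alone in a 512-byte block.
print(wasted_bytes(10))  # 502

# Filling every block with a one-byte file.
capacity = TOTAL_BLOCKS * BLOCK_SIZE
stored = TOTAL_BLOCKS * 1
print(capacity)           # 8225280 bytes, a little under 8 MB
print(capacity - stored)  # 8209215 bytes left unusable
```

In the one-byte case the table is full even though well over 99 percent of the raw space holds no data.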
Continuing to think like a filesystem, it would make sense not to devote an entire physical block to one small file. However, you now have to re-think how you're going to organize your table to deal with the fact that there may be more than one file in a block. If you simply start stuffing in as many files as will fit into a 512-byte block, how are you going to keep track of where one file ends and another file begins? What if you remove a 10-byte file and replace it with an 8-byte file -- how will you keep track of that extra two bytes in case you want to stuff in two more 1-byte files? You should be able to see that such a scheme would quickly become unworkable.
Most filesystems use the concept of "fragments." A fragment is a logical division of a block. Each fragment will be assigned an address so it can have an entry in the filesystem table. As a simple example, a filesystem may choose to divide each physical block into four "fragments." This effectively multiplies your number of blocks by four while dividing the block size by four. For example, if you started with 16,065 physical blocks that were each 512 bytes in size, dividing each block into four fragments would give you 64,260 logical blocks that were each 128 bytes in size. Each fragment can be treated as a storage block and only store one file, meaning the table now has 64,260 entries, but you still don't have to worry about keeping track of multiple files per logical storage unit.
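As a rough check of those numbers, here is a small sketch that splits blocks into fragments. It follows the simplified model used in this article, where each fragment holds at most one file; the function name split_into_fragments is hypothetical.

```python
def split_into_fragments(total_blocks, block_size, fragments_per_block):
    """Return (number of fragments, size of each fragment in bytes)."""
    if block_size % fragments_per_block != 0:
        raise ValueError("block size must divide evenly into fragments")
    return (total_blocks * fragments_per_block,
            block_size // fragments_per_block)


count, size = split_into_fragments(16065, 512, 4)
print(count, size)  # 64260 128

# The 10-byte file from earlier now wastes far less space.
print(size - 10)    # 118 bytes unused, instead of 502
```

The trade-off is a table with four times as many entries to maintain.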
Now let's look at the other end of the scale. What happens if you need to store files larger than your physical or logical block size? You're obviously going to need to use more than one block to store that file. Pretend you need to save a 1,000-byte file. This will require two physical blocks (512 * 2), so you will need to make two entries in your table for this one file. You'll also have to re-think how you are going to make those entries, as order is now important. It's not enough to know that this file lives in, say, blocks 3 and 4; you also need to know that the first 512 bytes of that file live in block 3, and the remainder of that file lives in block 4.
If you are a filesystem that chose to use fragments, your job is actually harder when you need to save a large file. If you save that same 1,000-byte file using 128-byte fragments (four per block), you'll have to make eight entries in your table and ensure that you remember which order to keep those eight entries in.
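To make the ordering requirement concrete, this sketch builds the ordered list of table entries for one file. It assumes, purely for simplicity, that the file is given consecutive storage units starting at a chosen address; a real filesystem makes no such promise, and the layout function is a hypothetical helper.

```python
def layout(file_size, unit_size, first_unit):
    """Return ordered (unit address, bytes stored) pairs for one file.

    Assumes consecutive units starting at first_unit (an illustration only).
    """
    entries = []
    remaining = file_size
    unit = first_unit
    while remaining > 0:
        stored = min(unit_size, remaining)
        entries.append((unit, stored))
        remaining -= stored
        unit += 1
    return entries


# A 1,000-byte file in 512-byte blocks, starting at block 3.
print(layout(1000, 512, 3))       # [(3, 512), (4, 488)]

# The same file in 128-byte fragments.
print(len(layout(1000, 128, 1)))  # 8
```

Because the entries are kept in a list, their order records which part of the file each unit holds.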
Up to this point, we've only looked at the considerations for saving or writing files. Another important consideration for a filesystem is its read performance. The whole point of saving files to disk in the first place is so that users can access files when they need to. In order for a user to access a file, the filesystem must find out which block or blocks that file has been stored in, and then copy the contents of those disk blocks to RAM so the user can actually manipulate the data within the file.