I've been trying to catch up on some of my LWN reading (I’m weeks behind at the moment) and have stumbled upon one of those gems of information that LWN so often has: a report on the 2006 Linux File Systems Workshop.
The first page gives a useful introduction to how disk technologies are advancing, and to the problems that massively increasing capacity combined with barely improving seek times creates for filesystem developers. For instance:
In summary, over the next 7 years, disk capacity will increase by 16 times, while disk bandwidth will increase only 5 times, and seek time will barely budge! Today it takes a theoretical minimum 4,000 seconds, or about 1 hour to read an entire disk sequentially (in reality, it’s longer due to a variety of factors). In 2013, it will take a minimum of 12,800 seconds, or about 3.5 hours, to read an entire disk – an increase of 3 times. Random I/O workloads are even worse, since seek times are nearly flat. A workload that reads, e.g., 10% of the disk non-sequentially will take much longer on our 8TB 2013-era disk than it did on our 500GB 2006-era disk.
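To check the arithmetic in that quote for myself, here is a quick sketch in Python. It uses only the figures quoted above (16x capacity, 5x bandwidth, 4,000 seconds for a full sequential read today); the function and variable names are my own, and it assumes a full read scales simply as capacity divided by bandwidth, ignoring seeks and everything else.

```python
# Figures taken from the LWN quote above; the names are my own.
CAPACITY_GROWTH = 16      # disk capacity multiplier over the 7 years
BANDWIDTH_GROWTH = 5      # disk bandwidth multiplier over the same period
READ_TIME_2006_S = 4000   # theoretical minimum full sequential read, in seconds


def full_read_time(base_seconds, capacity_growth, bandwidth_growth):
    """Scale a full-disk sequential read time.

    Assumes read time = capacity / bandwidth, so it grows with capacity
    and shrinks with bandwidth. Seek time is deliberately ignored.
    """
    return base_seconds * capacity_growth / bandwidth_growth


read_time_2013_s = full_read_time(
    READ_TIME_2006_S, CAPACITY_GROWTH, BANDWIDTH_GROWTH
)
slowdown = read_time_2013_s / READ_TIME_2006_S

print(f"2006: {READ_TIME_2006_S} s ({READ_TIME_2006_S / 3600:.1f} hours)")
print(f"2013: {read_time_2013_s:.0f} s ({read_time_2013_s / 3600:.1f} hours)")
print(f"Slowdown: {slowdown:.1f}x")
```

That gives 12,800 seconds and a slowdown of 3.2 times, which matches the "about 3.5 hours" and "an increase of 3 times" in the article once you allow for rounding.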
The second page reports on the first day of the workshop, which covered hardware, errors and recovery, and current filesystem design. If you are interested in filesystems, or are just curious about how they work and how they can break, then please go read it. It’s an outstanding article! Also read the comments; there’s some interesting stuff there too.
The last page then gets into new ideas for filesystem techniques designed to fix the problems that were identified on the first day. This is nicely summarised by the comment:
These goals can be summarized as “repair-driven file system design” – designing our file system to be quickly and easily repaired from the beginning, rather than bolting it on afterward.
Very encouraging. It goes on to describe a number of different concepts that could be incorporated into new filesystems, including one (chunkfs) that could pretty much be a new filesystem in its own right.