One of the interesting and non-obvious problems of a packet capture system is engineering the underlying file system to store all the data. You might think writing to a standard EXT4 / XFS / JFS / pick-your-favorite file system is enough. However, this is a huge red flag: there is no guarantee on sustained write performance, which can result in packet loss.
Sequential Write Performance
The problem with using standard file systems is that you have no control over the physical location of blocks on the disk. Instead, the Linux file system is responsible for mapping linear blocks to physical sectors on the disk. This ultimately leads to mass fragmentation that collects and grows like mold and needs to be regularly defragmented, which in many cases is impossible. Eventually the fragmentation becomes so great that your disk writes are scattered across the drive, and that high sequential write performance you benchmarked on a brand new file system is long gone... been there, done that, it's frustrating and it sucks.
To get some idea of how blocks are actually mapped on disk, a group of Japanese researchers studied the access patterns of all the major Linux file systems. Their research is not focused on sustained sequential write performance, but they do show the physical access patterns for different file systems on a fresh Ubuntu install.
The full paper is here: https://www.kernel.org/doc/ols/2011/ols2011-suzaki.pdf
The above access pattern is for EXT3 file systems. It's scattered all over the place.
The above access pattern is for EXT4 file systems. The allocation is clearly far better, a significant improvement over EXT3.
For JFS file systems, it's a somewhat bizarre scatter-bomb allocator.
And the champion: XFS file systems. This is the best allocation scheme out of all the standard file systems. If you're doing DIY packet capture, XFS is probably your best bet.
And finally, the above access pattern is for Reiser file systems. It appears to be maximizing the number of disk seeks to minimize the performance! To be fair... it's designed for maximum space efficiency for small files.
Sequential Write Performance
If the above hasn't scared you enough about the dangers of using a standard file system, perhaps the raw sustained write throughput of non-sequential write access will. (Data from our R&D work-in-progress 100Gbps packet capture system.)
The graph below tests the raw hardware performance by writing a single 1TB file. However, to replicate a fragmented file system, the write head seeks (jumps) after every 1MB to 1TB of written data (X axis). The Y axis is the resulting throughput for the full 1TB write, in Gbps. This shows that maintaining sequential disk access is critical to guarantee sustained high performance.
In the chart above, on the very left-hand side, we see barely 10Gbps throughput for extremely fragmented systems. It then continues in a linear ramp to just under 70Gbps sustained throughput as the writes become less and less fragmented. After that it wobbles around and finally hits just under 90Gbps sustained throughput for a 1TB write with 0% fragmentation.
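A rough idea of how such a test could be structured is sketched below in Python. This is not the harness used to produce the chart: the function names `segment_order` and `fragmented_write`, the front/back interleaving of segments, and the 1 MiB chunk size are all assumptions made for illustration. It also writes through the page cache to an ordinary file, so to approximate raw hardware numbers you would need to target a raw block device with direct I/O instead.

```python
import os
import time


def segment_order(num_segments):
    """Alternate between the front and back of the file so that nearly
    every segment boundary forces a jump to a non-adjacent region."""
    order = []
    lo, hi = 0, num_segments - 1
    while lo <= hi:
        order.append(lo)
        if lo != hi:
            order.append(hi)
        lo += 1
        hi -= 1
    return order


def fragmented_write(path, total_bytes, seek_interval, chunk_size=1 << 20):
    """Write total_bytes to path, seeking to a new region after every
    seek_interval bytes. Returns the measured throughput in Gbps.

    With seek_interval == total_bytes there is a single segment, so the
    write is fully sequential (the 0% fragmentation case).
    """
    if total_bytes % seek_interval != 0:
        raise ValueError("total_bytes must be a multiple of seek_interval")

    chunk_size = min(chunk_size, seek_interval)
    buf = b"\0" * chunk_size
    num_segments = total_bytes // seek_interval

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        for seg in segment_order(num_segments):
            # Jump to the start of this segment, then fill it sequentially.
            os.lseek(fd, seg * seek_interval, os.SEEK_SET)
            written = 0
            while written < seek_interval:
                # os.write may write fewer bytes than requested.
                written += os.write(fd, buf[: seek_interval - written])
        os.fsync(fd)  # include the time to flush data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)

    return (total_bytes * 8) / elapsed / 1e9
```

Sweeping `seek_interval` from 1MB up to the full file size and plotting the returned value would give a curve of the same kind as the one described above, although the absolute numbers depend entirely on the hardware and on whether the page cache is bypassed.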
File System for Packet Capture
For packet capture we are blessed with extremely simple requirements. Data is always written sequentially across the entire disk in a simple First-In-First-Out allocation system. There's 0% fragmentation and 100% sequential write access, with no complex directory structure.
The above shows how our captures are allocated on the disk. It's a very simple linear layout with a 100% guaranteed sequential access pattern. The only interesting point is when the storage eventually fills up (shown below).
When the storage system fills up, our file system simply overwrites the oldest packet data, as shown above. The newly captured data (blue arrow) overwrites the oldest data on the system (purple arrow). Eventually "CaptureA" will have no data and is fully removed from the system.
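The wrap-around behaviour can be modelled in a few lines of Python. This is only a toy model under stated assumptions, not our actual on-disk format: the class name `FifoStore`, the block-granular bookkeeping, and the `[name, start_block, length]` records are all hypothetical and exist purely to illustrate the FIFO overwrite.

```python
from collections import deque


class FifoStore:
    """Toy model of a circular, append-only capture store.

    All sizes are in blocks. Hypothetical structure for illustration only.
    """

    def __init__(self, capacity_blocks):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self.capacity = capacity_blocks
        self.used = 0
        self.write_pos = 0        # next block to write, wraps at capacity
        self.captures = deque()   # oldest first: [name, start_block, length]

    def append(self, name, blocks):
        """Append `blocks` blocks to capture `name`, overwriting the
        oldest data on the store once it is full."""
        if blocks < 1:
            raise ValueError("blocks must be at least 1")
        # Start a new capture record unless we are extending the newest one.
        if not self.captures or self.captures[-1][0] != name:
            self.captures.append([name, self.write_pos, 0])
        for _ in range(blocks):
            if self.used == self.capacity:
                self._evict_oldest_block()
            self.captures[-1][2] += 1
            self.write_pos = (self.write_pos + 1) % self.capacity
            self.used += 1

    def _evict_oldest_block(self):
        """Give up the first block of the oldest capture; drop the capture
        entirely once it has no data left."""
        oldest = self.captures[0]
        oldest[1] = (oldest[1] + 1) % self.capacity
        oldest[2] -= 1
        self.used -= 1
        if oldest[2] == 0 and len(self.captures) > 1:
            self.captures.popleft()
```

For example, on a 10-block store holding "CaptureA" (6 blocks) and "CaptureB" (4 blocks), appending 3 blocks of "CaptureC" leaves "CaptureA" with only its last 3 blocks, and appending 3 more removes "CaptureA" altogether. The write position only ever moves forward and wraps, so the access pattern stays sequential.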
Packet capture storage performance is entirely based on:
- a) the raw hardware sequential write performance
- b) the application's disk access pattern.
Our 100Gbps, 40Gbps, 20Gbps and 10Gbps packet capture systems write directly to the raw hardware using a customized file system that guarantees sequential access in all situations. Coupled with excellent raw hardware performance, this enables full 10Gbps to 100Gbps line-rate packet capture to disk across the lifetime of the device, with no fragmentation.
Happy packet capturing!