Auto Commit Memory: Cutting Latency by Eliminating Block I/O
Last week’s demonstration of one billion IOPS showcased a new paradigm for storing data through Fusion-io Auto-Commit Memory (ACM). ACM isn’t just about making NAND Flash storage devices go faster, although it does that too. It’s about introducing a much simpler and faster way for an application to guarantee data persistence.
When Simplicity Meets Speed
For decades, the industry norm for persisting data has been the same: an application manipulates data in memory and, when ready to persist it, packages the data (sometimes called a transaction) for storage. The application then asks the operating system to route the transaction through the kernel block I/O layer.
The kernel block layer was built to work with traditional disks. To minimize the effect of slow rotational-disk seek times, application I/O is packaged into blocks with sizes matching hard disk sector sizes and sequenced for delivery to the disk drive. As Linux block maintainer (and Fusion-io chief architect) Jens Axboe points out, most real-world I/O patterns are dominated by small, random I/O requests, yet they are force-fit into 4k block I/Os sequenced by the block layer to match the characteristics of rotating disks.
Note the number of steps in this pathway; each one contributes to latency. Even more steps are added when the block storage device sits at the other end of a network, behind various bus adapters and controllers. As long as memory is volatile, this type of I/O pathway will remain the norm.
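To make the conventional pathway concrete, here is a minimal sketch of what an application does today to persist one small record: pad it out to a whole block, hand it to the kernel, and wait for the flush. The function names (`pad_to_block`, `persist_via_block_io`) and the append-to-a-file layout are assumptions chosen for illustration, not part of any Fusion-io software.

```python
import os

BLOCK_SIZE = 4096  # the "4k" block size mentioned above


def pad_to_block(payload: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Zero-pad payload to a whole number of blocks (at least one)."""
    blocks = max(1, -(-len(payload) // block_size))  # ceiling division
    return payload.ljust(blocks * block_size, b"\x00")


def persist_via_block_io(path: str, payload: bytes) -> int:
    """Append payload as whole blocks and wait until the OS reports the flush done.

    Returns the number of bytes handed to the storage stack.
    """
    block = pad_to_block(payload)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(block)
        while view:  # os.write may accept fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # block here until the kernel has flushed to the device
    finally:
        os.close(fd)
    return len(block)
```

In this sketch a 64-byte record still costs a full 4096-byte write, and the caller sits idle in `os.fsync` until the storage stack reports completion.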
But what if an application could designate a certain area within its memory space as persistent, and know that data in this area would maintain its state across system reboots? This application would no longer have to follow the multi-step block I/O pathway to persist that data. It would no longer need smaller transactions to be packaged into 4k blocks for storage. It would just place the data meant for persistence in this designated memory area and continue using it through normal memory access semantics. If the application or system experienced a failure, the restarted application would find its data persisted in non-volatile memory exactly where it was left.
To illustrate, how much faster could real-world databases go if the in-memory tail of their transaction logs had guaranteed persistence without waiting on synchronous block I/O? How much faster could real-world key-value stores (typical in NoSQL databases) go if their indices could be updated in non-volatile memory and not block while waiting on kernel I/O? That is the simplicity of Auto Commit Memory. It reduces latency by eliminating entire steps in the persistence pathway.
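The programming model described here can be approximated with an ordinary memory-mapped file. The sketch below is only an analogy: it uses Python's standard `mmap` module over a regular file, not the ACM interface, and the helper names (`open_region`, `store_counter`, `load_counter`) are hypothetical. It shows an application updating a designated region with plain memory operations and finding the value again after reopening.

```python
import mmap
import os

REGION_SIZE = 4096  # size of the designated region (an arbitrary choice here)


def open_region(path: str, size: int = REGION_SIZE) -> mmap.mmap:
    """Map a file-backed region into memory, creating and sizing it if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size)  # mmap keeps its own duplicate of the descriptor
    finally:
        os.close(fd)


def store_counter(region: mmap.mmap, value: int) -> None:
    """Update the region in place with an ordinary memory write."""
    region[0:8] = value.to_bytes(8, "little")


def load_counter(region: mmap.mmap) -> int:
    """Read the value back from the same location."""
    return int.from_bytes(region[0:8], "little")
```

With a plain file mapping like this one, the application must still call `region.flush()` and wait on the kernel before the data is safe against power loss. Per the description above, removing that wait for the designated region is what ACM is meant to provide.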
Addressing Both Halves of the Problem
Block storage benchmarks such as throughput and IOPS are certainly important, but they address only half of the problem. The other half is the work the application and kernel I/O subsystems must do to package and route data for storage. Applications can be accelerated by addressing either or both halves. At some point, however, the overhead of packaging and routing data through the kernel block storage subsystem becomes the bottleneck. Breaking through that barrier was the goal of this technology demonstration: give applications the software mechanisms to avoid the latency and complexity of block storage packaging and routing. Let them spend more time processing data in memory, and less time packaging that data and waiting for it to arrive at a block storage destination.
Fusion-io does indeed make very fast block I/O devices. For us, the most exciting part of last week’s demonstration is that it looks beyond fast block I/O devices to show what is possible when you address this fast device as memory rather than block storage.