Emerging large scale multicore architectures provide abundant resources for parallel computation. In practice, however, the speedup gained by parallelization is limited by the fraction of code that inherently needs to be executed sequentially (Amdahl’s Law). An important example is object serialization and deserialization. As any other I/O operation, they are inherently sequential and thus cannot immediately benefit from multicore technology. In this work, we study acceleration by offloading this sequential processing to a custom hardware circuit in an FPGA. The FPGA is placed in the data path between the network interface and the CPU.

First, we present an efficient FPGA implementation for C++ object deserialization which we compare with the traditional approach. In the second part of the paper we describe how to create FPGA circuits, i.e., VHDL/Verilog code for C++ object structures based on the object layout and the serialization procedure.