An Efficient Hardware Accelerator to Handle Compressed Filters and Avoid Useless Operations in CNNs
Due to sparsity, a significant percentage of the operations carried out in Convolutional Neural Networks (CNNs) involve a zero in at least one of their operands. Existing approaches exploit sparsity in two ways. On the one hand, sparse matrices can be easily compressed, saving storage and memory bandwidth. On the other hand, multiplications with a zero operand can be skipped entirely.
We propose an FPGA implementation of a CNN architecture capable of taking advantage of both sparsity and filter compression.
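The two ideas in the abstract can be illustrated in software. The sketch below (illustrative only; the function names and the COO-style value/index format are assumptions, not the paper's actual hardware encoding) compresses a sparse filter into its nonzero entries and then accumulates products only for the stored weights, so every multiplication by zero is avoided.

```python
# Illustrative sketch: compress a sparse 1-D filter into (value, index)
# pairs and skip multiplications whose filter weight is zero.
# The format and names are hypothetical, not the paper's hardware scheme.

def compress_filter(weights):
    """Keep only nonzero weights together with their positions."""
    return [(w, i) for i, w in enumerate(weights) if w != 0]

def sparse_dot(compressed, activations):
    """Accumulate products only over stored (nonzero) weights."""
    return sum(w * activations[i] for w, i in compressed)

weights = [0, 3, 0, 0, -2, 0, 1, 0]   # 5 of 8 weights are zero
acts    = [5, 1, 7, 2,  4, 9, 6, 8]

comp = compress_filter(weights)
# Only 3 entries survive, so 5 useless multiplications are skipped.
assert len(comp) == 3
assert sparse_dot(comp, acts) == 3 * 1 + (-2) * 4 + 1 * 6
```

In hardware, the same idea means the accelerator fetches only the compressed nonzero weights from memory (saving bandwidth) and never issues the multiplications that the compression removed.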