Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization

UBC Faculty Research and Publications

Wang, Qi; Zhu, Jianghan; He, Can; Wang, Shihang; Wang, Xingbo; Ren, Yuan; Ye, Terry Tao

Abstract

Convolution plays a significant role in many scientific and technological computations, such as artificial intelligence and signal processing. Convolutional computations consist of many dot-product operations (multiplication–accumulation, or MAC), for which the Winograd algorithm is currently the most widely used method to reduce the number of MACs. The Karatsuba algorithm, since its introduction in the 1960s, has traditionally been used as a fast arithmetic method for multiplying operands of large bit width. It had not previously been exploited to accelerate 2D convolution computations. In this paper, we revisit the Karatsuba algorithm and exploit it to reduce the number of MACs in 2D convolutions. The matrices are first segmented into tiles in a divide-and-conquer manner, and the resulting submatrices are overlapped to construct the final output matrix. Our analysis and benchmarks show that, for convolution operations of the same dimensions, the Karatsuba algorithm requires the same number of multiplications as the Winograd algorithm but fewer additions. A pseudocode implementation is also provided to demonstrate the complexity reduction in Karatsuba-based convolution. An FPGA implementation of Karatsuba-based convolution also achieves a 33.6% reduction in LUTs (look-up tables) compared with a Winograd-based implementation.
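
As a rough illustration of the idea summarized in the abstract, the sketch below applies the classic Karatsuba identity to the smallest 2D case: the full linear convolution of a 2x2 tile with a 2x2 kernel. It is not the paper's pseudocode; the function names (`karatsuba_1d`, `karatsuba_2d`, `direct_2d`) and the multiplication counter are hypothetical and exist only for this example. Each row is treated as a two-term polynomial, and the identity is nested across rows and columns, which gives 9 multiplications instead of the 16 used by the direct method. The example computes true convolution; the cross-correlation used in CNN layers corresponds to flipping the kernel first.

```python
def karatsuba_1d(a, b, counter):
    """Full linear convolution of two length-2 sequences using 3 multiplications."""
    a0, a1 = a
    b0, b1 = b
    low = a0 * b0                                # coefficient of x^0
    high = a1 * b1                               # coefficient of x^2
    mid = (a0 + a1) * (b0 + b1) - low - high     # coefficient of x^1
    counter[0] += 3                              # count scalar multiplications
    return [low, mid, high]


def karatsuba_2d(tile, kernel, counter):
    """Full convolution of a 2x2 tile with a 2x2 kernel (3x3 output).

    Rows play the role of Karatsuba operands, so three row-level products
    are needed, each costing 3 scalar multiplications (9 in total).
    """
    t0, t1 = tile
    k0, k1 = kernel
    low = karatsuba_1d(t0, k0, counter)          # output row 0
    high = karatsuba_1d(t1, k1, counter)         # output row 2
    t_sum = [t0[0] + t1[0], t0[1] + t1[1]]
    k_sum = [k0[0] + k1[0], k0[1] + k1[1]]
    cross = karatsuba_1d(t_sum, k_sum, counter)
    mid = [c - l - h for c, l, h in zip(cross, low, high)]  # output row 1
    return [low, mid, high]


def direct_2d(tile, kernel):
    """Reference full convolution using 16 multiplications."""
    out = [[0] * 3 for _ in range(3)]
    for i in range(2):
        for j in range(2):
            for p in range(2):
                for q in range(2):
                    out[i + p][j + q] += tile[i][j] * kernel[p][q]
    return out


if __name__ == "__main__":
    counter = [0]
    tile = [[1, 2], [3, 4]]
    kernel = [[5, 6], [7, 8]]
    print(karatsuba_2d(tile, kernel, counter))
    print(direct_2d(tile, kernel))
    print("multiplications used:", counter[0])
```

For larger inputs, one would presumably split the matrices into such tiles and sum the overlapping 3x3 partial results, in the spirit of the tiling and overlapping the abstract describes; the exact tiling scheme is not given in this record and is not reproduced here.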

Item Metadata

Title	Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization
Creator	Wang, Qi; Zhu, Jianghan; He, Can; Wang, Shihang; Wang, Xingbo; Ren, Yuan; Ye, Terry Tao
Publisher	Multidisciplinary Digital Publishing Institute
Date Issued	2025-05-08
Subject	Karatsuba algorithm; convolutional computing complexity; hardware computation acceleration; Winograd algorithm; hardware/software co-design
Genre	Article
Type	Text
Language	eng
Date Available	2025-05-29
Provider	Vancouver : University of British Columbia Library
Rights	CC BY 4.0
DOI	10.14288/1.0448989
URI	http://hdl.handle.net/2429/91234
Affiliation	Applied Science, Faculty of; Non UBC; Electrical and Computer Engineering, Department of
Citation	Entropy 27 (5): 506 (2025)
Publisher DOI	10.3390/e27050506
Peer Review Status	Reviewed
Scholarly Level	Faculty; Researcher
Rights URI	https://creativecommons.org/licenses/by/4.0/
Aggregated Source Repository	DSpace
