UBC Theses and Dissertations

Design and performance evaluation of a superscalar digital signal processor

Bagnordi, Hani

Abstract

Multimedia applications are compute-intensive and often contain multiple streams of operations, such as audio and video, running in parallel. These characteristics make them suitable for multi-stream processing through the use of parallel processing. In the first part of the thesis we evaluate the instruction and data bandwidth requirements of typical signal processing and multimedia functions. In the second part of the thesis we compare the advantages and disadvantages of programmable and algorithm-specific integrated circuit digital signal processors. In order to accommodate the high level of instruction and data bandwidth found in multimedia applications, we propose a superscalar digital signal processor (DSP) that can execute multiple instructions in parallel. A second feature of the proposed DSP is scalability, which is needed to accommodate future increases in the performance requirements of multimedia applications. Our proposed DSP is also compatible with existing instruction-set architectures, which eliminates the need for a specialized compiler. Next, we introduce a detailed analysis of the availability and distribution of instruction-level parallelism in ten existing multimedia applications. We also discuss the relationship between instruction-level parallelism and machine parallelism. In the following part of the thesis, we discuss machine parallelism and describe the different superscalar techniques used to fetch, decode, and execute multiple instructions per cycle. These techniques include out-of-order issue with out-of-order completion, register renaming, and branch prediction. A parameterizable superscalar simulator is used to simulate ten real-world multimedia applications on five different models of parallelism. The five models represent a wide range of machine parallelism, from a bare scalar machine to an ideal superscalar machine with unlimited parallelism.
Results of the instruction-level parallelism study show that the multimedia benchmarks simulated contain a high level of parallelism in their code, which makes them suitable candidates for multiple-issue machines. In addition, this high level of instruction parallelism proves to be evenly distributed throughout the program, which helps in maintaining a good balance between available parallelism and on-chip resources. Furthermore, machine parallelism simulation results show that instruction fetching places the ultimate limit on speedup and is the most critical factor in determining overall performance. However, by using aggressive branch prediction mechanisms and out-of-order issue with out-of-order completion, we are able to obtain a two- to four-fold increase in performance compared to a single-issue scalar machine. Overall, the results show that abundant instruction parallelism combined with adequate machine parallelism substantially improves performance for multimedia applications.

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply; see the Terms of Use at https://open.library.ubc.ca/terms_of_use.