UBC Theses and Dissertations
Models and techniques for designing mobile system-on-chip devices
Gubran, Ayub Ahmed
Mobile SoCs have become ubiquitous computing platforms, and, in recent years, they have become increasingly heterogeneous and complex. A typical SoC today includes CPUs, GPUs, image processors, video encoders/decoders, and AI engines. This dissertation addresses some of the challenges associated with SoCs in three pieces of work.
The first piece of work develops a cycle-accurate model, Emerald, which provides a platform for studying system-level SoC interactions while including the impact of graphics. Our cycle-accurate infrastructure builds upon well-established tools, GPGPU-Sim and gem5, with support for graphics and GPGPU workloads, and full-system simulation with Android. We present two case studies using Emerald. First, we use Emerald's full-system mode to highlight the importance of system-wide interactions by studying and analyzing memory organization and scheduling in SoCs. Second, we use Emerald's standalone mode to evaluate a dynamic mechanism for balancing the shading work assigned to GPU cores. Our dynamic mechanism speeds up frame rendering by 7.3-19% compared to static load-balancing.
The second piece of work highlights the time-variant traffic asymmetry in heterogeneous SoCs. We analyze the impact of this asymmetry on network performance and propose interleaved source injection (ISI), an interconnect topology and associated flow control mechanism to manage time-varying asymmetric network traffic. We evaluate ISI using stochastic traffic patterns and a set of traces that emulate mobile use cases with traffic from various IP blocks. We show that ISI increases saturation throughput by 80-184% for a 12% increase in NoC area.
In the last piece of work, we study the compression properties of framebuffer surfaces and highlight the characteristics of surfaces generated by different applications. We use our analysis to propose Dynamic Color Palettes (DCP), a hardware scheme that dynamically constructs color palettes and employs them to efficiently compress framebuffer surfaces. We evaluate DCP against a set of 124 workloads and find that DCP improves compression rates by 91% for UI and 20% for 2D applications compared to previous proposals. We also propose a hybrid scheme (HDCP) that combines DCP with a generic compression scheme. HDCP outperforms previous proposals by 161%, 124%, and 83% for UI, 2D, and 3D applications, respectively.
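As a rough illustration of why color palettes can compress framebuffer surfaces with few distinct colors (as is common in UI content), the following is a minimal software sketch of generic palette encoding. It is not the DCP hardware scheme from the dissertation, whose details are not given in this abstract. The function names, the flat pixel list, and the 32-bit color size are all assumptions made for the example.

```python
# Generic palette encoding sketch (NOT the dissertation's DCP design).
# Assumption: pixels are hashable color values, e.g. 32-bit ARGB integers.

def palette_compress(pixels):
    """Return (palette, indices, bits_per_index) for a list of colors."""
    palette = []   # distinct colors in first-seen order
    lookup = {}    # color -> position in palette
    indices = []   # one palette index per pixel
    for color in pixels:
        if color not in lookup:
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    # Bits needed to address every palette entry (at least 1).
    bits_per_index = max(1, (len(palette) - 1).bit_length())
    return palette, indices, bits_per_index


def palette_decompress(palette, indices):
    """Rebuild the original pixel list from the palette and indices."""
    return [palette[i] for i in indices]


def compressed_bits(palette, indices, bits_per_index, bits_per_color=32):
    """Size of the encoded form: palette storage plus per-pixel indices."""
    return len(palette) * bits_per_color + len(indices) * bits_per_index


# A hypothetical 16-pixel tile with only two distinct colors.
tile = [0xFFFFFFFF] * 12 + [0xFF000000] * 4
palette, indices, bits = palette_compress(tile)
raw_bits = len(tile) * 32
packed_bits = compressed_bits(palette, indices, bits)
```

For this two-color tile, the encoded form needs two palette entries plus one bit per pixel, far less than 32 bits per pixel. The benefit shrinks as the number of distinct colors in a tile grows, which is one reason a palette scheme might be paired with a generic compressor.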
Attribution-NonCommercial-NoDerivatives 4.0 International