Throughput-Optimized Processor Microarchitecture: Coordinating Core, NoC, and Memory Subsystems
DOI: 10.23977/acss.2025.090402
Author(s)
Yuemu Fei 1
Affiliation(s)
1 Sunmmio Technology (Beijing) Co., Ltd., Beijing, 100080, China
Corresponding Author
Yuemu Fei
ABSTRACT
The exponential growth of data-intensive workloads, from deep learning inference to high-performance computing (HPC) simulations, has driven a paradigm shift in processor design toward prioritizing throughput over single-threaded latency. However, maximizing system throughput requires more than increasing computational density; it demands seamless coordination among three critical subsystems: processing cores, Network-on-Chip (NoC) interconnects, and memory hierarchies. This paper presents a comprehensive analysis of throughput-optimized microarchitecture design, focusing on the interdependencies and coordination mechanisms that eliminate bottlenecks across these subsystems. We first examine the architectural principles guiding each component's design for throughput, including parallel core arrays, low-latency NoC topologies, and memory-centric optimizations like Processing-in-Memory (PIM). Through a detailed exploration of coordination strategies, such as static scheduling, resource partitioning, and cross-subsystem awareness, we demonstrate how unifying these subsystems can mitigate data movement overheads, the primary limiter of modern processor efficiency. Case studies of state-of-the-art architectures (e.g., Groq Tensor Streaming Processor, TOP-PIM) validate the impact of coordinated design, showing up to 85% reduction in Energy-Delay Product (EDP) and 4x throughput improvement for parallel workloads compared to disjointed designs. Finally, we outline future research directions, including heterogeneous subsystem integration and AI-driven dynamic coordination, to address emerging challenges in extreme-scale computing. This work underscores that true throughput optimization is a system-level problem, requiring holistic design across cores, NoCs, and memory to unlock the full potential of next-generation processors.
KEYWORDS
Exponential Growth; Microarchitecture; Coordinated Design
CITE THIS PAPER
Yuemu Fei, Throughput-Optimized Processor Microarchitecture: Coordinating Core, NoC, and Memory Subsystems. Advances in Computer, Signals and Systems (2025) Vol. 9: 10-17. DOI: http://dx.doi.org/10.23977/acss.2025.090402.
REFERENCES
[1] Bell, G., & Gray, J. (2002). Crays, Clusters, and Centers: Trends in High-Performance Computing. Communications of the ACM, 45(11), 31-33.
[2] Casse, H., & Puaut, I. (2018). Reconciling Performance and Predictability on a Many-Core Through Off-Line Mapping. Proceedings of the International Conference on Embedded Computer Systems (SAMOS), 1-8.
[3] Hennessy, J. L., & Patterson, D. A. (2017). Computer Architecture: A Quantitative Approach (6th ed.). Morgan Kaufmann.
[4] Kim, S., & Park, J. (2023). Memory Sub-system Optimization for Throughput-Oriented Processors. Scalable Architecture Lab Research Report, Seoul National University, SAL-2023-05.
[5] Lee, J., & Kim, J. (2019). High-Performance Memory Hierarchy Design for Throughput Processors. Hong Kong University of Science and Technology ECE Seminar Series, May 6.
[6] Luis, A., Xu, L., & Davis, A. (2020). TOP-PIM: Throughput-Oriented Programmable Processing in Memory. IEEE Transactions on Parallel and Distributed Systems, 31(12), 2874-2888.
[7] Ousterhout, J., & Hill, M. (2021). The Groq Tensor Streaming Processor (TSP) and the Value of Deterministic Instruction Execution. Center for Computation & Technology Technical Report, LSU-CCT-TR-2021-001.
[8] Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). Understanding Throughput-Oriented Architectures. Communications of the ACM, 51(1), 98-107.
[9] Vaidya, A., & Hill, M. D. (2018). Energy-Efficient Memory Hierarchies for Throughput Processors. IEEE Micro, 38(3), 42-51.
[10] Wang, H., & Li, J. (2023). Multi-Channel Parallel Design Strategies: Five Hardware Optimization Schemes for Improving TSI578 Throughput Performance. CSDN Library Technical Reports, 15(3), 45-62.