Design and implementation of reorder buffer in superscalar pipeline processor

Chenyang Mao

doi:10.54097/bhcm1856

Keywords:

Superscalar Processor, Reorder Buffer (ROB), Performance Optimization.

Abstract

With growing demand for high performance and low power consumption in System-on-Chip (SoC) applications such as mobile terminals and servers, superscalar processors have become the core computing unit of high-performance SoCs because they can execute multiple instructions in parallel. However, out-of-order execution creates a conflict between out-of-order completion and in-order commit of instructions, and it complicates state recovery after a branch misprediction or an exception. To address these problems, this paper designs and implements a Reorder Buffer (ROB) module for a 4-way issue superscalar pipeline. The ROB adopts a parameterized design (data width 64 bits, depth 32 entries) and implements temporary instruction storage, in-order commit, data forwarding, branch flush, and exception handling. Functional verification covers 9 core scenarios, including basic instruction issue, execution result writeback, branch flush, and post-flush instruction re-issue. Logic synthesis with a 7 nm FinFET process library shows that the module achieves timing closure (slack 0.003 ns) at a maximum operating frequency of 1607.72 MHz, with a total area of 10760.880 μm² and a total dynamic power consumption of 4.6127 mW. The ROB module stably supports out-of-order execution and in-order commit of 4 instructions per cycle, providing a reliable core structure for high-performance 4-way superscalar processors.
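
The in-order commit and branch flush behaviour described in the abstract can be illustrated with a small behavioural model. The sketch below is not the author's RTL. It is a minimal Python model that assumes a circular buffer with head and tail pointers, a common ROB organisation that the abstract does not confirm. The class name `ReorderBuffer`, the `Entry` fields, and all method names are hypothetical; only the depth (32 entries), the width (4 instructions per cycle), and the 64-bit data width come from the text. Exception handling is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEPTH = 32               # ROB entries (from the abstract)
WIDTH = 4                # instructions issued/committed per cycle (from the abstract)
MASK64 = (1 << 64) - 1   # 64-bit data width (from the abstract)


@dataclass
class Entry:
    dest: int            # destination register index (hypothetical field)
    done: bool = False   # set once the execution result has been written back
    value: int = 0


class ReorderBuffer:
    """Behavioural model only; assumes a circular buffer organisation."""

    def __init__(self, depth: int = DEPTH, width: int = WIDTH) -> None:
        self.depth, self.width = depth, width
        self.entries: List[Optional[Entry]] = [None] * depth
        self.head = 0    # oldest instruction in flight
        self.tail = 0    # next free slot
        self.count = 0

    def issue(self, dests: List[int]) -> List[int]:
        """Allocate up to `width` entries in program order; return their tags."""
        tags = []
        for dest in dests[: self.width]:
            if self.count == self.depth:   # buffer full: stop allocating
                break
            self.entries[self.tail] = Entry(dest)
            tags.append(self.tail)
            self.tail = (self.tail + 1) % self.depth
            self.count += 1
        return tags

    def writeback(self, tag: int, value: int) -> None:
        """Record a result; results may arrive in any order."""
        entry = self.entries[tag]
        if entry is not None:
            entry.value = value & MASK64
            entry.done = True

    def forward(self, tag: int) -> Optional[int]:
        """Return a completed but not yet committed result, else None."""
        entry = self.entries[tag]
        return entry.value if entry is not None and entry.done else None

    def commit(self) -> List[Tuple[int, int]]:
        """Retire up to `width` completed entries from the head, in order."""
        retired = []
        while len(retired) < self.width and self.count > 0:
            entry = self.entries[self.head]
            if entry is None or not entry.done:
                break                      # the oldest instruction blocks younger ones
            retired.append((entry.dest, entry.value))
            self.entries[self.head] = None
            self.head = (self.head + 1) % self.depth
            self.count -= 1
        return retired

    def flush_after(self, branch_tag: int) -> None:
        """Squash every entry younger than a still-in-flight branch."""
        new_tail = (branch_tag + 1) % self.depth
        while self.tail != new_tail:
            self.tail = (self.tail - 1) % self.depth
            self.entries[self.tail] = None
            self.count -= 1


rob = ReorderBuffer()
tags = rob.issue([1, 2, 3, 4])   # four instructions issued in one cycle
rob.writeback(tags[1], 20)       # a younger instruction completes first
print(rob.commit())              # [] because the head (tag 0) is not done
rob.writeback(tags[0], 10)
print(rob.commit())              # [(1, 10), (2, 20)]
rob.flush_after(tags[2])         # mispredicted branch at tag 2 squashes tag 3
print(rob.issue([5]))            # [3]: the freed slot is reused after the flush
```

In this model a completed instruction waits behind an unfinished older one, which is the property that reconciles out-of-order completion with in-order commit. A flush only moves the tail pointer back, so re-issue after a misprediction reuses the squashed slots.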
