Design and implementation of reorder buffer in superscalar pipeline processor
DOI:
https://doi.org/10.54097/bhcm1856
Keywords:
Superscalar Processor, Reorder Buffer (ROB), Performance Optimization.
Abstract
With the increasing demand for high performance and low power consumption in System-on-Chip (SoC) scenarios such as mobile terminals and servers, superscalar processors have become the core computing unit of high-performance SoCs by virtue of their ability to execute multiple instructions in parallel. However, out-of-order execution creates a conflict between out-of-order completion and in-order commit of instructions, and it complicates state recovery after a branch misprediction or an exception. To solve these problems, this paper designs and implements a Reorder Buffer (ROB) module for 4-way issue superscalar pipelines. The ROB module adopts a parameterized design (64-bit data width, 32-entry depth) and implements temporary instruction storage, in-order commit, data forwarding, branch flush, and exception handling. Functional verification covers 9 core scenarios, including basic instruction issue, execution result writeback, branch flush, and post-flush instruction re-issue. Logic synthesis with a 7 nm FinFET process library shows that the module meets timing (slack 0.003 ns) at a maximum operating frequency of 1607.72 MHz, with a total area of 10760.880 μm² and a total dynamic power consumption of 4.6127 mW. The ROB module can support out-of-order execution and in-order commit of 4 instructions per cycle, providing a reliable core structure for high-performance 4-way superscalar processors.
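The behaviour summarized in the abstract can be illustrated with a small software model. The following is a minimal sketch only, not the paper's RTL: it assumes the ROB is organized as a circular buffer with head and tail pointers, and the names `ReorderBuffer`, `Entry`, `issue`, `writeback`, `forward`, `commit`, and `flush_after` are hypothetical. Only the depth (32), commit width (4), and data width (64 bits) come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEPTH = 32               # ROB entries (from the abstract)
WIDTH = 4                # instructions committed per cycle (from the abstract)
MASK64 = (1 << 64) - 1   # 64-bit data width (from the abstract)


@dataclass
class Entry:
    dest: int                    # hypothetical destination register index
    value: Optional[int] = None  # result, valid once done is True
    done: bool = False           # set by writeback
    exception: bool = False      # set by writeback if execution faulted


class ReorderBuffer:
    """Behavioural model (assumed circular-buffer organization)."""

    def __init__(self, depth: int = DEPTH, width: int = WIDTH) -> None:
        self.depth, self.width = depth, width
        self.entries: List[Optional[Entry]] = [None] * depth
        self.head = 0    # oldest in-flight instruction
        self.tail = 0    # next free slot
        self.count = 0
        self.exception_pending = False

    def issue(self, dest: int) -> Optional[int]:
        """Allocate an entry in program order; return its tag, or None if full."""
        if self.count == self.depth:
            return None
        tag = self.tail
        self.entries[tag] = Entry(dest)
        self.tail = (self.tail + 1) % self.depth
        self.count += 1
        return tag

    def writeback(self, tag: int, value: int, exception: bool = False) -> None:
        """Record a result; results may arrive in any order."""
        e = self.entries[tag]
        if e is None:            # entry was flushed: drop the stale result
            return
        e.value, e.done, e.exception = value & MASK64, True, exception

    def forward(self, tag: int) -> Optional[int]:
        """Data forwarding: expose a completed but uncommitted result."""
        e = self.entries[tag]
        return e.value if e is not None and e.done else None

    def commit(self) -> List[Tuple[int, int]]:
        """Retire up to `width` completed entries from the head, in order."""
        retired: List[Tuple[int, int]] = []
        while len(retired) < self.width and self.count > 0:
            e = self.entries[self.head]
            if not e.done:
                break            # oldest instruction not finished: stall
            if e.exception:      # faulting instruction reached the head
                self.exception_pending = True
                self.entries = [None] * self.depth
                self.tail, self.count = self.head, 0
                break
            retired.append((e.dest, e.value))
            self.entries[self.head] = None
            self.head = (self.head + 1) % self.depth
            self.count -= 1
        return retired

    def flush_after(self, tag: int) -> None:
        """Branch flush: discard every entry younger than `tag`."""
        new_tail = (tag + 1) % self.depth
        while self.tail != new_tail:
            self.tail = (self.tail - 1) % self.depth
            self.entries[self.tail] = None
            self.count -= 1
```

In this model, a result written back for a younger instruction stays in the buffer until every older instruction has completed, which is how out-of-order completion is reconciled with in-order commit. After `flush_after(tag)`, the tail pointer sits directly behind the mispredicted branch, so re-issued instructions reuse the freed entries.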
License
Copyright (c) 2026 Academic Journal of Science and Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.