The B32P has a 5-stage pipeline, very similar to that of a classic MIPS CPU. The stages are as follows:

  • FE (fetch): Obtain instruction from memory bus
  • DE (decode): Decode instruction and read register bank
  • EX (execute): ALU operation and jump/branch calculation
  • MEM (memory access): Stack access and memory bus access
  • WB (write back): Write to register bank
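
As an illustration of how instructions overlap in these five stages, the following minimal Python sketch prints which instruction occupies which stage in each cycle. It is an idealised model that assumes one instruction enters FE per cycle with no stalls or flushes; the function name `pipeline_schedule` and the instruction labels are hypothetical and not part of the B32P design.

```python
STAGES = ["FE", "DE", "EX", "MEM", "WB"]


def pipeline_schedule(instructions):
    """Return one dict per cycle, mapping stage name -> instruction in that stage.

    Idealised model: one instruction enters FE per cycle, no stalls or flushes.
    """
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # instruction that entered FE `depth` cycles ago
            if 0 <= index < len(instructions):
                row[stage] = instructions[index]
        schedule.append(row)
    return schedule


if __name__ == "__main__":
    for cycle, row in enumerate(pipeline_schedule(["i0", "i1", "i2"])):
        print(cycle, row)
```

In this model, N instructions need N + 4 cycles to drain through the pipeline, because the last instruction still has to pass the four stages after FE.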

A full schematic overview of how all the CPU components fit within this pipeline can be seen in the image below. (You might want to right-click the image and open it in a new tab to zoom in.)

[Image: CPU architecture]

Hazard detection and branch prediction

The CPU detects pipeline hazards in hardware, so the programmer does not need to account for them. Depending on the situation, it does one of the following:

  • Flush (if jump/branch/halt)
  • Stall (if EX uses result of MEM)
  • Forwarding (MEM -> EX and WB -> EX)
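
The stall and forwarding cases above can be sketched as a small decision function for a single source operand of the instruction in EX. This is only a model under stated assumptions, not the actual hardware logic: the `InFlight` record, the `result_from_mem` flag (meaning the value only exists once the MEM stage has finished, as for a memory read), and the rule that the younger producer in MEM is checked before the one in WB are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InFlight:
    """Hypothetical summary of an instruction sitting in MEM or WB."""
    dest: Optional[int]            # destination register, None if nothing is written
    result_from_mem: bool = False  # True if the value is only known after MEM


def resolve_operand(src, in_mem, in_wb):
    """Decide where EX gets the value of source register `src` from."""
    # Assumption: the instruction in MEM is younger than the one in WB,
    # so its result takes precedence when both write the same register.
    if in_mem is not None and in_mem.dest == src:
        # The value is not available yet if MEM itself still has to produce it.
        return "stall" if in_mem.result_from_mem else "forward_mem"
    if in_wb is not None and in_wb.dest == src:
        return "forward_wb"
    return "regbank"  # no hazard: use the value read from the register bank in DE
```

For example, `resolve_operand(3, InFlight(3, True), None)` models an instruction in EX that needs register 3 while a memory read into register 3 is still in MEM, which is the stall case.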

Branch prediction always assumes that the branch is not taken, which is the easiest scheme to implement.
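
To give a feel for the cost of this scheme, here is a rough Python sketch that counts the cycles lost to flushes over a sequence of branch outcomes. It assumes that a branch is resolved in EX, so the two younger instructions in FE and DE are on the wrong path and are flushed whenever the branch is taken. The penalty of 2 cycles follows from that assumption and is not a figure stated by the B32P documentation; `FLUSH_PENALTY` and `wasted_cycles` are hypothetical names.

```python
# Assumption: when EX resolves a taken branch, FE and DE each hold one
# wrong-path instruction, so two pipeline slots are thrown away.
FLUSH_PENALTY = 2


def wasted_cycles(branch_outcomes):
    """Count cycles lost to flushes, given True (taken) / False (not taken) outcomes."""
    # A not-taken branch matches the prediction and costs nothing extra.
    return sum(FLUSH_PENALTY for taken in branch_outcomes if taken)
```

Under these assumptions, a loop whose backward branch is taken on almost every iteration pays the flush penalty almost every time, which is the trade-off for the simple implementation.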

Memory conflicts

Because the FPGC does not have separate instruction and data memories, the FE and MEM stages may need access to the memory bus at the same time. To handle this, an arbiter gives priority to the MEM request, since such requests only occur for READ/WRITE instructions, and makes the FE request wait until the bus is free.
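
The priority rule can be sketched as a small function that decides, for one cycle, which stage is granted the bus. This is a simplified model rather than the actual arbiter: the function name `arbitrate` and the `bus_busy` flag (an access is already in progress) are assumptions made for illustration.

```python
def arbitrate(fe_request, mem_request, bus_busy):
    """Return which stage is granted the memory bus this cycle: 'MEM', 'FE', or None."""
    if bus_busy:
        return None   # an earlier access is still in progress; nobody is granted
    if mem_request:
        return "MEM"  # MEM has priority; a pending FE request keeps waiting
    if fe_request:
        return "FE"
    return None
```

When both stages request the bus in the same cycle, `arbitrate(True, True, False)` grants it to MEM, and the FE request is served in a later cycle once the bus is free again.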

Memory latency

Access to memory via the memory bus takes a variable number of cycles, depending on the type of memory being accessed. The pipeline stalls for the duration of this delay.
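
As a rough illustration, the stall time can be modelled by summing a per-access latency. The memory type names and cycle counts below are invented placeholders, not measured B32P values, and `stall_cycles` is a hypothetical helper.

```python
# Placeholder latencies in cycles; the real values depend on the memory type
# in the actual hardware and are NOT taken from the B32P documentation.
EXAMPLE_LATENCY = {"fast": 1, "slow": 4}


def stall_cycles(accesses, latency=EXAMPLE_LATENCY):
    """Sum the cycles the pipeline spends stalled for a sequence of bus accesses.

    `accesses` is a list of memory type names, one per bus access.
    """
    return sum(latency[kind] for kind in accesses)
```

With these placeholder numbers, `stall_cycles(["fast", "slow", "slow"])` adds up to 9 stalled cycles, showing how a program dominated by accesses to slower memory spends most of its time waiting on the bus.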