Performing the Instruction
The following topics will be covered here:
A Simplified View of CPU Architecture
Before delving into the inner workings of the x86 CPU and its clones and successors, it's worth considering the general architecture of a CPU from the ground up.
The diagram below shows a simplified view of a conceptual CPU that uses a single internal CPU bus. (Modern CPUs are not single-bus designs; this example is simplified to make the following discussion easier to follow.) Note that this bus has nothing to do with the memory data and address buses. As the diagram indicates, these have a separate feed into the CPU.

The important thing to remember here is that the CPU bus only allows data to flow. It cannot store data, so working data must be held in the registers. It's also worth noting that in this single bus implementation, only one 'piece' of data can flow along the bus at any given time. A transfer of data between the bus and a register occurs once per clock cycle, because the clock drives the state switching of the transistors that make up the system.
The block of registers on the right-hand side represents the n 'general purpose' registers that this CPU has to work with.
I will now go on to explain how an instruction is performed, occasionally referring back to this diagram.
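The one-transfer-per-cycle restriction can be illustrated with a small model. This is only a sketch of the conceptual single bus CPU described above; the class name, the register names R0 to R3 and the swap operation are assumptions made for the illustration, not part of any real processor.

```python
class SingleBusCPU:
    """Toy model: one shared internal bus, one transfer per clock cycle."""

    def __init__(self, n_registers=4):
        # General purpose registers R0..R(n-1). The bus itself keeps no state.
        self.registers = {f"R{i}": 0 for i in range(n_registers)}
        self.cycles = 0

    def transfer(self, src, dst):
        """Gate src onto the bus and latch the value into dst (one cycle)."""
        bus_value = self.registers[src]  # exists on the bus for this cycle only
        self.registers[dst] = bus_value
        self.cycles += 1

    def swap(self, a, b, temp):
        """Swap two registers using a temporary register.

        The bus carries only one value at a time, so the swap needs
        three separate transfers, i.e. three clock cycles.
        """
        self.transfer(a, temp)
        self.transfer(b, a)
        self.transfer(temp, b)


cpu = SingleBusCPU()
cpu.registers["R0"] = 7
cpu.registers["R1"] = 9
cpu.swap("R0", "R1", temp="R2")
print(cpu.registers["R0"], cpu.registers["R1"], cpu.cycles)  # 9 7 3
```

Because the bus cannot hold a value between cycles, even a simple swap has to park one value in a spare register first.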
Performing the Instruction
Performing an instruction actually involves several distinct steps. These are as follows:
- Instruction fetch: The Instruction Pointer (IP) holds the memory address of the next instruction to be performed. This address is transferred to the Memory Address Register (MAR), which is directly connected to the memory address bus. The CPU then issues a read request. This causes the data stored at this memory address (or potentially in cache) to be made available on the memory/system bus. In a process known as strobing, this data is transferred to the Memory Data Register (MDR), or some other input buffer, depending on the CPU architecture. Remember that in this case, the data in question is the instruction itself. (Once the data appears in the MDR, it is transferred to the Instruction Register, or IR.) At this time, the IP is incremented to point to the memory address immediately following the instruction it previously pointed to. Note that x86 is CISC in nature (see CISC vs RISC), and its instructions can therefore be of variable length. The instruction pointer is always incremented by the correct number of bytes. Thus if a one-byte instruction is executed, the pointer is incremented by one. If a three-byte instruction is executed, it is automatically incremented by three.
- Instruction decode: The instruction decoder determines what instruction is present in the IR and selects signals accordingly for the execution stage which happens later.
- Address generate: If there are any instruction operands that reference data in memory, then the fetch phase must be repeated to make that data available to the CPU in a register. The CPU issues the read request, causing this data to be read from memory (or cache).
- Instruction execute: At this point, any data to be manipulated by the CPU is 'gated' through the Arithmetic and Logic Unit (ALU). (In the single bus example illustrated above, this must happen via temporary registers.) This is the heart of the CPU, where the logic of the instruction actually happens. The actual nature of the instruction is determined by the control lines that feed the ALU.
- Write back: The result of the instruction - i.e. the output from the ALU - must be stored. Initially, this output is gated into another register. However, depending on the instruction, the final destination for the output may well be an address in main memory. In such a situation, the CPU must store the data using a process which is very nearly the exact reverse of a fetch. Note that in modern CPUs, output data destined for memory or cache typically gets placed in an output buffer first, freeing the CPU of the slow external data transfer.
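The five steps above can be sketched as a small interpreter loop. This is a minimal illustration only: the instruction set, opcodes and byte lengths are invented for the example and are not real x86 encodings, and the variables mar, mdr and ir merely stand in for the registers named in the text.

```python
# Hypothetical toy instruction set (NOT real x86 encodings).
# opcode: (mnemonic, total length in bytes)
ISA = {
    0x01: ("LOADI", 3),  # reg <- immediate value
    0x02: ("LOADM", 3),  # reg <- memory[addr]
    0x03: ("ADD", 3),    # dst <- dst + src
    0x04: ("STORE", 3),  # memory[addr] <- reg
    0x05: ("INC", 2),    # reg <- reg + 1
    0xFF: ("HALT", 1),
}


def run(memory):
    regs = [0, 0, 0, 0]
    ip = 0
    trace = []  # address of every instruction fetched
    while True:
        # 1. Instruction fetch: IP -> MAR, memory read -> MDR -> IR.
        mar = ip
        trace.append(mar)
        mnemonic, length = ISA[memory[mar]]
        ir = memory[mar:mar + length]
        ip += length  # advance by this instruction's own length

        # 2. Instruction decode: split the IR into its operand bytes.
        operands = list(ir[1:])
        if mnemonic == "HALT":
            return regs, trace

        # 3. Address generate: repeat the fetch for a memory operand.
        mdr = None
        if mnemonic == "LOADM":
            mar = operands[1]
            mdr = memory[mar]

        # 4. Instruction execute: the "ALU" produces a result.
        if mnemonic == "LOADI":
            result = operands[1]
        elif mnemonic == "LOADM":
            result = mdr
        elif mnemonic == "ADD":
            result = (regs[operands[0]] + regs[operands[1]]) & 0xFF
        elif mnemonic == "INC":
            result = (regs[operands[0]] + 1) & 0xFF
        else:  # STORE passes the register value through unchanged
            result = regs[operands[0]]

        # 5. Write back: to a register, or to memory for STORE.
        if mnemonic == "STORE":
            memory[operands[1]] = result
        else:
            regs[operands[0]] = result


memory = bytearray([
    0x02, 0, 15,  # LOADM R0 <- memory[15]
    0x01, 1, 5,   # LOADI R1 <- 5
    0x03, 0, 1,   # ADD   R0 <- R0 + R1
    0x05, 0,      # INC   R0
    0x04, 0, 16,  # STORE memory[16] <- R0
    0xFF,         # HALT
    10, 0,        # data bytes at addresses 15 and 16
])
regs, trace = run(memory)
print(regs, memory[16], trace)  # [16, 5, 0, 0] 16 [0, 3, 6, 9, 11, 14]
```

The trace shows the instruction pointer advancing by three bytes for most instructions but by two for INC, which is the variable-length behaviour described in the fetch step.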
One important fact to note about this process is that the fetch phase (and any write-back to main memory) is much slower than any other phase. Thus this step is generally the limiting factor when trying to make an instruction as fast as possible.
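To get a feel for why the memory access dominates, here is a rough back-of-the-envelope calculation. The cycle counts are assumptions chosen purely for illustration (an uncached memory read costing 100 cycles, every other phase costing one); real figures vary widely between CPUs and depend heavily on caching.

```python
# Hypothetical cycle costs, chosen only for illustration.
PHASE_CYCLES = {
    "fetch": 100,           # assumed uncached main-memory read
    "decode": 1,
    "address_generate": 0,  # no memory operand in this example
    "execute": 1,
    "write_back": 1,        # result stays in a register
}


def memory_share(phase_cycles, memory_phases=("fetch",)):
    """Fraction of the instruction's cycles spent in memory-bound phases."""
    total = sum(phase_cycles.values())
    memory = sum(phase_cycles[p] for p in memory_phases)
    return memory / total


share = memory_share(PHASE_CYCLES)
print(f"{share:.1%} of the cycles are spent waiting on memory")  # 97.1%
```

Under these assumed numbers, speeding up decode or execute barely matters; only a faster fetch would make the instruction noticeably quicker.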
What's next
We will now go on to look at some of the higher level components of the CPU, such as the control unit, the ALU and the instruction units, in High Level Architecture.
Just Too Good. Last updated: June 2006 (DJL)