How Does Data Get Into the Register File in MIPS

8 Execution of a Complete Instruction – Datapath Implementation

Dr A. P. Shanthi

The objectives of this module are to discuss how an instruction gets executed in a processor and how the datapath is implemented, using the MIPS architecture as a case study.

The characteristics of the MIPS architecture are first summarized below:

• 32-bit byte addresses, aligned – MIPS uses 32-bit byte addresses that are aligned.

• Load/store, displacement addressing only – It is a load/store ISA or register/register ISA, where only the load and store instructions use memory operands. All other instructions use only register operands. The addressing mode used for the memory operands is displacement addressing, where a displacement has to be added to the base register contents to get the effective address.

• Standard data types – The ISA supports all standard data types.

• 32 GPRs – There are 32 general purpose registers, with register R0 always holding 0.

• 32 FPRs – There are 32 floating point registers.

• FP status register – There is a floating point status register.

• No Condition Codes – The MIPS architecture does not support condition codes.

• Addressing Modes – The addressing modes supported are Immediate, Displacement and Register mode (used only for ALU).

• 3 fixed-length formats – There are three 32-bit instruction formats that are supported. They are shown below in Figure 8.1.
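As a rough illustration of the three formats, the sketch below slices a 32-bit instruction word into the fields used by the R, I and J formats. The bit positions follow the standard MIPS32 encoding; the function name decode_fields is a hypothetical helper introduced only for this example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into the fields of all three formats."""
    return {
        "opcode": (word >> 26) & 0x3F,    # bits 31..26, common to R, I and J formats
        "rs":     (word >> 21) & 0x1F,    # bits 25..21, first source register
        "rt":     (word >> 16) & 0x1F,    # bits 20..16, second source / I-format destination
        "rd":     (word >> 11) & 0x1F,    # bits 15..11, R-format destination
        "shamt":  (word >> 6) & 0x1F,     # bits 10..6, shift amount (R format)
        "funct":  word & 0x3F,            # bits 5..0, R-format function code
        "imm":    word & 0xFFFF,          # bits 15..0, I-format immediate / offset
        "target": word & 0x3FFFFFF,       # bits 25..0, J-format jump target
    }


# add $t0, $t1, $t2 is encoded as 0x012A4020 ($t0 = 8, $t1 = 9, $t2 = 10)
fields = decode_fields(0x012A4020)
```

Because the register specifiers sit in the same bit positions in every format, the hardware can extract rs and rt before it knows which format the instruction actually uses.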

We will examine the MIPS implementation for a simple subset that shows most aspects of implementation. The instructions considered are:

The memory-reference instructions load word (lw) and store word (sw)
The arithmetic-logical instructions add, sub, and, or, and slt
The instructions branch equal (beq) and jump (j), to be considered at the end.

This subset does not include all the integer instructions (for example, shift, multiply, and divide are missing), nor does it include any floating-point instructions. However, the key principles used in creating a datapath and designing the control will be illustrated. The implementation of the remaining instructions is similar. The key design principles that we have looked at earlier, such as the common guidelines 'Make the common case fast' and 'Simplicity favors regularity', can be illustrated by looking at the implementation. In addition, most concepts used to implement the MIPS subset are the same basic ideas that are used to construct a wide spectrum of computers, from high-performance servers to general-purpose microprocessors to embedded processors.

When we look at the instruction cycle of any processor, it should involve the following operations:

Fetch instruction from memory
Decode the instruction
Fetch the operands
Execute the instruction
Write the result

We shall look at each of these steps in detail for the subset of instructions. Much of what needs to be done to implement these instructions is the same, independent of the exact class of instruction. For every instruction, the first two steps of instruction fetch and decode are identical:

Send the program counter (PC) to the program memory that contains the code and fetch the instruction
Read one or two registers, using the register specifier fields in the instruction. For the load word instruction, we need to read only one register, but most other instructions require that we read two registers. Since MIPS uses a fixed length format with the register specifiers in the same place, the registers can be read irrespective of the instruction.

After these two steps, the actions required to complete the instruction depend on the type of instruction. For each of the three instruction classes, arithmetic/logical, memory-reference and branches, the actions are mostly the same. Even across different instruction classes there are some similarities. For instance, all instruction classes, except jump, use the arithmetic and logical unit (ALU) after reading the registers. The load/store memory-reference instructions use the ALU for effective address calculation, the arithmetic and logical instructions for the operation execution, and branches for condition evaluation, which is comparison here. As we can see, the simplicity and regularity of the instruction set simplifies the implementation by making the execution of many of the instruction classes similar. After using the ALU, the actions required to complete the various instruction classes differ. A memory-reference instruction will need to access the memory. For a load instruction, a memory read has to be performed. For a store instruction, a memory write has to be performed. An arithmetic/logical instruction must write the data from the ALU back into a register. A load instruction also has to write the data fetched from memory to a register. Lastly, for a branch instruction, we may need to change the next instruction address based on the comparison. If the comparison condition fails, the PC should be incremented by 4 to get the address of the next instruction. If the condition is true, the new address will have to be updated in the PC. Figure 8.2 below gives an overview of the CPU.

However, wherever we have two possible sources for an input, we cannot simply join the wires together.

We have to use multiplexers, as indicated below in Figure 8.3.

We also need to include the necessary control signals. Figure 8.4 below shows the datapath, as well as the control lines for the major functional units. The control unit takes in the instruction as an input and determines how to set the control lines for the functional units and two of the multiplexors. The third multiplexor, which determines whether PC + 4 or the branch destination address is written into the PC, is set based on the zero output of the ALU, which is used to perform the comparison of a branch on equal instruction. The regularity and simplicity of the MIPS instruction set means that a simple decoding process can be used to determine how to set the control lines.

To briefly review the logic design basics: information is encoded in binary as low voltage = 0 and high voltage = 1, and there is one wire per bit. Multi-bit data are encoded on multi-wire buses. Combinational elements operate on data, and their output is a function of the input only. State (sequential) elements store information, and their output is a function of both the inputs and the stored data, that is, the previous inputs. Examples of combinational elements are AND gates, XOR gates, etc. An example of a sequential element is a register that stores data in a circuit. It uses a clock signal to determine when to update the stored value and is edge-triggered.

Now, we shall discuss the implementation of the datapath. The datapath comprises the elements that process data and addresses in the CPU – registers, ALUs, muxes, memories, etc. We will build a MIPS datapath incrementally. We shall construct the basic model and keep refining it.

The portion of the CPU that carries out the instruction fetch operation is given in Figure 8.5.

As mentioned earlier, the PC is used to address the instruction memory to fetch the instruction. At the same time, the PC value is also fed to the adder unit and added with 4, so that PC + 4, which is the address of the next instruction in MIPS, is written into the PC, thus making it ready for the next instruction fetch.
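The fetch step can be sketched in a few lines. This is only a behavioral model under stated assumptions: the instruction memory is represented as a Python list of 32-bit words indexed by PC / 4, and the function name fetch is hypothetical.

```python
def fetch(instruction_memory, pc):
    """Return the instruction addressed by pc together with PC + 4."""
    instruction = instruction_memory[pc // 4]   # assumption: word-indexed list models the memory
    next_pc = (pc + 4) & 0xFFFFFFFF             # the adder runs in parallel with the memory read
    return instruction, next_pc
```

In hardware the memory read and the addition happen at the same time; the sequential statements above only model the values that end up on the two outputs.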

The next step is instruction decoding and operand fetch. In the case of MIPS, decoding is done and, at the same time, the register file is read. The processor's 32 general-purpose registers are stored in a structure called a register file. A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file.

The R-format instructions have three register operands, and we will need to read two data words from the register file and write one data word into the register file for each instruction. For each data word to be read from the registers, we need an input to the register file that specifies the register number to be read and an output from the register file that will carry the value that has been read from the registers. To write a data word, we will need two inputs: one to specify the register number to be written and one to supply the data to be written into the register. The 5-bit register specifiers indicate one of the 32 registers to be used.

The register file always outputs the contents of whatever register numbers are on the Read register inputs. Writes, however, are controlled by the write control signal, which must be asserted for a write to occur at the clock edge. Thus, we need a total of four inputs (three for register numbers and one for data) and two outputs (both for data), as shown in Figure 8.6. The register number inputs are 5 bits wide to specify one of 32 registers, whereas the data input and the two data output buses are each 32 bits wide.
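The behavior of the register file described above can be modeled as follows. This is a minimal sketch, not the hardware design: the class and method names are hypothetical, the clock edge is modeled as an explicit method call, and ignoring writes to register 0 is one assumed way of keeping R0 at 0.

```python
class RegisterFile:
    """Behavioral model: two read ports, one write port, 32 registers of 32 bits."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Reads are combinational: the outputs always follow the register numbers.
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def clock_edge(self, write_reg, write_data, reg_write):
        # A write happens only at the clock edge and only if the write control is asserted.
        # Assumption: writes to register 0 are discarded so that R0 always reads as 0.
        if reg_write and (write_reg & 0x1F) != 0:
            self.regs[write_reg & 0x1F] = write_data & 0xFFFFFFFF
```

This is the path by which data gets into the register file: a value arrives on the write data input, a 5-bit number selects the destination, and the write control signal decides whether the clock edge actually stores it.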

After the two register contents are read, the next step is to pass these two data values to the ALU and perform the required operation, as decided by the control unit and the control signals. It might be an add, a subtract or any other type of operation, depending on the opcode. Thus the ALU takes two 32-bit inputs and produces a 32-bit result, as well as a 1-bit signal that indicates whether the result is 0. The control signals will be discussed in the next module. For now, we will assume that the appropriate control signals are somehow generated.
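A behavioral sketch of such an ALU for the five operations in our subset is given below. The operation is selected here by a string, which is purely an assumption for readability; the actual control encoding is left to the next module. Inputs are assumed to be unsigned 32-bit values.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return (result, zero) for two 32-bit inputs; zero is 1 when the result is 0."""
    if op == "add":
        result = (a + b) & MASK
    elif op == "sub":
        result = (a - b) & MASK
    elif op == "and":
        result = (a & b) & MASK
    elif op == "or":
        result = (a | b) & MASK
    elif op == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0   # set on less than (signed)
    else:
        raise ValueError("unsupported ALU operation: " + op)
    return result, int(result == 0)
```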

The same arithmetic or logical operation with an immediate operand and a register operand uses the I-type instruction format. Here, Rs forms one of the source operands and the immediate component forms the second operand. These two will have to be fed to the ALU. Before that, the 16-bit immediate operand is sign extended to form a 32-bit operand. This sign extension is done by the sign extension unit.
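Sign extension simply copies bit 15 of the immediate into the upper 16 bits. A minimal sketch (the function name sign_extend16 is hypothetical) is:

```python
def sign_extend16(imm16):
    """Extend a 16-bit two's complement value to a 32-bit bit pattern."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:                 # bit 15 set: the value is negative
        return imm16 | 0xFFFF0000      # replicate the sign bit into bits 31..16
    return imm16
```

For example, the 16-bit pattern 0xFFFC (which is -4) becomes 0xFFFFFFFC, which is still -4 when read as a 32-bit two's complement number.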

We shall next consider the MIPS load word and store word instructions, which have the general form lw $t1,offset_value($t2) or sw $t1,offset_value($t2). These instructions compute a memory address by adding the base register, which is $t2, to the 16-bit signed offset field contained in the instruction. If the instruction is a store, the value to be stored must also be read from the register file, where it resides in $t1. If the instruction is a load, the value read from memory must be written into the register file in the specified register, which is $t1. Thus, we will need both the register file and the ALU. In addition, the sign extension unit will sign extend the 16-bit offset field in the instruction to a 32-bit signed value. The next operation for the load and store instructions is the data memory access. The data memory unit has to be read for a load instruction and written for a store instruction; hence, it has both read and write control signals, an address input, as well as an input for the data to be written into memory. Figure 8.7 above illustrates all this.
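The address calculation and the data memory access can be sketched as below. The names effective_address and DataMemory are hypothetical, the memory is modeled as a dictionary keyed by byte address, and treating unwritten locations as 0 is an assumption made only to keep the example short.

```python
def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def effective_address(base, offset16):
    """ALU add of the base register and the sign-extended 16-bit offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


class DataMemory:
    """Behavioral model with separate read and write control signals."""

    def __init__(self):
        self.words = {}                          # assumption: sparse, byte address -> word

    def access(self, address, write_data, mem_read, mem_write):
        if mem_write:                            # store: write the value read from the register file
            self.words[address] = write_data & 0xFFFFFFFF
        if mem_read:                             # load: return the word to be written to the register
            return self.words.get(address, 0)    # assumption: unwritten memory reads as 0
        return None
```

A store asserts only the write control and a load asserts only the read control, so for sw $t1,-4($t2) the address comes from effective_address and the data comes from the second read port of the register file.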

The branch on equal instruction has three operands: two registers that are compared for equality, and a 16-bit offset used to compute the branch target address, relative to the branch instruction address. Its form is beq $t1, $t2, offset. To implement this instruction, we must compute the branch target address by adding the sign-extended offset field of the instruction to the PC. The instruction set architecture specifies that the base for the branch address calculation is the address of the instruction following the branch. Since we have already computed PC + 4, the address of the next instruction, in the instruction fetch datapath, it is easy to use this value as the base for computing the branch target address. Also, since word boundaries have the two LSBs as zeros and branch target addresses must start at word boundaries, the offset field is shifted left by two bits. In addition to computing the branch target address, we must also determine whether the next instruction is the instruction that follows sequentially or the instruction at the branch target address. This depends on the condition being evaluated. When the condition is true (i.e., the operands are equal), the branch target address becomes the new PC, and we say that the branch is taken. If the operands are not equal, the incremented PC should replace the current PC (just as for any other normal instruction); in this case, we say that the branch is not taken.
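The target address calculation described above amounts to one sign extension, one shift and one addition. A minimal sketch, with hypothetical function names, is:

```python
def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def branch_target(pc_plus_4, offset16):
    """Target = (PC + 4) + (sign-extended offset shifted left by 2), modulo 2**32."""
    word_offset = sign_extend16(offset16)
    byte_offset = word_offset << 2           # word offset to byte offset; two LSBs become 0
    return (pc_plus_4 + byte_offset) & 0xFFFFFFFF
```

For a branch at address 0x100, PC + 4 is 0x104; an offset of 3 gives the target 0x110, and an offset of -1 (0xFFFF) gives 0x100, that is, the branch instruction itself.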

Thus, the branch datapath must do two operations: compute the branch target address and compare the register contents. This is illustrated in Figure 8.8. To compute the branch target address, the branch datapath includes a sign extension unit and an adder. To perform the compare, we need to use the register file to supply the two register operands. Since the ALU provides an output signal that indicates whether the result was 0, we can send the two register operands to the ALU with the control set to do a subtract. If the Zero signal out of the ALU is asserted, we know that the two values are equal. Although the Zero output always signals if the result is 0, we will be using it only to implement the equality test of branches. Later, we will show exactly how to connect the control signals of the ALU for use in the datapath.

Now that we have examined the datapath components needed for the individual instruction classes, we can combine them into a single datapath and add the control to complete the implementation. The combined datapath is shown in Figure 8.9 below.

The simplest datapath might attempt to execute all instructions in one clock cycle. This means that no datapath resource can be used more than once per instruction, so any element needed more than once must be duplicated. We therefore need a memory for instructions separate from one for data. Although some of the functional units will need to be duplicated, many of the elements can be shared by different instruction flows. To share a datapath element between two different instruction classes, we may need to allow multiple connections to the input of an element, using a multiplexor and control signal to select among the multiple inputs. While adding multiplexors, we should note that though the datapaths for the arithmetic/logical (R-type) instructions and the memory-related instructions are quite similar, there are certain key differences.
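The comparison and the resulting choice of the next PC can be sketched as follows. The function name beq_next_pc is hypothetical, and the branch target is assumed to have been computed already by the separate adder.

```python
def beq_next_pc(rs_value, rt_value, pc_plus_4, target):
    """Pick the next PC for beq using the ALU's subtract result and Zero output."""
    difference = (rs_value - rt_value) & 0xFFFFFFFF   # ALU control set to subtract
    zero = int(difference == 0)                       # Zero output: 1 when the operands are equal
    return target if zero else pc_plus_4              # taken: target, not taken: PC + 4
```

The subtraction result itself is discarded; only the Zero output is used to steer the multiplexor in front of the PC.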

The R-type instructions use two register operands coming from the register file. The memory instructions also use the ALU to do the address calculation, but the second input is the sign-extended 16-bit offset field from the instruction.
The value stored into a destination register comes from the ALU for an R-type instruction, whereas the data comes from memory for a load.

To create a datapath with a common register file and ALU, we must support two different sources for the second ALU input, as well as two different sources for the data stored into the register file. Thus, one multiplexor needs to be placed at the ALU input and another at the data input to the register file, as shown in Figure 8.10.
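The two multiplexors can be modeled with one generic 2-to-1 selector. The select signal names alu_src and mem_to_reg, and the convention that 0 selects the first input, are assumptions made for this sketch; the text above does not name the signals.

```python
def mux2(select, input0, input1):
    """2-to-1 multiplexor: select = 0 passes input0, select = 1 passes input1."""
    return input1 if select else input0


def alu_second_operand(alu_src, read_data2, sign_extended_imm):
    # R-type uses the second register; memory instructions use the sign-extended offset.
    return mux2(alu_src, read_data2, sign_extended_imm)


def register_write_data(mem_to_reg, alu_result, memory_data):
    # R-type writes back the ALU result; a load writes back the word read from memory.
    return mux2(mem_to_reg, alu_result, memory_data)
```

With this convention an R-type instruction would use alu_src = 0 and mem_to_reg = 0, while a load would use alu_src = 1 and mem_to_reg = 1.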

We have discussed the individual instructions – arithmetic/logical, memory-related and branch. Now we can combine all the pieces to make a simple datapath for the MIPS architecture by adding the datapath for instruction fetch, the datapath for R-type and memory instructions and the datapath for branches. The figure below shows the datapath we obtain by combining the separate pieces. The branch instruction uses the main ALU for comparison of the register operands, so we must keep the adder shown earlier for calculating the branch target address. An additional multiplexor is required to select either the sequentially following instruction address, PC + 4, or the branch target address to be written into the PC.
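To tie the pieces together, the following is a small behavioral sketch of one single-cycle step for the subset lw, sw, add, sub, and, or, slt and beq. It is an illustration under stated assumptions rather than the datapath of the figures: the function step and its helpers are hypothetical names, the memories are a Python list (instructions) and a dictionary (data), and the opcode and function codes are the standard MIPS32 encodings.

```python
MASK = 0xFFFFFFFF


def sext16(x):
    x &= 0xFFFF
    return (x | 0xFFFF0000) if x & 0x8000 else x


def signed(x):
    return x - (1 << 32) if x & 0x80000000 else x


# R-type function codes: add, sub, and, or, slt
FUNCT = {
    0x20: lambda a, b: a + b,
    0x22: lambda a, b: a - b,
    0x24: lambda a, b: a & b,
    0x25: lambda a, b: a | b,
    0x2A: lambda a, b: int(signed(a) < signed(b)),
}


def step(pc, imem, regs, dmem):
    """Execute one instruction and return the next PC.

    imem: list of 32-bit words, regs: list of 32 ints, dmem: dict address -> word.
    """
    word = imem[pc // 4]                               # instruction fetch
    pc_plus_4 = (pc + 4) & MASK
    opcode = (word >> 26) & 0x3F                       # decode
    rs, rt, rd = (word >> 21) & 31, (word >> 16) & 31, (word >> 11) & 31
    imm = sext16(word)
    a, b = regs[rs], regs[rt]                          # both read ports are always used
    next_pc, dest, value = pc_plus_4, None, None
    if opcode == 0x00:                                 # R-type: ALU result goes to rd
        dest, value = rd, FUNCT[word & 0x3F](a, b) & MASK
    elif opcode == 0x23:                               # lw: memory word goes to rt
        dest, value = rt, dmem.get((a + imm) & MASK, 0)
    elif opcode == 0x2B:                               # sw: register rt goes to memory
        dmem[(a + imm) & MASK] = b
    elif opcode == 0x04:                               # beq: subtract and test for zero
        if ((a - b) & MASK) == 0:
            next_pc = (pc_plus_4 + (imm << 2)) & MASK
    else:
        raise ValueError("instruction outside the subset")
    if dest:                                           # no write-back for sw/beq or register 0
        regs[dest] = value
    return next_pc
```

A short program such as lw $t1,4($zero); lw $t2,8($zero); add $t0,$t1,$t2; sw $t0,12($zero) can be run by calling step repeatedly, starting from pc = 0 and feeding each returned value back in as the next pc.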

To summarize, we have looked at the steps in the execution of a complete instruction with MIPS as a case study. We have incrementally constructed the datapath for the arithmetic/logical instructions, load/store instructions and the branch instruction. The addition of the jump instruction to the datapath and the control path implementation will be discussed in the next module.

Web Links / Supporting Materials

Computer Organization and Design – The Hardware / Software Interface, David A. Patterson and John L. Hennessy, 4th Edition, Morgan Kaufmann, Elsevier, 2009.
Computer Organization, Carl Hamacher, Zvonko Vranesic and Safwat Zaky, 5th Edition, McGraw-Hill Higher Education, 2011.