5DV008
Computer Architecture
Umeå University
Department of Computing Science
Stephen J. Hegner
Topic 2: Instructions
Part C: Control Flow
These slides are mostly taken verbatim, or with minor
changes, from those prepared by
Mary Jane Irwin (www.cse.psu.edu/~mji)
of The Pennsylvania State University
[Adapted from Computer Organization and Design, 4th Edition,
Patterson \& Hennessy, © 2008, MK]

## Key to the Slides

$\square$ The source of each slide is coded in the footer on the right side:

- Irwin CSE331 = slide by Mary Jane Irwin from the course CSE331 (Computer Organization and Design) at Pennsylvania State University.
- Irwin CSE431 = slide by Mary Jane Irwin from the course CSE431 (Computer Architecture) at Pennsylvania State University.
- Hegner UU = slide by Stephen J. Hegner at Umeå University.
5DV008 20091111 t:2C sl:2 Hegner UU


\section*{Review: I Format Instructions}

- Data transfer instructions
\begin{tabular}{|l|c|c|c|c|}
\hline instruction & op & rs & rt & 16-bit immediate \\
\hline lw \$t0, 24(\$s2) & 0x23 & 18 & 8 & 24 \\
\hline sw \$t0, 24(\$s2) & 0x2b & 18 & 8 & 24 \\
\hline
\end{tabular}
- Immediate instructions
addi \$t0, \$s1, 9
\begin{tabular}{|l|c|c|c|c|}
\hline instruction & op & rs & rt & 16-bit immediate \\
\hline andi \$t0, \$s1, 0xff00 & 0x0c & 17 & 8 & 0xff00 \\
\hline ori \$t0, \$s1, 0xff00 & 0x0d & 17 & 8 & 0xff00 \\
\hline
\end{tabular}

5DV008 20091111 t:2C sl:4
Irwin CSE331 PSU

\section*{MIPS Control Flow Instructions}
- MIPS conditional branch instructions:
bne \$s0, \$s1, Lbl \# go to Lbl if \$s0 \(\neq\) \$s1
beq \$s0, \$s1, Lbl \# go to Lbl if \$s0 \(=\) \$s1
- Ex: if (i==j) h = i + j;
bne \$s0, \$s1, Lbl1
add \$s3, \$s0, \$s1
Lbl1:
...
- Instruction Format (I format):
\begin{tabular}{|c|c|c|c|}
\hline op & rs & rt & 16-bit offset \\
\hline 0x05 & 16 & 17 & ??? \\
\hline
\end{tabular}
- How is the branch destination address specified?

5DV008 20091111 t:2C sl:5

\section*{Specifying Branch Destinations}
\(\square\) Could specify the memory address of the branch target
- but that would require a 32-bit field
\(\square\) Could use a "base" register and add to it the 16-bit offset
- which register? The Instruction Address Register (PC = program counter); its use is automatically implied by a branch
- the PC gets updated (PC+4) during the Fetch cycle so that it holds the address of the next instruction
- limits the branch distance to \(-2^{15}\) to \(+2^{15}-1\) instructions from the (instruction after the) branch
- but most branches are local anyway

\section*{Disassembling Branch Destinations}
\(\square\) The contents of the updated PC (PC+4) are added to the 16-bit branch offset, which is first converted into a 32-bit value by
- concatenating two low-order zeros to turn the word offset into an 18-bit byte offset and then sign-extending those 18 bits to 32 bits
\(\square\) If the branch condition is true, the result is written into the PC as part of the Exec cycle, before the next Fetch cycle
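
The address arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `sign_extend` and `branch_target` are hypothetical, and addresses are modelled as plain integers wrapped to 32 bits.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def branch_target(branch_addr, offset_field):
    """Address reached by a taken branch, given its 16-bit offset field."""
    updated_pc = branch_addr + 4                       # PC after the Fetch cycle
    byte_offset = sign_extend(offset_field << 2, 18)   # append 00, sign-extend 18 bits
    return (updated_pc + byte_offset) & 0xFFFFFFFF     # wrap to a 32-bit address


# A branch at 0x00400020 with offset field 2 lands two instructions past PC+4.
forward = branch_target(0x00400020, 0x0002)
# An offset field of 0xFFFF is -1, which points back at the branch itself.
backward = branch_target(0x00400020, 0xFFFF)
```

Note that the offset is always counted from the instruction after the branch, which is why an offset of -1 (not 0) targets the branch itself.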


5DV008 20091111 t:2C sl:7
Irwin CSE331 PSU

\section*{Offset Tradeoffs}
\(\square\) Why not just store the byte offset in the low-order 16 bits? Then the two low-order zeros wouldn't have to be concatenated, it would be less confusing, ...
- That would limit the branch distance to \(-2^{13}\) to \(+2^{13}-1\) instructions from the (instruction after the) branch
\(\square\) And concatenating the two zero bits costs us very little in additional hardware and has no impact on the clock cycle time
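
The tradeoff can be checked numerically. The following sketch (the function name `branch_range_in_instructions` is an assumption made for illustration) computes how many instructions a signed offset field can reach when it holds a word offset versus a byte offset.

```python
def branch_range_in_instructions(field_bits, field_holds_word_offset):
    """Reach of a signed offset field, in instructions relative to PC+4."""
    lowest = -(1 << (field_bits - 1))
    highest = (1 << (field_bits - 1)) - 1
    if field_holds_word_offset:
        return lowest, highest           # each unit is one 4-byte instruction
    # Each unit is one byte, so only every fourth value names an instruction.
    return lowest // 4, highest // 4


word_reach = branch_range_in_instructions(16, True)    # what MIPS does
byte_reach = branch_range_in_instructions(16, False)   # the rejected alternative
```

Storing the word offset gives four times the reach for the same 16 bits, at the cost of wiring two constant zero bits.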

5DV008 20091111 t:2C sl:8

\section*{Assembling Branches Example}
\(\square\) Assembly code
bne \$s0, \$s1, Lbl1
add \$s3, \$s0, \$s1
Lbl1: ...
- Machine Format of bne:
\begin{tabular}{|c|c|c|c|}
\hline op & rs & rt & 16-bit offset \\
\hline 0x05 & 16 & 17 & ??? \\
\hline
\end{tabular}
- Remember
- After the bne instruction is fetched, the PC is updated so that it is addressing the add instruction
- The offset (plus 2 low-order zeros) is sign-extended and added to the (updated) PC

\section*{Assembling Branches Example}

\(\square\) Assembly code
bne \$s0, \$s1, Lbl1
add \$s3, \$s0, \$s1
Lbl1: ...
- Machine Format of bne:
\begin{tabular}{|c|c|c|c|}
\hline op & rs & rt & 16-bit offset \\
\hline 0x05 & 16 & 17 & 0x0001 \\
\hline
\end{tabular}

- Remember
- After the bne instruction is fetched, the PC is updated so that it is addressing the add instruction
- The offset (plus 2 low-order zeros) is sign-extended and added to the (updated) PC

5DV008 20091111 t:2C sl:10
Irwin CSE331 PSU

\section*{In Support of Branch Instructions}
- We have beq and bne, but what about other kinds of branches (e.g., branch-if-less-than)? For this, we need yet another instruction, slt
- Set on less than instruction:
slt \$t0, \$s0, \$s1 \# if \$s0 \(<\) \$s1 then \$t0 = 1 else \$t0 = 0
- Instruction format (R format):
\begin{tabular}{|c|c|c|c|c|c|}
\hline op & rs & rt & rd & shamt & funct \\
\hline 0x00 & 16 & 17 & 8 & 0 & 0x2a \\
\hline
\end{tabular}
- Alternate versions of slt
slti \$t0, \$s0, 25 \# if \$s0 \(<\) 25 then \$t0 = 1 ...
sltu \$t0, \$s0, \$s1 \# if \$s0 \(<\) \$s1 then \$t0 = 1 ... (unsigned comparison)
sltiu \$t0, \$s0, 25 \# if \$s0 \(<\) 25 then \$t0 = 1 ... (unsigned comparison)
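
The difference between the signed and unsigned variants shows up only when the sign bit is set. A minimal Python sketch, assuming registers are modelled as 32-bit integers (the helper `to_signed32` is hypothetical):

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Set on less than: compare the operands as signed numbers."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs, rt):
    """Set on less than unsigned: compare the raw 32-bit patterns."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
signed_result = slt(0xFFFFFFFF, 1)
unsigned_result = sltu(0xFFFFFFFF, 1)
```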

5DV008 20091111 t:2C sl:11

\section*{More Branch Instructions}
- Can use slt, beq, bne, and the fixed value of 0 in register \$zero to create other conditions
- less than: blt \$s1, \$s2, Label
- less than or equal to: ble \$s1, \$s2, Label
- greater than: bgt \$s1, \$s2, Label
- greater than or equal to: bge \$s1, \$s2, Label
\(\square\) Such branches are included in the instruction set as pseudo instructions, recognized (and expanded) by the assembler
- It's why the assembler needs a reserved register (\$at)

\section*{More Branch Instructions}
\(\square\) Can use slt, beq, bne, and the fixed value of 0 in register \$zero to create other conditions
- less than: blt \$s1, \$s2, Label
slt \$at, \$s1, \$s2 \# \$at set to 1 if \$s1 \(<\) \$s2
bne \$at, \$zero, Label
- less than or equal to: ble \$s1, \$s2, Label
- greater than: bgt \$s1, \$s2, Label
- greater than or equal to: bge \$s1, \$s2, Label
\(\square\) Such branches are included in the instruction set as pseudo instructions, recognized (and expanded) by the assembler
- It's why the assembler needs a reserved register (\$at)
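
The expansion an assembler performs can be sketched as a small lookup. Only the blt expansion appears on the slide; the other three rows follow the same slt-plus-branch pattern and are an assumption about what a typical assembler emits. The function name `expand_branch` is hypothetical.

```python
def expand_branch(pseudo, rs, rt, label):
    """Return the two real instructions for blt, ble, bgt, or bge."""
    # Each entry: (swap the slt operands?, branch applied to $at)
    table = {
        "blt": (False, "bne"),  # rs <  rt  <=>  slt(rs, rt) != 0
        "bge": (False, "beq"),  # rs >= rt  <=>  slt(rs, rt) == 0
        "bgt": (True, "bne"),   # rs >  rt  <=>  slt(rt, rs) != 0
        "ble": (True, "beq"),   # rs <= rt  <=>  slt(rt, rs) == 0
    }
    swap, branch = table[pseudo]
    a, b = (rt, rs) if swap else (rs, rt)
    return [f"slt $at, {a}, {b}", f"{branch} $at, $zero, {label}"]


for line in expand_branch("blt", "$s1", "$s2", "Label"):
    print(line)
```

Every expansion writes \$at, which is why user code should leave that register alone.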

5DV008 20091111 t:2C sl:13

\section*{Another Instruction for Changing Flow}
- MIPS also has an unconditional branch instruction or jump instruction:
j Lbl \# go to Lbl
- Example:
if (i!=j)
h=i+j;
else
h=i-j;
\begin{tabular}{lll}
& beq & \$s0, \$s1, Else \\
& add & \$s3, \$s0, \$s1 \\
& j & Exit \\
Else: & sub & \$s3, \$s0, \$s1 \\
Exit: & ... &
\end{tabular}

5DV008 20091111 t:2C sl:14

\section*{Assembling Jumps}
- Instruction:
j Lbl \# go to Lbl
- Machine Format (J format):
\begin{tabular}{|c|c|}
\hline op & 26-bit address \\
\hline 0x02 & ???? \\
\hline
\end{tabular}
- How is the jump destination address specified?
- As an absolute address formed by
- concatenating 00 as the 2 low-order bits to make it a word address
- concatenating the upper 4 bits of the current PC (now PC+4)

\section*{Disassembling Jump Destinations}
\(\square\) The low-order 26 bits of the jump instruction are converted into a 32-bit jump destination address by
- concatenating two low-order zeros to create a 28-bit (word) address and then concatenating the upper 4 bits of the current PC (now PC+4) to create a 32-bit (word) address
- that address is put into the PC prior to the next Fetch cycle
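
As a rough illustration of this bit manipulation, the sketch below builds the 26-bit field from a target address and reconstructs the destination from it. The names `jump_field` and `jump_target` are assumptions for this example; addresses are plain integers.

```python
def jump_field(target_addr):
    """26-bit field stored in a j instruction for a given target address."""
    return (target_addr >> 2) & 0x03FFFFFF       # drop the two zero bits and the top 4 bits


def jump_target(jump_addr, field):
    """32-bit destination rebuilt from the field and the updated PC."""
    updated_pc = jump_addr + 4                   # PC after the Fetch cycle
    upper_four = updated_pc & 0xF0000000         # upper 4 bits of PC+4
    return upper_four | ((field & 0x03FFFFFF) << 2)


field = jump_field(0x00400030)
destination = jump_target(0x00400028, field)
```

Because the upper 4 bits come from the PC, a jump can only reach targets inside the same 256 MB region as the instruction that follows it.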


5DV008 20091111 t:2C sl:16
Irwin CSE331 PSU

\section*{Branching Far Away}
\(\square\) What if the branch destination is further away than can be captured in 16 bits?
\(\square\) The assembler comes to the rescue: it inserts an unconditional jump to the branch target and inverts the condition
beq \$s0, \$s1, L1
becomes
bne \$s0, \$s1, L2
j L1
L2:

5DV008 20091111 t:2C sl:17

\section*{Assembling Branches and Jumps}
\(\square\) Assemble the MIPS machine code for the following code sequence. Assume that the address of the beq instruction is 0x00400020
\begin{tabular}{lll}
& beq & \$s0, \$s1, Else \\
& add & \$s3, \$s0, \$s1 \\
& j & Exit \\
Else: & sub & \$s3, \$s0, \$s1 \\
Exit: & \ldots &
\end{tabular}

\section*{Assembling Branches and Jumps}
- Assemble the MIPS machine code for the following code sequence. Assume that the address of the beq instruction is 0x00400020
\begin{tabular}{lll}
& beq & \$s0, \$s1, Else \\
& add & \$s3, \$s0, \$s1 \\
& j & Exit \\
Else: & sub & \$s3, \$s0, \$s1 \\
Exit: & \ldots &
\end{tabular}
\begin{tabular}{lllllll}
address & op & rs & rt & rd & shamt & funct \\
0x00400020 & 4 & 16 & 17 & \multicolumn{3}{l}{2 (16-bit offset)} \\
0x00400024 & 0 & 16 & 17 & 19 & 0 & 0x20 \\
0x00400028 & 2 & \multicolumn{5}{l}{0000 0100 0 \ldots 0 0011 00\(_{2}\) (26-bit address)}
\end{tabular}
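
The three encodings in the table can be reproduced with a small Python sketch. The encoder names (`encode_r`, `encode_i`, `encode_j`) are hypothetical helpers written for this example; the field widths follow the R, I, and J formats used throughout these slides.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R format: 6-bit op, three 5-bit registers, 5-bit shamt, 6-bit funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """I format: 6-bit op, two 5-bit registers, 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, target_addr):
    """J format: 6-bit op, 26-bit word address."""
    return (op << 26) | ((target_addr >> 2) & 0x03FFFFFF)


S0, S1, S3 = 16, 17, 19                  # register numbers of $s0, $s1, $s3
beq_addr = 0x00400020
else_addr = beq_addr + 12                # Else: is the fourth instruction
exit_addr = beq_addr + 16                # Exit: is the fifth instruction

# Branch offsets count instructions from PC+4, not from the branch itself.
beq_offset = (else_addr - (beq_addr + 4)) // 4

words = [
    encode_i(0x04, S0, S1, beq_offset),      # beq $s0, $s1, Else
    encode_r(0x00, S0, S1, S3, 0, 0x20),     # add $s3, $s0, $s1
    encode_j(0x02, exit_addr),               # j Exit
]
for addr, word in zip(range(beq_addr, beq_addr + 12, 4), words):
    print(f"{addr:#010x}: {word:#010x}")
```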


\section*{Compiling While Loops}
- Compile the assembly code for the C while loop where i is in \$s0, j is in \$s1, and k is in \$s2
```
while (i!=k)
    i=i+j;
```
\(\square\) Basic block - A sequence of instructions without branches (except at the end) and without branch targets (except at the beginning)

\section*{Compiling While Loops}
\(\square\) Compile the assembly code for the C while loop where i is in \$s0, j is in \$s1, and k is in \$s2
```
while (i!=k)
    i=i+j;
```
Loop: beq \$s0, \$s2, Exit
add \$s0, \$s0, \$s1
j Loop
Exit: . . .
\(\square\) Basic block - A sequence of instructions without branches (except at the end) and without branch targets (except at the beginning)

\section*{Compiling Another While Loop}
\(\square\) Compile the assembly code for the C while loop where i is in \$s0, k is in \$s1, and the base address of the array save is in \$s2
```
while (save[i] == k)
    i += 1;
```

\section*{Compiling Another While Loop}
- Compile the assembly code for the C while loop where i is in \$s0, k is in \$s1, and the base address of the array save is in \$s2
while (save[i] == k)
i += 1;
```
Loop: sll  $t1, $s0, 2      # $t1 = i * 4 (byte offset of save[i])
      add  $t1, $t1, $s2    # $t1 = address of save[i]
      lw   $t0, 0($t1)      # $t0 = save[i]
      bne  $t0, $s1, Exit   # leave the loop if save[i] != k
      addi $s0, $s0, 1      # i = i + 1
      j    Loop
Exit:
```
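
To make the address calculation concrete, here is a line-by-line Python mirror of the loop. It is only a sketch: memory is modelled as a dictionary from byte address to word, and the function name `run_loop` and the sample addresses are assumptions made for illustration.

```python
def run_loop(memory, base, i, k):
    """Advance i while save[i] == k, mirroring the assembly one line at a time."""
    while True:
        t1 = i << 2          # sll  $t1, $s0, 2    (i * 4 bytes per word)
        t1 = t1 + base       # add  $t1, $t1, $s2  (address of save[i])
        t0 = memory[t1]      # lw   $t0, 0($t1)
        if t0 != k:          # bne  $t0, $s1, Exit
            return i
        i = i + 1            # addi $s0, $s0, 1
                             # j    Loop


# save = {7, 7, 3} stored at the hypothetical base address 0x1000
memory = {0x1000: 7, 0x1004: 7, 0x1008: 3}
final_i = run_loop(memory, 0x1000, 0, 7)
```

The shift by 2 is what converts the word index i into the byte offset that lw expects.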

5DV008 20091111 t:2C sl:23

\section*{Yet Another Instruction for Changing Flow}
- Most higher level languages have case or switch statements allowing the code to select one of many alternatives depending on a single value
- Instruction:
```
jr $t1    # go to address in $t1
```
- Machine format (R format):
\begin{tabular}{|c|c|c|c|c|c|}
\hline op & rs & rt & rd & shamt & funct \\
\hline 0 & 9 & 0 & 0 & 0 & 0x08 \\
\hline
\end{tabular}


\section*{Programming Styles}
- Procedures (subroutines, functions) allow the programmer to structure programs making them
- easier to understand and debug and
- allowing code to be reused
- Procedures allow the programmer to concentrate on one portion of the code at a time
- parameters act as barriers between the procedure and the rest of the program and data, allowing the procedure to be passed values (arguments) and to return values (results)

\section*{Six Steps in Execution of a Procedure}
1. Main routine (caller) places parameters in a place where the procedure (callee) can access them
- \$a0-\$a3: four argument registers
2. Caller transfers control to the callee
3. Callee acquires the storage resources needed
4. Callee performs the desired task
5. Callee places the result value in a place where the caller can access it
- \$v0 - \$v1: two value registers for result values
6. Callee returns control to the caller
- \$ra: one return address register to return to the point of origin
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Review: MIPS Register Naming Convention} \\
\hline Nick Name & Register Number & Usage & Preserve on call? \\
\hline \$zero & 0 & constant 0 (hardware) & n.a. \\
\hline \$at & 1 & reserved for assembler & n.a. \\
\hline \$v0-\$v1 & 2-3 & returned values & no \\
\hline \$a0 - \$a3 & 4-7 & arguments & no \\
\hline \$t0 - \$t7 & 8-15 & temporaries & no \\
\hline \$s0-\$s7 & 16-23 & saved values & yes \\
\hline \$t8 - \$t9 & 24-25 & temporaries & no \\
\hline \$k0 - \$k1 & 26-27 & reserved for OS & n.a. \\
\hline \$gp & 28 & global pointer & yes \\
\hline \$sp & 29 & stack pointer & yes \\
\hline \$fp & 30 & frame pointer & yes \\
\hline \$ra & 31 & return addr (hardware) & yes \\
\hline
\end{tabular}

\section*{Instruction for Calling a Procedure}
- MIPS procedure call instruction:
jal ProcAddress \# jump and link
- Saves PC+4 in register \$ra as the link to the following instruction to set up the procedure return
- Machine format (J format):
\begin{tabular}{|c|c|}
\hline op & 26-bit address \\
\hline 0x03 & ???? \\
\hline
\end{tabular}
- Then can do procedure return with just
jr \$ra \# return
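
The call and return pair can be modelled with two tiny state updates. This is a sketch only: the machine state is a dictionary with hypothetical keys `pc` and `ra`, and the addresses are made up for the example.

```python
def jal(state, target):
    """Jump and link: remember the following instruction, then jump."""
    state["ra"] = state["pc"] + 4   # link to the instruction after the jal
    state["pc"] = target


def jr(state, reg):
    """Jump register: continue at the address held in the named register."""
    state["pc"] = state[reg]


state = {"pc": 0x00400000, "ra": 0}
jal(state, 0x00400100)              # call a procedure at a hypothetical address
link_after_call = state["ra"]
jr(state, "ra")                     # procedure return
pc_after_return = state["pc"]
```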

\section*{Basic Procedure Flow}
- For a procedure that computes the GCD of two values i (in \$t0) and j (in \$t1)
gcd(i,j);
\(\square\) The caller puts i and j (the parameter values) in \$a0 and \$a1 and issues a
jal gcd \# jump to routine gcd
\(\square\) The callee computes the GCD, puts the result in \$v0, and returns control to the caller using
gcd: . . .
jr \$ra \# return


\section*{Allocating Space on the Stack}
Figure (stack frame layout; only these labels are recoverable): high addr, saved argument regs (if any)
\(\square\) The segment of the stack containing a procedure's saved registers and local variables is its procedure frame (aka activation record)
- The frame pointer ( \(\$ f p\) ) points to the first word of the frame of a procedure providing a stable "base" register for the procedure
- \(\$ f p\) is initialized using \(\$ s p\) on a call and \(\$ \mathrm{sp}\) is restored using \(\$ \mathrm{fp}\) on a return

\section*{Allocating Space on the Heap}
\(\square\) There is a static data segment area for storing constants and other static variables (e.g., arrays)
- And a dynamic data segment (aka heap) area for structures that grow and shrink (e.g., linked lists)
- Allocate space on the heap with malloc() and free it with free() in C

\section*{Compiling a C Leaf Procedure}
- Leaf procedures are ones that do not call other procedures. Give the MIPS assembler code for
int leaf_ex (int g, int h, int i, int j)
\{ int f;
f = (g+h) - (i+j);
return f; \}
where g, h, i, and j are in \$a0, \$a1, \$a2, \$a3

\section*{Compiling a C Leaf Procedure}
\(\square\) Leaf procedures are ones that do not call other procedures. Give the MIPS assembler code for
int leaf_ex (int g, int h, int i, int j)
\{ int f;
f = (g+h) - (i+j);
return f; \}
where g, h, i, and j are in \$a0, \$a1, \$a2, \$a3
\begin{tabular}{llll}
leaf_ex: & addi & \$sp, \$sp, -8 & \# make stack room \\
& sw & \$t1, 4(\$sp) & \# save \$t1 on stack \\
& sw & \$t0, 0(\$sp) & \# save \$t0 on stack \\
& add & \$t0, \$a0, \$a1 & \\
& add & \$t1, \$a2, \$a3 & \\
& sub & \$v0, \$t0, \$t1 & \\
& lw & \$t0, 0(\$sp) & \# restore \$t0 \\
& lw & \$t1, 4(\$sp) & \# restore \$t1 \\
& addi & \$sp, \$sp, 8 & \# adjust stack ptr \\
& jr & \$ra &
\end{tabular}
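
The effect of the save and restore steps can be traced with a small Python model. It is a sketch under simplifying assumptions: registers are a dictionary, the stack is a dictionary keyed by byte address, 32-bit overflow is ignored, and the starting register values are invented for the example.

```python
def leaf_ex(regs, stack):
    """Mirror of the leaf_ex assembly, one statement per instruction."""
    regs["sp"] -= 8                              # addi $sp, $sp, -8
    stack[regs["sp"] + 4] = regs["t1"]           # sw   $t1, 4($sp)
    stack[regs["sp"]] = regs["t0"]               # sw   $t0, 0($sp)
    regs["t0"] = regs["a0"] + regs["a1"]         # add  $t0, $a0, $a1
    regs["t1"] = regs["a2"] + regs["a3"]         # add  $t1, $a2, $a3
    regs["v0"] = regs["t0"] - regs["t1"]         # sub  $v0, $t0, $t1
    regs["t0"] = stack[regs["sp"]]               # lw   $t0, 0($sp)
    regs["t1"] = stack[regs["sp"] + 4]           # lw   $t1, 4($sp)
    regs["sp"] += 8                              # addi $sp, $sp, 8


regs = {"sp": 0x7FFFFFF0, "t0": 111, "t1": 222,
        "a0": 5, "a1": 4, "a2": 3, "a3": 2, "v0": 0}
stack = {}
leaf_ex(regs, stack)
```

After the call, \$v0 holds (5+4)-(3+2) and the caller sees \$t0, \$t1, and \$sp exactly as it left them.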

5DV008 20091111 t:2C sl:35 Irwin CSE331 PSU

\section*{Nested Procedures}
\(\square\) What happens to return addresses with nested procedures?
int rt_1 (int i) \{
if (i == 0) return 0;
else return rt_2(i-1); \}
caller: jal rt_1
next: . . .
rt_1: bne \$a0, \$zero, to_2
add \$v0, \$zero, \$zero
jr \$ra
to_2: addi \$a0, \$a0, -1
jal rt_2
jr \$ra
rt_2: . . .

\section*{Nested Procedures Outcome}
```
caller: jal  rt_1
next:   . . .
rt_1:   bne  $a0, $zero, to_2
        add  $v0, $zero, $zero
        jr   $ra
to_2:   addi $a0, $a0, -1
        jal  rt_2
        jr   $ra
rt_2:   . . .
```
- On the call to rt_1, the return address (next in the caller routine) gets stored in \$ra. What happens to the value in \$ra (when i != 0) when rt_1 makes a call to rt_2?

\section*{Saving the Return Address, Part 1}
\(\square\) Nested procedures (i passed in \$a0, return value in \$v0)

\(\square\) Save the return address (and arguments) on the stack


\section*{Saving the Return Address, Part 2}
- Nested procedures (i passed in \$a0, return value in \$v0)
- Save the return address (and arguments) on the stack
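
Why the save is needed can be shown with a toy model of \$ra. This is a sketch, not the slide's code: return addresses are represented by label strings, the stack is a Python list, and the function names are hypothetical.

```python
def call(regs, return_label):
    """Model of jal: overwrite $ra with the place to come back to."""
    regs["ra"] = return_label


def rt_1_without_save(regs):
    """rt_1 calls rt_2 without saving $ra first."""
    call(regs, "after_jal_in_rt_1")   # jal rt_2 overwrites the caller's link
    return regs["ra"]                 # jr $ra would now jump here


def rt_1_with_save(regs, stack):
    """rt_1 saves $ra on the stack around its own call."""
    stack.append(regs["ra"])          # sw $ra onto the stack
    call(regs, "after_jal_in_rt_1")   # jal rt_2
    regs["ra"] = stack.pop()          # lw $ra back from the stack
    return regs["ra"]                 # jr $ra returns to the original caller


regs = {"ra": None}
call(regs, "next")                    # caller: jal rt_1
lost = rt_1_without_save(regs)

regs = {"ra": None}
call(regs, "next")                    # caller: jal rt_1
kept = rt_1_with_save(regs, [])
```

Without the save, rt_1 returns into itself instead of to next in the caller.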

5DV008 20091111 t:2C sl:40
Irwin CSE 331 PSU


\section*{Compiling a Recursive Procedure}
\(\square\) A procedure for calculating factorial
int fact (int n) \{
if (n \(<\) 1) return 1;
else return (n * fact (n-1)); \}
\(\square\) A recursive procedure (one that calls itself!)
fact (0) = 1
fact (1) = 1 * 1 = 1
fact (2) = 2 * 1 * 1 = 2
fact (3) = 3 * 2 * 1 * 1 = 6
fact (4) = 4 * 3 * 2 * 1 * 1 = 24
\(\square\) Assume n is passed in \$a0; result returned in \$v0



\section*{A Look at the Stack for \$a0 = 2, Part 1}

- Stack state after execution of the first encounter with jal (second call to fact routine with \$a0 now holding 1)
- saved return address to caller routine (i.e., location in the main routine where first call to fact is made) on the stack
- saved original value of \$a0 on the stack

\section*{A Look at the Stack for \$a0 = 2, Part 3}

- Stack state after execution of the first encounter with the first jr (\$v0 initialized to 1)
- stack pointer updated to point to third call to fact



\section*{A Look at the Stack for \$a0 = 2, Part 4}

- Stack state after execution of the first encounter with the second jr (return from fact routine after updating \$v0 to 1 * 1)
- return address to caller routine (bk_f in fact routine) restored to \$ra from the stack
- previous value of \$a0 restored from the stack
- stack pointer updated to point to second call to fact


\section*{A Look at the Stack for \$a0 = 2, Part 5}
\begin{tabular}{|c|c|}
\hline old TOS & \multirow[t]{9}{*}{\(\leftarrow\) \$sp} \\
\hline caller rt addr & \\
\hline \$a0 = 2 & \\
\hline bk_f & \\
\hline \$a0 = 1 & \\
\hline bk_f & \\
\hline \$a0 = 0 & \\
\hline & \\
\hline & \\
\hline caller rt addr & \$ra \\
\hline 2 & \$a0 \\
\hline 2 * 1 * 1 & \$v0 \\
\hline
\end{tabular}
- Stack state after execution of the second encounter with the second jr (return from fact routine after updating \$v0 to 2 * 1 * 1)
- return address to caller routine (main routine) restored to \$ra from the stack
- original value of \$a0 restored from the stack
- stack pointer updated to point to first call to fact
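
The stack contents listed in these slides can be reproduced with a short Python model. It is a sketch under an assumption: each call to fact pushes its return address and \$a0 on entry, which is inferred from the stack table (it shows \$a0 = 0 saved as well); the MIPS listing itself is not part of this text. The labels "caller rt addr" and "bk_f" are taken from the table; everything else is hypothetical naming.

```python
def fact(n, return_addr, stack, snapshots):
    """Recursive factorial that records the (return address, $a0) pairs it saves."""
    stack.append((return_addr, n))        # save $ra and $a0 on entry
    snapshots.append(list(stack))         # remember what the stack looked like
    if n < 1:
        stack.pop()                       # pop the frame; nothing needs restoring
        return 1                          # $v0 initialized to 1
    v0 = fact(n - 1, "bk_f", stack, snapshots)   # jal fact with $a0 = n - 1
    saved_ra, saved_a0 = stack.pop()      # restore $ra and $a0, pop the frame
    return saved_a0 * v0                  # $v0 = $a0 * $v0


stack, snapshots = [], []
result = fact(2, "caller rt addr", stack, snapshots)
deepest = snapshots[-1]                   # stack at the third call to fact
```

For \$a0 = 2 the deepest snapshot holds three frames, matching the six saved words in the table, and the stack is empty again once the first call returns.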

\section*{Review: MIPS Instructions, so far}
\begin{tabular}{|c|c|c|c|c|}
\hline Category & Instr & OpC & Example & Meaning \\
\hline \multirow[t]{11}{*}{Arithmetic (R\&I format)} & add & 0 \& 20 & add \$s1, \$s2, \$s3 & \$s1 = \$s2 + \$s3 \\
\hline & subtract & 0 \& 22 & sub \$s1, \$s2, \$s3 & \$s1 = \$s2-\$s3 \\
\hline & add immediate & 8 & addi \$s1, \$s2, 4 & \$s1 = \$s2 + 4 \\
\hline & shift left logical & 0 \& 00 & sll \$s1, \$s2, 4 & \$s1 \(=\) \$s2 <<4 \\
\hline & shift right logical & 0 \& 02 & srl \$s1, \$s2, 4 & \$s1 = \$s2 >> 4 (fill with zeros) \\
\hline & shift right arithmetic & 0 \& 03 & sra \$s1, \$s2, 4 & \$s1 = \$s2 >> 4 (fill with sign bit) \\
\hline & and & 0 \& 24 & and \$s1, \$s2, \$s3 & \$s1 = \$s2 \& \$s3 \\
\hline & or & 0 \& 25 & or \$s1, \$s2, \$s3 & \$s1 = \$s2 | \$s3 \\
\hline & nor & 0 \& 27 & nor \$s1, \$s2, \$s3 & \$s1 = not (\$s2 | \$s3) \\
\hline & and immediate & c & andi \$s1, \$s2, 0xff00 & \$s1 = \$s2 \& 0xff00 \\
\hline & or immediate & d & ori \$s1, \$s2, 0xff00 & \$s1 = \$s2 | 0xff00 \\
\hline
\end{tabular}

\section*{Review: MIPS Instructions, so far}
\begin{tabular}{|c|c|c|c|c|}
\hline Category & Instr & OpC & Example & Meaning \\
\hline \multirow[t]{2}{*}{Data transfer (I format)} & load word & 23 & lw \$s1, 100(\$s2) & \$s1 = Memory(\$s2+100) \\
\hline & store word & 2b & sw \$s1, 100(\$s2) & Memory(\$s2+100) = \$s1 \\
\hline \multirow[t]{4}{*}{Cond. branch (I \& R format)} & br on equal & 4 & beq \$s1, \$s2, L & if (\$s1==\$s2) go to L \\
\hline & br on not equal & 5 & bne \$s1, \$s2, L & if (\$s1 !=\$s2) go to L \\
\hline & set on less than immediate & a & slti \$s1, \$s2, 100 & if (\$s2 \(<\) 100) \$s1 = 1; else \$s1 = 0 \\
\hline & set on less than & 0 \& 2a & slt \$s1, \$s2, \$s3 & if (\$s2 \(<\) \$s3) \$s1 = 1; else \$s1 = 0 \\
\hline \multirow[t]{3}{*}{Uncond. jump} & jump & 2 & j 2500 & go to 10000 \\
\hline & jump register & 0 \& 08 & jr \$t1 & go to \$t1 \\
\hline & jump and link & 3 & jal 2500 & go to 10000; \$ra=PC+4 \\
\hline
\end{tabular}

5DV008 20091111 t:2C sl:55 Irwin CSE331 PSU

\section*{Review: MIPS R3000 ISA}
- Instruction Categories
- Load/Store
- Computational
- Jump and Branch
- Floating Point (coprocessor)
- Memory Management
- Special
\begin{tabular}{|c|}
\hline Registers \\
\hline R0 - R31 \\
\hline PC \\
\hline \hline HI \\
\hline LO \\
\hline
\end{tabular}

3 Instruction Formats: all 32 bits wide
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline 6 bits & 5 bits & 5 bits & 5 bits & 5 bits & 6 bits & \\
\hline OP & rs & rt & rd & shamt & funct & R format \\
\hline OP & rs & rt & \multicolumn{3}{|c|}{16 bit number} & I format \\
\hline OP & \multicolumn{5}{|c|}{26 bit jump target} & J format \\
\hline
\end{tabular}

5DV008 20091111 t:2C sl:56
Irwin CSE331 PSU

\section*{Atomic Exchange Support}
\(\square\) Need hardware support for synchronization mechanisms to avoid data races where the results of the program can change depending on how events happen to occur
- Two memory accesses from different threads to the same location, and at least one is a write
\(\square\) Atomic exchange (atomic swap) - interchanges a value in a register for a value in memory atomically, i.e., as one operation (instruction)
- Implementing an atomic exchange would require both a memory read and a memory write in a single, uninterruptible instruction. An alternative is to have a pair of specially configured instructions
```
ll $t1, 0($s1)    # load linked
sc $t0, 0($s1)    # store conditional
```

5DV008 20091111 t:2C sl:57

\section*{Atomic Exchange with ll and sc}

If the contents of the memory location specified by the ll are changed before the sc to the same address occurs, the sc fails (returns a zero)
```
try: add $t0, $zero, $s4   # $t0 = $s4 (exchange value)
     ll  $t1, 0($s1)       # load memory value to $t1
     sc  $t0, 0($s1)       # try to store exchange value to memory; on failure $t0 will be 0
     beq $t0, $zero, try   # try again on failure
     add $s4, $zero, $t1   # load value in $s4
```
\(\square\) If the value in memory changes between the ll and the sc instructions, then sc returns a 0 in \$t0, causing the code sequence to try again.
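
The retry behaviour can be illustrated with a toy memory model in Python. This is a sketch under strong assumptions: a single link register stands in for the hardware's reservation, interference from another thread is simulated by a callback, and the class and function names (`Memory`, `atomic_exchange`, `interfere`) are hypothetical.

```python
class Memory:
    """Word-addressed memory with one load-linked reservation."""

    def __init__(self):
        self.words = {}
        self.link_addr = None

    def ll(self, addr):
        """Load linked: read the word and remember the address."""
        self.link_addr = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):
        """Ordinary store (e.g., by another thread); breaks a matching link."""
        self.words[addr] = value
        if self.link_addr == addr:
            self.link_addr = None

    def sc(self, addr, value):
        """Store conditional: succeed (1) only if the link is still intact."""
        if self.link_addr != addr:
            return 0
        self.words[addr] = value
        self.link_addr = None
        return 1


def atomic_exchange(mem, addr, new_value, interfere=None):
    """Swap new_value into memory, retrying as the try: loop does."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.ll(addr)                     # ll $t1, 0($s1)
        if interfere is not None and attempts == 1:
            interfere()                        # another writer sneaks in once
        if mem.sc(addr, new_value):            # sc $t0, 0($s1); beq $t0, $zero, try
            return old, attempts


quiet = Memory()
quiet.words[0x100] = 5
quiet_result = atomic_exchange(quiet, 0x100, 9)

busy = Memory()
busy.words[0x100] = 5
busy_result = atomic_exchange(busy, 0x100, 9, interfere=lambda: busy.store(0x100, 7))
```

In the second run the first sc fails because of the intervening store, so the exchange returns the newer value 7 after two attempts.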

5DV008 20091111 t:2C sl:58 Irwin CSE431 PSU


5DV008 20091111 t:2C sl:59
Irwin CSE331 PSU

\section*{Addressing Modes Illustrated}

\section*{MIPS Organization So Far}
```

