### Introduction to Computer Architecture: exam

#### R. Pacalet

#### 2023-11-30

The text in black is the original one. The text in red is examples of the expected correct answers. Only this text was expected, possibly in shorter form, nothing more. The text in blue is extra comments about the expected correct answers. Warning: the course changes frequently (content, vocabulary, examples...); some questions and answer proposals can thus be partly or completely out of scope. Warning: some questions can be answered in many different ways; the proposed answers are just examples and they are not exhaustive.

You can use any document but communicating devices are strictly forbidden. Please number the different pages of your paper and indicate on each page your first and last names. You can write your answers in French or in English, as you wish. Precede your answers with the question's number. If some information or hypotheses are missing to answer a question, add them. If you consider a question as absurd and thus decide to not answer, explain why. If you do not have time to answer a question but know how to, briefly explain your ideas. Note: copying verbatim the slides of the lectures or any other provided material is not considered as a valid answer. Advice: quickly go through the document and answer the easy parts first.

### 1 CMOS logic (6 points)

In the CMOS logic gate of Figure 1 a, b, c are the 3 inputs, x is the output.



Figure 1: CMOS schematic of logic gate

- 1. Write its truth table, that is, the table giving the value of x for each of the 8 value combinations of a, b and c.
- 2. Write the corresponding boolean equation using only symbols a, b, c, parentheses and the boolean operators NOT, OR, AND. Do not assume precedence between boolean operators, use parentheses to make your equation non ambiguous. Example of non ambiguous boolean equation: NOT(((NOT a) AND c) OR b).
- 1. The truth table is:

| $\overline{a}$ | b | c | $\overline{x}$ |
|----------------|---|---|----------------|
| 0              | 0 | 0 | 0              |
| 0              | 0 | 1 | 0              |
| 0              | 1 | 0 | 0              |
| 0              | 1 | 1 | 1              |
| 1              | 0 | 0 | 0              |
| 1              | 0 | 1 | 1              |
| 1              | 1 | 0 | 1              |
| 1              | 1 | 1 | 1              |

Building the truth table is straightforward after we observe that there are 2 logic gates in one: the two rigthmost transistors form an inverter that inverts the output of the left part. We can thus first express the condition on the inputs values for the output of the left gate to be 0, that is, the condition for the network of N transistors to be passing: (a AND (b OR c)) OR (b and c). Next we fill the table with the inverse (because of the inverter on the right), that is, set x to 1 when the condition is true, 0 when it is false.

2. x = (a AND (b OR c)) OR (b AND c)

#### 2 RISC-V assembly (8 points)

In this question we use RV32IM, the RISC-V Instruction Set Architecture (ISA) and the ILP32 Application Binary Interface (ABI) seen during lectures and labs. Reminder: the size of a stack frame **must** be at least 32 bytes and **must** be a multiple of 32 bytes; the general purpose **saved** registers are **sp**, **gp**, **tp**, **s0**, **s1**, ..., **s11**. Use the provided RISC-V cheat sheet if you don't remember the RV32IM ISA or the ILP32 ABI.

Functions **foo** and **bar** take a 32 bits input and return a 32 bits result. **bar** has already been coded in RV32IM assembly. It fully complies with the ILP32 ABI but you **do not know how it has been coded**: it possibly modifies **any** non-saved general purpose register (except, of course, the **zero** register).

foo calls bar twice: first on its input parameter and then on the result. It returns the bitwise AND of the 2 bar results. The pseudo-code of foo could be the following, in which val represents the input of foo:

```
foo(val) {
```

```
tmp = bar(val)
return tmp AND bar(tmp)
}
```

Write the code of function **foo** in RV32IM assembly. Your code **must** fully comply with the ILP32 ABI. Comment each instruction on the same line after a **#** sign as in the following example:

```
t0, t1, t2 + t0 = t1 + t2
.text
foo:
                    # allocate 32 bytes stack frame
  addi sp,sp,-32
       ra,0(sp)
                      save ra in stack frame
       s0,4(sp)
                      save s0 in stack frame
  SW
  jal
       ra,bar
                    # a0 = bar(val)
                    # s0 = bar(val)
       s0,a0
 mv
                    #
                      a0 = bar(bar(val))
  jal
       ra,bar
       a0,s0,a0
                    #
                      a0 =
                           bar(val) AND bar(bar(val))
  and
 lw
       s0,4(sp)
                      restore s0 from stack frame
                      restore ra from stack frame
 lw
       ra,0(sp)
  addi sp, sp, 32
                    # restore sp
  jalr zero,0(ra)
                    # return
```

As for any function we first allocate a stack frame to store the return address (in register ra) and all saved register that we intend to use (register s0 in our case). 32 bytes are more than the needed 8 bytes but the recommendation is that the size of the stack frames shall be a multiple of 32 bytes. Note: in the real ABI the recommendation is 16 bytes, not 32. As always the stack grows towards the low addresses so we subtract 32 from the stack pointer (register sp). sp now contains the base address of the new stack frame.

Next we save registers **ra** and **s0** in the stack frame, respectively at offset 0 and 4 from the base address of the new stack frame.

Next we call function bar. The input parameter is already in register a0 because it is the same as the input parameter of function foo and we did not modify it since the entry in foo. The call is simply jal ra, bar, or the equivalent pseudo-instruction call bar, that stores the return address in ra and jumps at the first instruction of bar. After the function returns we must save the result (also in register a0) because we will need it for the final computation while the second call to bar will overwrite register a0. So, we copy it in saved register s0. We cannot use a temporary register or one of the a0, ..., a7 registers because these are not saved registers and the call to bar could modify them. This is the reason why we chose s0, and also why we first saved this register in the stack frame: we modify its content so we must be able to restore it before we return.

Next we call function **bar** a second time. Again, the input parameter is already in register **a0** because it is the result of the first call and we did not modify it. After the function returns the result is in register **a0**, the result of the first call is in register **s0** (and the call did not modify it because **s0** is a saved register).

We thus simply compute the bitwise AND of **a0** and **s0**, and store the result in **a0**, because it is where any function like **foo** must store its result.

The final part simply consists in restoring **ra** and **s0** from the stack frame, restoring **sp** by adding the opposite of what was subtracted when entering the function (32), and returning with **jalr zero,0(ra)** or the equivalent pseudo-instruction **ret**.

Note: instead of using  ${\tt s0}$  to store the result of the first call we could have saved it in the stack frame. There would be no need to save, use, and restore  ${\tt s0}$ . This new version has 1 instruction less than the other:

```
.text
foo:
                    # allocate 32 bytes stack frame
  addi sp, sp, -32
       ra,0(sp)
                      save ra in stack frame
  SW
  jal
       ra,bar
                      a0 = bar(val)
       a0,4(sp)
                     save a0 in stack frame
  SW
       ra,bar
                      a0 = bar(bar(val))
  jal
  lw
       t0,4(sp)
                      restore first result in t0 from stack frame
  and
       a0,t0,a0
                      a0 = bar(val) AND bar(bar(val))
 ٦w
       ra,0(sp)
                      restore ra from stack frame
  addi sp, sp, 32
                      restore sp
  jalr zero,0(ra)
                    # return
```

#### 3 Binary representation of numbers (6 points)

All given integer values are in base 10.

- 1. We want to represent -256, +267, -169 and +63 in 2's complement on the same number of bits n. What is the minimum value of n (only one answer)?
- 2. Write the *n* bits 2's complement representation of -256, +267, -169 and +63.
- 3. We consider a 32 bits signed integer A which 2's complement representation  $a_{31}a_{30}\ldots a_1a_0$  is stored in RISC-V general purpose register t0. We shift t0 to the left by 2 positions with the RISC-V instruction slli t0,t0,2 and denote B the new t0 content, considered as a 32 bits signed integer in 2's complement. Under what condition on A do we have  $B = 4 \times A$ ?

```
1. n = 10
2. 1100000000, 0100001011, 1101010111 and 0000111111
3. a_{31} = a_{30} = a_{29}, that is, -2^{29} \le A < 2^{29}
```

- 1. n=10 because +267 has the largest absolute value of the 4 numbers, and 9 bits can only represent numbers in the  $[-2^8 \dots 2^8 1]$  range, that is,  $[-256 \dots 255]$ . This is not enough while with 10 bits we can represent numbers in the  $[-2^9 \dots 2^9 1]$  range, that is,  $[-512 \dots 511]$ .
- 2. Converting from decimal to 2's complement on 10 bits simply consists in decomposing the absolute values in powers of 2, and, for the negative numbers, taking the one's complement and adding 1. Example for -256:

 $256 = 2^8 = 0100000000_2 \rightarrow 101111111111 + 1 = 11000000000$ . Example for 267:  $267 = 256 + 8 + 2 + 1 = 2^8 + 2^3 + 2^1 + 2^0 = 0100001011_2$ .

3. The value of A is:

$$A = -2^{31} \times a_{31} + \sum_{i=0}^{i=30} 2^i \times a_i$$

When shifting to the left, the two leftmost bits  $(a_{31} \text{ and } a_{30})$  are dropped and two 0 bits enter on the right. So, the value of B is:

$$B = -2^{31} \times a_{29} + \sum_{i=0}^{i=28} 2^{i+2} \times a_i$$

$$= 2^2 \times \left( -2^{29} \times a_{29} + \sum_{i=0}^{i=28} 2^i \times a_i \right)$$

$$= 2^2 \times \left( A + 2^{31} \times a_{31} - 2^{30} \times a_{30} - 2^{29} \times a_{29} - 2^{29} \times a_{29} \right)$$

$$= 2^2 \times \left( A + 2^{31} \times a_{31} - 2^{30} \times a_{30} - 2^{30} \times a_{29} \right)$$

So,  $B=4\times A$  if and only is  $2^{31}\times a_{31}-2^{30}\times a_{30}-2^{30}\times a_{29}=0$ , that is,  $a_{31}=a_{30}=a_{29}$ . The 3 leftmost (most significant) bits of A must be equal. We can thus rewrite A as:

$$A = (-2^{31} + 2^{30} + 2^{29}) \times a_{29} + \sum_{i=0}^{i=28} 2^{i} \times a_{i}$$
$$= -2^{29} \times a_{29} + \sum_{i=0}^{i=28} 2^{i} \times a_{i}$$
$$\Rightarrow -2^{29} \le A < 2^{29}$$

Said differently,  $B=4\times A$  if and only if the 32 bits number A would also fit on 30 bits only.

# RISC-V Instruction-Set

Erik Engheim <erik.engheim@ma.com

## Arithmetic Operation

| Description | rd ← rs1 + rs2   | rd ← rs1 - rs2   | rd ← rs1 + imm12    | rd ← rs1 < rs2 ? 1 : 0 | rd ← rs1 < imm12 ? 1 : 0   | rd ← rs1 < rs2 ? 1 : 0 | rd ← rs1 < imm12 ? 1 : 0            | rd ← imm20 << 12     | rd ← PC + imm20 << 12        |
|-------------|------------------|------------------|---------------------|------------------------|----------------------------|------------------------|-------------------------------------|----------------------|------------------------------|
| Туре        | œ                | œ                | -                   | œ                      | _                          | œ                      | _                                   | ם                    | n                            |
| Instruction | Add              | Subtract         | Add immediate       | Set less than          | Set less than<br>immediate | Set less than unsigned | Set less than<br>immediate unsigned | Load upper immediate | Add upper immediate<br>to PC |
| Mnemonic    | ADD rd, rs1, rs2 | SUB rd, rs1, rs2 | ADDI rd, rs1, imm12 | SLT rd, rs1, rs2       | SLTI rd, rs1, imm12        | SLTU rd, rs1, rs2      | SLTIU rd, rs1, imm12                | LUI rd, imm20        | AUIP rd, imm20               |

### Logical Operations

| Mnemonic            | Instruction                      | Type | Description       |
|---------------------|----------------------------------|------|-------------------|
| AND rd, rs1, rs2    | AND                              | ~    | rd ← rs1 & rs2    |
| OR rd, rs1, rs2     | OR                               | œ    | rd ← rs1   rs2    |
| XOR rd, rs1, rs2    | XOR                              | œ    | rd ← rs1 ^ rs2    |
| ANDI rd, rs1, imm12 | AND immediate                    | _    | rd ← rs1 & imm12  |
| ORI rd, rs1, imm12  | OR immediate                     | -    | rd ← rs1   imm12  |
| XORI rd, rs1, imm12 | XOR immediate                    | _    | rd ← rs1 ^ imm12  |
| SLL rd, rs1, rs2    | Shift left logical               | œ    | rd ← rs1 << rs2   |
| SRL rd, rs1, rs2    | Shift right logical              | œ    | rd ← rs1 >> rs2   |
| SRA rd, rs1, rs2    | Shift right arithmetic           | œ    | rd ← rs1 >> rs2   |
| SLLI rd, rs1, shamt | Shift left logical<br>immediate  | _    | rd ← rs1 << shamt |
| SRLI rd, rs1, shamt | Shift right logical imm.         | _    | rd ← rs1 >> shamt |
| SRAI rd, rs1, shamt | Shift right arithmetic immediate | _    | rd ← rs1 >> shamt |

### Branching

S

Store byte

S

Store halfword

SH rs2, imm12(rs1) SB rs2, imm12(rs1)

| Description | if rs1 = rs2<br>pc ← pc + inm12 | if rs1 ≠ rs2<br>pc ← pc + imm12 | if rs1 ≥ rs2<br>pc ← pc + imm12 | if rs1 >= rs2<br>pc ← pc + imm12         | if rs1 < rs2<br>pc ← pc + imm12 | if rs1 < rs2<br>pc ← pc + imm12 << 1 | rd ← pc + 4<br>pc ← pc + imm20 | rd ← pc + 4<br>pc ← rs1 + imm12 |
|-------------|---------------------------------|---------------------------------|---------------------------------|------------------------------------------|---------------------------------|--------------------------------------|--------------------------------|---------------------------------|
| Type        | SS                              | 8S                              | 88                              | 88                                       | 88                              | 88                                   | 3                              | -                               |
| Instruction | Branch equal                    | Branch not equal                | Branch greater than or equal    | Branch greater than or<br>equal unsigned | Branch less than                | Branch less than<br>unsigned         | Jump and link                  | Jump and link register          |
| Mnemonic    | BEQ rs1, rs2, imm12             | BNE rs1, rs2, imm12             | BGE rs1, rs2, imm12             | BGEU rs1, rs2, imm12                     | BLT rs1, rs2, imm12             | BLTU rs1, rs2, imm12                 | JAL rd, imm20                  | JALR rd, imm12(rs1)             |

# 32-bit instruction format

| 0                                                                                |        |           |           |           |
|----------------------------------------------------------------------------------|--------|-----------|-----------|-----------|
| -                                                                                |        |           |           |           |
| 2                                                                                | de     | de        | de        | de        |
| 3                                                                                | opcodo | obcode    | opcode    | opcode    |
| 4                                                                                | ٥      | 0         | 0         |           |
| 2                                                                                |        |           |           |           |
| 9                                                                                |        |           |           |           |
| 7                                                                                |        |           |           |           |
| 8                                                                                |        |           | immediate |           |
| 6                                                                                | 5      | 2         | med       | 5         |
| 10                                                                               |        |           | <u>=</u>  |           |
| Ξ                                                                                |        |           |           |           |
| 12                                                                               |        |           |           |           |
| 13                                                                               | func   | func      | func      |           |
| 4                                                                                | _      | _         | _         |           |
| 15                                                                               |        |           |           |           |
| 16                                                                               |        |           |           |           |
| 17                                                                               | rs1    | rs1       | rs1       |           |
| 18                                                                               |        |           |           |           |
| 19                                                                               |        |           |           |           |
| 20                                                                               |        |           |           |           |
| 21                                                                               |        |           |           |           |
| 22                                                                               | rs2    |           | rs2       |           |
| 23                                                                               |        |           |           |           |
| 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 |        |           |           |           |
| 25                                                                               |        |           |           |           |
| 56                                                                               |        |           |           |           |
| 27                                                                               |        | ate       | ate       | ate       |
| 28                                                                               | func   | edië      | edië      | edi       |
| 59                                                                               | Ψ.     | immediate | immediate | immediate |
| 30                                                                               |        |           |           |           |
|                                                                                  |        |           |           |           |

3 SB

# Pseudo Instructions

Load / Store Operations

Туре

| Mnemonic              | Instruction                       | Base instruction(s)                                  |
|-----------------------|-----------------------------------|------------------------------------------------------|
| LI rd, imm12          | Load immediate (near)             | ADDI rd, zero, imm12                                 |
| LI rd, imm            | Load immediate (far)              | LUI rd, imm[31:12]<br>ADDI rd, rd, imm[11:0]         |
| LA rd, sym            | Load address (far)                | AUIPC rd, sym[31:12]<br>ADDI rd, rd, sym[11:0]       |
| MV rd, rs             | Copy register                     | ADDI rd, rs, 0                                       |
| NOT rd, rs            | One's complement                  | XORI rd, rs, -1                                      |
| NEG rd, rs            | Two's complement                  | SUB rd, zero, rs                                     |
| BGT rs1, rs2, offset  | Branch if rs1 > rs2               | BLT rs2, rs1, offset                                 |
| BLE rs1, rs2, offset  | Branch if rs1 ≤ rs2               | BGE rs2, rs1, offset                                 |
| BGTU rs1, rs2, offset | Branch if rs1 > rs2<br>(unsigned) | BLTU rs2, rs1, offset                                |
| BLEU rs1, rs2, offset | Branch if rs1 ≤ rs2<br>(unsigned) | BGEU rs2, rs1, offset                                |
| BEQZ rs1, offset      | Branch if rs1 = 0                 | BEQ rs1, zero, offset                                |
| BNEZ rs1, offset      | Branch if rs1 ≠ 0                 | BNE rs1, zero, offset                                |
| BGEZ rs1, offset      | Branch if rs1 ≥ 0                 | BGE rs1, zero, offset                                |
| BLEZ rs1, offset      | Branch if rs1 ≤ 0                 | BGE zero, rs1, offset                                |
| BGTZ rs1, offset      | Branch if rs1 > 0                 | BLT zero, rs1, offset                                |
| J offset              | Unconditional jump                | JAL zero, offset                                     |
| CALL offset12         | Call subroutine (near)            | JALR ra, ra, offset12                                |
| CALL offset           | Call subroutine (far)             | AUIPC ra, offset[31:12]<br>JALR ra, ra, offset[11:0] |
| RET                   | Return from subroutine            | JALR zero, 0(ra)                                     |
| NOP                   | No operation                      | ADDI zero, zero, 0                                   |
|                       |                                   |                                                      |

rs2(31:0) → mem[rs1 + imm12] rs2(15:0) → mem[rs1 + imm12] rs2(7:0) - mem[rs1 + imm12]

S S

Store word

SD rs2, imm12(rs1) SW rs2, imm12(rs1)

rs2 → mem[rs1 + inm12]

rd ← mem[rs1 + imm12] rd ← mem[rs1 + imm12] rd ← mem[rs1 + imm12]

Load word unsigned

\_ \_

Load halfword unsigned

Load byte unsigned Store doubleword

LBU rd, imm12(rs1)

rd ← mem[rs1 + imm12] rd ← mem[rs1 + imm12] rd ← mem[rs1 + imm12] rd ← mem[rs1 + imm12]

Load doubleword Instruction

Load halfword Load word

LW rd, imm12(rs1) LH rd, imm12(rs1) LB rd, imm12(rs1) LWU rd, imm12(rs1) LHU rd, imm12(rs1)

LD rd, imm12(rs1)

Load byte

### Register File

б 5 11 r15 ۲19 r23 r27

5 9

٤ ñ 6 13

9 5 8

r10 114 18 r22 r26 730

### Register Aliases

| ra - return address | <ul><li>sp - stack pointer</li><li>gp - global pointer</li></ul> | <b>tp</b> - thread pointer |    | ;  | <b>10 - 16</b> - Temporary re<br><b>50 - 511</b> - Saved by ca | a0 - a1 - Return value |    |
|---------------------|------------------------------------------------------------------|----------------------------|----|----|----------------------------------------------------------------|------------------------|----|
| В                   | t2                                                               | a1                         | a5 | 53 | 57                                                             | 511                    | t6 |
| g.                  | Ŧ                                                                | 9e                         | 94 | 52 | 95                                                             | 510                    | 45 |
| ē                   | 40                                                               | 51                         | a3 | a7 | 55                                                             | 65                     | 44 |
| zero                | tР                                                               | s@/fp                      | a2 | 96 | 54                                                             | 8%                     | £3 |
|                     |                                                                  |                            |    |    |                                                                |                        |    |

r17 r21 r25

r16 r20

۲12

<sup>- 17 -</sup> Function arguments

Ę

r28 r29

r24