# Comp 790-184: Hardware Security and Side-Channels

Lecture 4: Side-Channel Defenses

February 20, 2024 Andrew Kwong



THE UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

### Outline

- How to mitigate side-channel attacks
  - Non-interference property
  - Constant-time programming
  - Constant-time under speculation

### **Attack Examples**

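Example #1: termination time vulnerability

```
def check_password(input):
    size = len(password)
    for i in range(0, size):
        # Early exit: the loop stops at the first mismatching character,
        # so the running time reveals how many leading characters matched.
        if input[i] != password[i]:
            return "error"
    return "success"
```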

#### Example #2: RSA cache vulnerability



#### Example #3: Meltdown

...

Ld1: `uint8_t secret = *kernel_address;`

Ld2: `uint8_t dummy = probe_array[secret*64];`

# Who to blame? Who should fix the problem?



### **Software Developer's Problem**



### Software developers:

- Need to write software for devices with unknown design details.
- How can I know whether the program is secure running on different devices?







### Hardware Designer's Problem





Hardware designer:

- Need to design processors for arbitrary programs.
- How to describe what kind of programs can run securely on my device?

### **Example: Termination Time Vulnerability**

- How to fix it?

```
def check_password(input):
    size = len(password)
    for i in range(0, size):
        if input[i] != password[i]:
            return "error"
    return "success"
```

Make the computation time **independent** from the secret (password)
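One way to act on this is to drop the early return and use a comparison that examines every byte. The following is a minimal sketch, not a definitive fix: `PASSWORD` is a hypothetical placeholder for the stored secret, and `hmac.compare_digest` is the Python standard-library comparison designed to reduce timing leaks. The interpreter itself gives no hard constant-time guarantee, and a length mismatch may still be observable.

```python
import hmac

# Hypothetical stored secret, for illustration only.
PASSWORD = b"correct horse"

def check_password(candidate: bytes) -> str:
    # compare_digest does not short-circuit on the first mismatching byte,
    # so the comparison time does not depend on where the inputs differ.
    if hmac.compare_digest(candidate, PASSWORD):
        return "success"
    return "error"
```

The remaining branch only reveals whether the whole comparison succeeded, which the caller learns from the return value anyway.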

### **Non-Interference Example**



- Intuitively: not affecting
- Any sequence of **low** inputs will produce the same **low** outputs, regardless of what the **high** level inputs are.

- The definition of noninterference for a deterministic program P:

$$\forall M1, M2, P:\quad M1_{L} = M2_{L} \land (M1, P) \rightarrow^{*} M1' \land (M2, P) \rightarrow^{*} M2' \implies M1_{L}' = M2_{L}'$$

- The definition of noninterference for a deterministic program P, stated over observation traces O1 and O2:

$$\forall M1, M2, P:\quad M1_{L} = M2_{L} \land (M1, P) \xrightarrow{O1}{}^{*} M1' \land (M2, P) \xrightarrow{O2}{}^{*} M2' \implies O1 = O2$$

### What should be included in the observation trace?

## **Understand the Property**

$$\forall M1, M2, P:\quad M1_{L} = M2_{L} \land (M1, P) \xrightarrow{O1}{}^{*} M1' \land (M2, P) \xrightarrow{O2}{}^{*} M2' \implies O1 = O2$$

### Consider input as part of M

- What is $M_L$?
- What is $M_H$?
- What is $O$?

```
def check_password(input):
    size = len(password)
    for i in range(0, size):
        # Returns at the first mismatching character.
        if input[i] != password[i]:
            return "error"
    return "success"
```
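One possible reading of these questions for this example, offered as an assumption rather than the lecture's answer: $M_L$ is the attacker-supplied `input`, $M_H$ is `password`, and $O$ is the running time, approximated here by the number of loop iterations. The sketch below models the early-exit check (returning at the first mismatching character) and tests whether two different secrets yield the same observation for the same low input. The names `check_password_traced` and `observations_match` are hypothetical.

```python
def check_password_traced(candidate: str, password: str):
    """Early-exit check that also returns the number of loop iterations.

    Assumes candidate is at least as long as password.
    """
    steps = 0
    for i in range(len(password)):
        steps += 1
        if candidate[i] != password[i]:
            return "error", steps
    return "success", steps

def observations_match(low_input: str, high1: str, high2: str) -> bool:
    """Same low input, two different secrets: are the observations equal?"""
    _, o1 = check_password_traced(low_input, high1)
    _, o2 = check_password_traced(low_input, high2)
    return o1 == o2
```

With the low input `"aaaa"`, the secrets `"abcd"` and `"xbcd"` both produce `"error"` but after a different number of iterations, so under this reading the program violates the trace-based definition.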

### **Constant-Time Programming**

- For a fixed public input, the program takes the same amount of time on a given machine regardless of the secret values, and this must hold for every public input.

### Data-oblivious/Constant-time programming

- How do we deal with conditional branches/jumps?
- How do we deal with memory accesses?
- How do we deal with arithmetic operations: division, shift/rotation, multiplication?

Figure labels: Your Code, Compiler, Hardware.

For details on real-world constant-time crypto, see https://www.bearssl.org/constanttime.html

```
def check_password(input):
    size = len(password)
    for i in range(0, size):
        if input[i] != password[i]:
            return "error"
    return "success"
```





From the libsodium cryptographic library:

Compare two buffers x and y; return 0 if they match, otherwise return -1.

Examples from Cauligi et al. FaCT: A DSL for Timing-Sensitive Computation. PLDI'19
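The behavior described above (0 on match, -1 otherwise, with no early exit) can be sketched as follows. This is an illustration in the spirit of that description, not libsodium's source; `verify` is a hypothetical name, and Python offers no real timing guarantees.

```python
def verify(x: bytes, y: bytes) -> int:
    """Return 0 if x and y match, -1 otherwise, without an early exit."""
    if len(x) != len(y):
        raise ValueError("buffers must have the same length")
    d = 0
    for a, b in zip(x, y):
        # Accumulate every differing bit; d stays 0 only if all bytes match.
        d |= a ^ b
    # d is in 0..255. If d == 0, (d - 1) >> 8 is -1 and the result is 0;
    # otherwise (d - 1) >> 8 is 0 and the result is -1.
    return (1 & ((d - 1) >> 8)) - 1
```

The loop always visits every byte, so the number of iterations depends only on the buffer length, not on where the first difference occurs.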

### **Eliminate Secret-dependent Branches**

- An instruction: `cmov_` (the x86 conditional-move family, CMOVcc)
  - Checks the state of one or more of the status flags in the EFLAGS register (e.g., `cmovz` moves when ZF=1)
  - Performs the move if the flags are in the specified state
  - Otherwise, no move is performed and execution continues with the instruction following the `cmov` instruction
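The effect of a conditional move can be imitated in a high-level language with a bit mask instead of a branch. The sketch below is an assumption-laden illustration: `select` is a hypothetical helper, `cond` must be exactly 0 or 1, and Python makes no promise that the resulting operations run in constant time.

```python
def select(cond: int, a: int, b: int) -> int:
    """Return a if cond == 1, b if cond == 0, without branching on cond."""
    mask = -cond  # 1 -> all ones (-1), 0 -> 0
    # With mask == -1 this is b ^ a ^ b == a; with mask == 0 it is b.
    return b ^ (mask & (a ^ b))
```

For example, `select(1, 10, 20)` evaluates to 10 and `select(0, 10, 20)` to 20, and both paths execute the same sequence of operations.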

### **More Conditional Branches**



Potential problems:

- What if we have nested branches?
- What if f1 cannot safely execute when secret==0, e.g., because it causes a page fault or a divide by zero?
- What if f1 or f2 needs to write to memory, perform IO, make system calls?
- Hardware assumption: what if cmovz will be executed as soon as the flag is known (e.g., speculative execution)?
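The divide-by-zero concern can be illustrated with a small sketch. It assumes a hypothetical branchy original of the form "if secret is nonzero, compute 100 // secret, else 0", computes both outcomes unconditionally, and substitutes a harmless operand so the division never faults. The helpers `select` and `branchless` are illustrative names, and division may still have operand-dependent latency on real hardware.

```python
def select(cond: int, a: int, b: int) -> int:
    """Return a if cond == 1, b if cond == 0, using a mask instead of a branch."""
    mask = -cond
    return b ^ (mask & (a ^ b))

def branchless(secret: int) -> int:
    # In real constant-time code this flag would come from a branch-free test.
    is_zero = int(secret == 0)
    # Replace the faulting operand with a dummy value so both "branches"
    # can always be evaluated.
    safe = select(is_zero, 1, secret)
    r1 = 100 // safe   # result of the "secret != 0" side
    r2 = 0             # result of the "secret == 0" side
    return select(is_zero, r2, r1)
```

This pattern does not address the other bullets: side effects such as memory writes, IO, or system calls cannot simply be executed on both sides and discarded.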

### **Memory Accesses**

The secret-dependent access `a = buffer[secret]` becomes a scan over the whole buffer: `for (i=0; i<size; i++) { tmp = buffer[i]; xor secret, i; cmovz a, tmp }`

- Performance overhead.
- Techniques such as ORAM can reduce the overhead when the buffer is large
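The full-scan idea can be sketched in a high-level form. This is an illustration only: `oblivious_read` is a hypothetical name, the mask-based update stands in for `cmovz`, and Python does not guarantee constant-time execution.

```python
def oblivious_read(buffer, secret_index: int) -> int:
    """Read buffer[secret_index] while touching every element."""
    result = 0
    for i, tmp in enumerate(buffer):
        # mask is -1 (all ones) only when i equals the secret index.
        mask = -int((i ^ secret_index) == 0)
        # Keep result unless mask is set, in which case take tmp.
        result ^= mask & (result ^ tmp)
    return result
```

Every call performs `len(buffer)` reads regardless of the index, which is the source of the performance overhead noted above.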



### **An Optimization**

• We can reduce the redundant accesses by only accessing one byte in each cache line.

```
for (i=0; i<size; i++)
{
    tmp = buffer[i];
    xor secret, i
    cmovz a, tmp
}
```
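A sketch of the one-access-per-cache-line variant follows, under stated assumptions: a 64-byte line size (`LINE`), a buffer whose length is a multiple of `LINE`, and the hypothetical name `lookup_one_per_line`. Python cannot model the cache, so this only illustrates the index pattern.

```python
LINE = 64  # assumed cache-line size in bytes

def lookup_one_per_line(buffer: bytes, secret: int) -> int:
    """Touch one byte in every cache line; keep only the wanted one."""
    offset = secret % LINE        # position inside the line
    target_line = secret // LINE  # line that holds the wanted byte
    result = 0
    # Assumes len(buffer) is a multiple of LINE.
    for line in range(len(buffer) // LINE):
        tmp = buffer[line * LINE + offset]
        mask = -int(line == target_line)
        result ^= mask & (result ^ tmp)
    return result
```

Note that the offset within each line still depends on the secret, so an attacker with finer-than-cache-line resolution can still learn something.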



# **OpenSSL Patches Against Timing Channel**



CacheBleed: an attack that leaks SSL keys via L1 cache-bank conflicts.


Yarom et al. CacheBleed: A Timing Attack on OpenSSL Constant Time RSA. https://faculty.cc.gatech.edu/~genkin/cachebleed/index.html

### **Arithmetic Operations**

### Subnormal floating point numbers





### The Problem and A Solution



### **Constant-time ISA**

- Some efforts:
  - ARM Data Independent Timing (DIT)
  - Intel Data Operand Independent Timing (DOIT)

ARM DIT: https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/DIT--Data-Independent-Timing

Intel DOIT: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/data-operand-independent-timing-isa-guidance.html

### **Constant-time under Speculation**

- What problems arise?

### The Usage of Fences

### Meltdown

Ld1: `uint8_t secret = *kernel_address;`

Ld2: `uint8_t dummy = probe_array[secret*64];`

### Spectre v1

| Label | Code                     |
|-------|--------------------------|
| Br:   | `if (x < size_array1) {` |
| Ld1:  | `secret = array1[x]`     |
| Ld2:  | `y = array2[secret*64]`  |
|       | `}`                      |

### Spectre v2

### Spectre V2 Vulnerability (Branch Target Injection)



### Software fix: retpoline



https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/retpoline-branch-target-injection-mitigation.htm



| Variant          | Code |
|------------------|------|
| Before retpoline | `jmp *%rax` |
| After retpoline  | `call load_label`<br>`capture_ret_spec:`<br>`pause ; LFENCE`<br>`jmp capture_ret_spec`<br>`load_label:`<br>`mov %rax, (%rsp)`<br>`RET` |

### Adopted in Linux

### Intel eIBRS

eIBRS: Enhanced Indirect Branch Restricted Speculation. Isolates BTB entries across privilege levels. Advertised as a mitigation against Spectre v2.



**Listing 3** Linux implementation of the Spectre v2 mitigation before version 5.14 on Intel processors, depending on eIBRS hardware support. The example shown is the indirect jump responsible for executing the correct syscall handler stored in `sys_call_table`.

| Variant                           | Code |
|-----------------------------------|------|
| Syscall dispatch                  | `do_syscall_64:`<br>`...`<br>`mov rax, [sys_call_table + rax*8]`<br>`call __x86_indirect_thunk_rax` |
| With eIBRS support                | `__x86_indirect_thunk_rax:`<br>`jmp rax` |
| Without eIBRS support (retpoline) | `__x86_indirect_thunk_rax:`<br>`call B`<br>`A: pause`<br>`lfence`<br>`jmp A`<br>`B: mov [rsp], rax`<br>`ret` |

Barberis et al. Branch History Injection: On the Effectiveness of Hardware Mitigations Against Cross-Privilege Spectre-v2 Attacks. USENIX'22 https://www.vusec.net/projects/bhi-spectre-bhb/

### **Vulnerabilities of Intel eIBRS**



What security property does eIBRS provide exactly? What does the so-called "isolation" mean? Non-interference?

**Lesson:** we should not communicate security properties in terms of gadget patterns. Instead, we want clearly defined contracts.


