Changes

V4:Tutorial A7 Glitch Buffer Attacks

18,281 bytes added, 13:37, 29 July 2019
Created page with "{{Warningbox|This tutorial has been updated for ChipWhisperer 4.0.0 release. If you are using 3.x.x see the "V3" link in the sidebar.}} {{Infobox tutorial |name..."
{{Warningbox|This tutorial has been updated for ChipWhisperer 4.0.0 release. If you are using 3.x.x see the "V3" link in the sidebar.}}

{{Infobox tutorial
|name = A7: Glitch Buffer Attacks
|image =
|caption =
|software versions =
|capture hardware = CW-Lite
|Target Device =
|Target Architecture = XMEGA/Arm
|Hardware Crypto = No
|Purchase Hardware =
}}


This tutorial discusses a specific type of glitch attack. It shows how a simple printing loop can be abused, causing a target to print some otherwise private information. This attack will be used to recover a plaintext without any knowledge of the encryption scheme being used.

== Background ==
This section introduces the attack concept by showing some real world examples of vulnerable firmware. Then, it describes the victim firmware that will be used in this tutorial.

=== Real Firmware ===
Typically, one of the slowest parts of an embedded system is its communication lines. It's pretty common to see a processor running in the MHz range with a serial connection of 96k baud. To make these two different speeds work together, embedded firmware usually fills up a buffer with data and lets a serial driver print on its own time. This setup means we can expect to see code like
<pre>
for(int i = 0; i < number_of_bytes_to_print; i++)
{
print_one_byte_to_serial(buffer[i]);
}
</pre>

This is a pretty vulnerable piece of C. Imagine that we could sneak into the source code and change it to
<pre>
for(int i = 0; i < really_big_number; i++)
{
print_one_byte_to_serial(buffer[i]);
}
</pre>
C compilers don't care that <code>buffer[]</code> has a limited size - this loop will happily print every byte it comes across, which could include other variables, registers, and even source code. Although we probably don't have a good way of changing the source code on the fly, we do have glitches: a well-timed clock or power glitch could let us skip the <code>i < number_of_bytes_to_print</code> check, which would have the same result.

How could this be applied? Imagine that we have an encrypted firmware image that we're going to transmit to a bootloader. A typical communication process might look like:
# We send the encrypted image ciphertexts over a serial connection
# The bootloader decrypts the ciphertexts and stores the result somewhere in memory
# The bootloader sends back a response over the serial port
We have a pretty straightforward attack for this type of bootloader. During the last step, we'll apply a glitch at precisely the right time, causing the bootloader to print all kinds of things to the serial connection. With some luck, we'll be able to find the decrypted plaintext somewhere in this memory dump.

== Bootloader Setup ==
For this tutorial, a very simple bootloader using the SimpleSerial protocol has been set up. The source for this bootloader can be found in <code>chipwhisperer/hardware/victims/firmware/bootloader-glitch</code>. The following commands are used:
* <code>pABCD\n</code>: Send an encrypted ciphertext to the bootloader. For example, this message is made up of the two bytes <code>AB</code> and <code>CD</code>.
* <code>r0\n</code>: The reply from the bootloader. Acknowledges that a message was received. No other responses are used.
* <code>x</code>: Clear the bootloader's received buffer.
* <code>k</code>: See <code>x</code>.

The bootloader uses triple-ROT-13 encryption to encrypt/decrypt the messages. To help you send messages to the target, the script <code>private/encrypt.py</code> prints the SimpleSerial command for a given fixed string. For example, the ciphertext for the string <code>Don't forget to buy milk!</code> is
<pre>
p516261276720736265747267206762206f686c207a76797821\n
</pre>

This folder also contains a Makefile to create a hex file for use with the ChipWhisperer hardware. The build process is the same as the previous tutorials: run <code>make</code> from the command line and make sure that everything built properly. If all goes well, the Makefile should print something like
<pre>
----------------
Device: atxmega128d3

Program: 1706 bytes (1.2% Full)
(.text + .data + .bootloader)

Data: 248 bytes (3.0% Full)
(.data + .bss + .noinit)


Built for platform CW-Lite XMEGA

-------- end --------
</pre>

== The Attack Plan ==
Since we have access to the source code, let's take our time and understand how our attack is going to work before we dive in.

=== The Sensitive Code ===
Inside <code>bootloader.c</code>, there are two buffers that are used to store most of the important data. The source code shows:
<pre>
#define DATA_BUFLEN 40
#define ASCII_BUFLEN (2 * DATA_BUFLEN)

uint8_t ascii_buffer[ASCII_BUFLEN];
uint8_t data_buffer[DATA_BUFLEN];
</pre>
This tells us that there will be two arrays stored somewhere in the target's memory. The AVR-GCC compiler doesn't usually try too hard to move these around, so we can expect to find them back-to-back in memory; that is, if we can read past the end of the ASCII buffer, we'll probably find the data buffer.

Next, the code used to print a response to the serial port is
<pre>
if(state == RESPOND)
{
// Send the ascii buffer back
trigger_high();

int i;
for(i = 0; i < ascii_idx; i++)
{
putch(ascii_buffer[i]);
}
trigger_low();
state = IDLE;
}
</pre>
This looks very similar to the example code given in the previous section, so it should be vulnerable to a glitching attack. The goal is to cause the loop to continue past its regular limit: <code>data_buffer[0]</code> is the same as <code>ascii_buffer[80]</code>, so a successful glitch should dump the data buffer for us.

=== Disassembly ===
As a final step, let's check the assembly code to see exactly what we're trying to glitch through. In the same folder as the hex file you built, open the <code>*.lss</code> file that corresponds to your target (for example, <code>bootloader-CWLITEARM.lss</code>). This is called a listing file, and it contains a bunch of debug and assembly information. Most importantly, it will allow us to easily match the source code to what's in our hex file. Search the file for something close to the vulnerable loop, such as <code>state == RESPOND</code>.

==== XMEGA Disassembly ====
After searching the file, you should see something like this:<pre>
376: 89 91 ld r24, Y+
378: 0e 94 06 02 call 0x40c ; 0x40c
37c: f0 e2 ldi r31, 0x20 ; 32
37e: cf 37 cpi r28, 0x7F ; 127
380: df 07 cpc r29, r31
382: c9 f7 brne .-14 ; 0x376
</pre>
This is our printing loop in assembly. It has the following steps in it:
* Look at the address <code>Y</code> and put the contents into <code>r24</code>. Increase the address stored in <code>Y</code>. (This is the <code>i++</code> in the loop.)
* Call the function in location <code>0x40c</code>. Presumably, this is the location of the <code>putch()</code> function.
* Compare <code>r28</code> and <code>r29</code> to <code>0x7F</code> and <code>0x20</code>. Unless they're equal, go back to the top of the loop.
There's one quirk to notice in this code. In the C source, the for loop checks whether <code>i < ascii_idx</code>. However, in the assembly code, the check is effectively whether <code>i == ascii_idx</code>! This is even easier to glitch - as long as we can break past the <code>brne</code> instruction ''once'', we'll get to the data buffer.

==== Arm Disassembly ====
After searching the file, you should find:<syntaxhighlight lang="asm">
if(state == RESPOND)
{
// Send the ascii buffer back
trigger_high();
8000258: f000 f8da bl 8000410 <trigger_high>

int i;
for(i = 0; i < ascii_idx; i++)
{
putch(ascii_buffer[i]);
800025c: 7828 ldrb r0, [r5, #0]
800025e: f000 f8f7 bl 8000450 <putch>
8000262: 7868 ldrb r0, [r5, #1]
8000264: f000 f8f4 bl 8000450 <putch>
8000268: 78a8 ldrb r0, [r5, #2]
800026a: f000 f8f1 bl 8000450 <putch>
}
</syntaxhighlight>This is our printing "loop", but as we can see, it's not really a loop at all! As it turns out, the compiler has unrolled our loop, which typically speeds up execution at the expense of code size. It will be very difficult to prevent this with such a small loop, so instead we'll increase the loop size by adding some extra newlines at the end of the message:<syntaxhighlight lang="c">
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
</syntaxhighlight>If you want to use the script at the bottom, you should add the 5 newlines above, but other values will work fine with different glitch parameters. Recompile and take another look at the listing file. You should see that our printing loop is actually a loop now:<syntaxhighlight lang="c">
if(state == RESPOND)
{
// Send the ascii buffer back
trigger_high();
8000262: f000 f8d7 bl 8000414 <trigger_high>

int i;
for(i = 0; i < ascii_idx; i++)
8000266: 2400 movs r4, #0
{
putch(ascii_buffer[i]);
8000268: 5d28 ldrb r0, [r5, r4]
for(i = 0; i < ascii_idx; i++)
800026a: 3401 adds r4, #1
putch(ascii_buffer[i]);
800026c: f000 f8f2 bl 8000454 <putch>
for(i = 0; i < ascii_idx; i++)
8000270: 2c08 cmp r4, #8
8000272: d1f9 bne.n 8000268 <main+0x64>
</syntaxhighlight>We can break this assembly down into the following steps (starting with address 0x8000268):
* Load the character we want to print into <code>r0</code> (<code>r5</code> contains the address of <code>ascii_buffer</code>, while <code>r4</code> is our loop index <code>i</code>)
* Add 1 to <code>r4</code> (<code>i</code>)
* Call <code>putch</code>
* Compare <code>r4</code> to 8 (8 is always the value of <code>ascii_idx</code>)
* Branch back to the beginning of the loop if <code>r4</code> and 8 aren't equal
There's one quirk to notice in this code. In the C source, the for loop checks whether <code>i < ascii_idx</code>. However, in the assembly code, the check is effectively whether <code>i == ascii_idx</code>! This is even easier to glitch - as long as we can break past the <code>brne</code> instruction ''once'', we'll get to the data buffer.

== Attack Script & Results ==

=== XMEGA Results ===
To speed up the tutorial, the script in [[#Appendix: Setup Script]] will open the ChipWhisperer Capture software and fill in all of the appropriate settings. Copy this code into a Python script and run it. Then, open the serial terminal and connect to the target, using the ASCII with Hex display mode. If everything is set up correctly, the Capture 1 button should cause the text <code>r0</code> to appear in the terminal. This is the bootloader's response to a block of ciphertext.

Once this is set up, connect the glitch module's output to the target's clock. Do this by changing the <code>Target HS IO-Out</code> to <code>Glitch Module</code>. Try to Capture 1 again and watch the serial terminal. If you're lucky, a large amount of text will appear in this window:
<pre>
r0
261276720736265747267206762206f686c207a767978210000000000000000000000000000000000000000000000
00000000000000Don't forget to buy milk!000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
...
<many more lines omitted>
</pre>
In the middle of this output, the plaintext is clearly visible! The data buffer has successfully been printed to the serial port, allowing us to see the decrypted text with no knowledge of the algorithm.

If you can't get this to work, remember that glitching is a very sensitive operation - one glitch timing will probably not work for every board on every day. Try using the glitch explorer to attack different ''Glitch Width''s, ''Glitch Offset''s, and ''Ext Trigger Offset''s. The built-in Glitch Explorer will be very useful here - take a read through [[Tutorial A2 Introduction to Glitch Attacks (including Glitch Explorer)]] if you need a refresher.

=== Arm Results ===
If you added the same number of newlines as shown (5), you should be able to run the script at the end of this page from inside ChipWhisperer. Then, open the serial terminal and connect to the target, using the ASCII with Hex display mode. If everything is set up correctly, the Capture 1 button should cause the text <code>r0</code> to appear in the terminal. This is the bootloader's response to a block of ciphertext.

Once this is set up, connect the glitch module's output to the target's clock. Do this by changing the <code>Target HS IO-Out</code> to <code>Glitch Module</code>. Try to Capture 1 again and watch the serial terminal. If you're lucky, a very large amount of text will appear in this window:

[[File:MILK.PNG|frameless|994x994px]]

Near the beginning of this output, the plaintext is clearly visible! The data buffer has successfully been printed to the serial port, allowing us to see the decrypted text with no knowledge of the algorithm.

This might take a few attempts. If the target stops responding, you'll need to reset it manually. You may also want to setup a module to iterate over some glitch parameters (ext_offset should be the most important one) like in [[Tutorial A2 Introduction to Glitch Attacks (including Glitch Explorer)]].

== Ideas ==
There's a lot more that can be done with this type of attack...

=== Safer Assembly Code ===
You may have been surprised to see that the assembly code uses a <code>brne</code> instruction to check if the loop is finished - after all, we used a less-than comparison in our C source code! Try changing this line to use a more prohibitive loop. Here's how you might do this:
# Find a copy of the [http://www.atmel.com/webdoc/avrassembler/index.html AVR assembler documentation] and find a better instruction to use. You should be able to drop in the <code>brlt</code> instruction without much hassle. Figure out the new op-code for this instruction.
# Open the <code>bootloader.hex</code> file and find the instruction you want to change. Swap in your new op-code. Note that each line of the hex file has a checksum at the end, so you'll need to [http://www.planetimming.com/checksum8.html calculate an updated checksum].
# Upload your new bootloader onto the target and retry the attack. Does it still work? You might be able to see one extra byte from the ASCII buffer, but it will be very difficult to get to the data buffer. Can you change the glitch settings to complete the attack?

=== Volatile Variables ===
The reason why the original assembly code used the <code>brne</code> instruction is because GCC is an ''optimizing compiler''. The compiler doesn't directly translate the C source code into assembly instructions. Instead, it tries to determine if any of the code can be modified to make it faster or more compact. For instance, consider the loop
<pre>
for(int i = 0; i < 10; i++)
{
if(i < 20)
printf("%s", "Less");
else
printf("%s", "Greater");
}
</pre>
If you take a careful look at this code, you'll notice that the following loop will produce the same output:
<pre>
for(int i = 0; i < 10; i++)
{
printf("%s", "Less");
}
</pre>
However, this second loop is smaller (less code) and faster (no conditional jumps). This is the kind of optimization a compiler can make.

There are several ways we can stop the compiler from making some of these assumptions. One of these methods uses volatile variables, which look like
<pre>
volatile int i;
</pre>
A volatile variable is one that could change at any time. There could be many reasons why the value might change on us:
* Another thread might have access to the same memory location
* Another part of the computer might be able to change the variable's value (example: direct memory access)
* The variable might not actually be stored anywhere - it could be a read-only register in an embedded system
In any case, the <code>volatile</code> keyword tells the compiler to make no guarantees about this variable.

Try changing the bootloader's source code to use a volatile variable inside the loop. What happens to the disassembly? Is the loop body longer? Connect to the target board and capture a power trace. Does it look different? You'll have to find a new ''Ext Trigger Offset'' for the glitch module. Can you still perform the attack? Is it feasible to use this fix to avoid glitching attacks?

== Appendix: Setup Script ==
The following script is used to set up the ChipWhisperer-Lite with all of the necessary settings:
<pre>
# GUI compatibility
try:
scope = self.scope
target = self.target
except NameError:
pass

scope.glitch.clk_src = 'clkgen'
scope.glitch.ext_offset = 68
scope.glitch.width = 3.0
scope.glitch.offset = -5.0
scope.glitch.trigger_src = "ext_single"

scope.gain.gain = 45
scope.adc.samples = 500
scope.adc.offset = 0
scope.adc.basic_mode = "rising_edge"
scope.clock.clkgen_freq = 7370000
scope.clock.adc_src = "clkgen_x4"
scope.trigger.triggers = "tio4"
scope.io.tio1 = "serial_rx"
scope.io.tio2 = "serial_tx"
scope.io.hs2 = "glitch"

target.go_cmd = "p516261276720736265747267206762206f686c207a76797821\\n"
target.key_cmd = ""
target.output_cmd = ""
</pre>

== Appendix: Arm Setup Script ==
<pre>
# GUI compatibility
try:
scope = self.scope
target = self.target
except NameError:
pass

scope.glitch.clk_src = 'clkgen'
scope.glitch.ext_offset = 16993
scope.glitch.width = -10
scope.glitch.offset = -40
scope.glitch.repeat = 2
scope.glitch.trigger_src = "ext_single"

scope.gain.gain = 45
scope.adc.samples = 500
scope.adc.offset = 0
scope.adc.basic_mode = "rising_edge"
scope.clock.clkgen_freq = 7370000
scope.clock.adc_src = "clkgen_x4"
scope.trigger.triggers = "tio4"
scope.io.tio1 = "serial_rx"
scope.io.tio2 = "serial_tx"
scope.io.hs2 = "glitch"

target.go_cmd = "p516261276720736265747267206762206f686c207a76797821\\n"
target.key_cmd = ""
target.output_cmd = ""
</pre>

== Links ==

{{Template:Tutorials}}
[[Category:Tutorials]]
Approved_users, administrator
366
edits