As of August 2020 the site you are on (wiki.newae.com) is deprecated, and content is now at rtfm.newae.com. |
Difference between revisions of "V4:Tutorial A7 Glitch Buffer Attacks"
(Created page with "{{Warningbox|This tutorial has been updated for ChipWhisperer 4.0.0 release. If you are using 3.x.x see the "V3" link in the sidebar.}} {{Infobox tutorial |name...") |
(No difference)
|
Latest revision as of 05:37, 29 July 2019
This tutorial has been updated for ChipWhisperer 4.0.0 release. If you are using 3.x.x see the "V3" link in the sidebar.
A7: Glitch Buffer Attacks | |
---|---|
Target Architecture | XMEGA/Arm |
Hardware Crypto | No |
Software Release | V3 / V4 / V5 |
This tutorial discusses a specific type of glitch attack. It shows how a simple printing loop can be abused, causing a target to print some otherwise private information. This attack will be used to recover a plaintext without any knowledge of the encryption scheme being used.
Contents
Background
This section introduces the attack concept by showing some real world examples of vulnerable firmware. Then, it describes the victim firmware that will be used in this tutorial.
Real Firmware
Typically, one of the slowest parts of an embedded system is its communication lines. It's pretty common to see a processor running in the MHz range with a serial connection of 96k baud. To make these two different speeds work together, embedded firmware usually fills up a buffer with data and lets a serial driver print on its own time. This setup means we can expect to see code like
for(int i = 0; i < number_of_bytes_to_print; i++) { print_one_byte_to_serial(buffer[i]); }
This is a pretty vulnerable piece of C. Imagine that we could sneak into the source code and change it to
for(int i = 0; i < really_big_number; i++) { print_one_byte_to_serial(buffer[i]); }
C compilers don't care that buffer[]
has a limited size - this loop will happily print every byte it comes across, which could include other variables, registers, and even source code. Although we probably don't have a good way of changing the source code on the fly, we do have glitches: a well-timed clock or power glitch could let us skip the i < number_of_bytes_to_print
check, which would have the same result.
How could this be applied? Imagine that we have an encrypted firmware image that we're going to transmit to a bootloader. A typical communication process might look like:
- We send the encrypted image ciphertexts over a serial connection
- The bootloader decrypts the ciphertexts and stores the result somewhere in memory
- The bootloader sends back a response over the serial port
We have a pretty straightforward attack for this type of bootloader. During the last step, we'll apply a glitch at precisely the right time, causing the bootloader to print all kinds of things to the serial connection. With some luck, we'll be able to find the decrypted plaintext somewhere in this memory dump.
Bootloader Setup
For this tutorial, a very simple bootloader using the SimpleSerial protocol has been set up. The source for this bootloader can be found in chipwhisperer/hardware/victims/firmware/bootloader-glitch
. The following commands are used:
-
pABCD\n
: Send an encrypted ciphertext to the bootloader. For example, this message is made up of the two bytesAB
andCD
. -
r0\n
: The reply from the bootloader. Acknowledges that a message was received. No other responses are used. -
x
: Clear the bootloader's received buffer. -
k
: Seex
.
The bootloader uses triple-ROT-13 encryption to encrypt/decrypt the messages. To help you send messages to the target, the script private/encrypt.py
prints the SimpleSerial command for a given fixed string. For example, the ciphertext for the string Don't forget to buy milk!
is
p516261276720736265747267206762206f686c207a76797821\n
This folder also contains a Makefile to create a hex file for use with the ChipWhisperer hardware. The build process is the same as the previous tutorials: run make
from the command line and make sure that everything built properly. If all goes well, the Makefile should print something like
---------------- Device: atxmega128d3 Program: 1706 bytes (1.2% Full) (.text + .data + .bootloader) Data: 248 bytes (3.0% Full) (.data + .bss + .noinit) Built for platform CW-Lite XMEGA -------- end --------
The Attack Plan
Since we have access to the source code, let's take our time and understand how our attack is going to work before we dive in.
The Sensitive Code
Inside bootloader.c
, there are two buffers that are used to store most of the important data. The source code shows:
#define DATA_BUFLEN 40 #define ASCII_BUFLEN (2 * DATA_BUFLEN) uint8_t ascii_buffer[ASCII_BUFLEN]; uint8_t data_buffer[DATA_BUFLEN];
This tells us that there will be two arrays stored somewhere in the target's memory. The AVR-GCC compiler doesn't usually try too hard to move these around, so we can expect to find them back-to-back in memory; that is, if we can read past the end of the ASCII buffer, we'll probably find the data buffer.
Next, the code used to print a response to the serial port is
if(state == RESPOND) { // Send the ascii buffer back trigger_high(); int i; for(i = 0; i < ascii_idx; i++) { putch(ascii_buffer[i]); } trigger_low(); state = IDLE; }
This looks very similar to the example code given in the previous section, so it should be vulnerable to a glitching attack. The goal is to cause the loop to continue past its regular limit: data_buffer[0]
is the same as ascii_buffer[80]
, so a successful glitch should dump the data buffer for us.
Disassembly
As a final step, let's check the assembly code to see exactly what we're trying to glitch through. In the same folder as the hex file you built, open the *.lss
file that corresponds to your target (for example, bootloader-CWLITEARM.lss
). This is called a listing file, and it contains a bunch of debug and assembly information. Most importantly, it will allow us to easily match the source code to what's in our hex file. Search the file for something close to the vulnerable loop, such as state == RESPOND
.
XMEGA Disassembly
After searching the file, you should see something like this:376: 89 91 ld r24, Y+ 378: 0e 94 06 02 call 0x40c ; 0x40c 37c: f0 e2 ldi r31, 0x20 ; 32 37e: cf 37 cpi r28, 0x7F ; 127 380: df 07 cpc r29, r31 382: c9 f7 brne .-14 ; 0x376
This is our printing loop in assembly. It has the following steps in it:
- Look at the address
Y
and put the contents intor24
. Increase the address stored inY
. (This is thei++
in the loop.) - Call the function in location
0x40c
. Presumably, this is the location of theputch()
function. - Compare
r28
andr29
to0x7F
and0x20
. Unless they're equal, go back to the top of the loop.
There's one quirk to notice in this code. In the C source, the for loop checks whether i < ascii_idx
. However, in the assembly code, the check is effectively whether i == ascii_idx
! This is even easier to glitch - as long as we can break past the brne
instruction once, we'll get to the data buffer.
Arm Disassembly
After searching the file, you should find:if(state == RESPOND)
{
// Send the ascii buffer back
trigger_high();
8000258: f000 f8da bl 8000410 <trigger_high>
int i;
for(i = 0; i < ascii_idx; i++)
{
putch(ascii_buffer[i]);
800025c: 7828 ldrb r0, [r5, #0]
800025e: f000 f8f7 bl 8000450 <putch>
8000262: 7868 ldrb r0, [r5, #1]
8000264: f000 f8f4 bl 8000450 <putch>
8000268: 78a8 ldrb r0, [r5, #2]
800026a: f000 f8f1 bl 8000450 <putch>
}
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
ascii_buffer[ascii_idx++] = '\n';
if(state == RESPOND)
{
// Send the ascii buffer back
trigger_high();
8000262: f000 f8d7 bl 8000414 <trigger_high>
int i;
for(i = 0; i < ascii_idx; i++)
8000266: 2400 movs r4, #0
{
putch(ascii_buffer[i]);
8000268: 5d28 ldrb r0, [r5, r4]
for(i = 0; i < ascii_idx; i++)
800026a: 3401 adds r4, #1
putch(ascii_buffer[i]);
800026c: f000 f8f2 bl 8000454 <putch>
for(i = 0; i < ascii_idx; i++)
8000270: 2c08 cmp r4, #8
8000272: d1f9 bne.n 8000268 <main+0x64>
- Load the character we want to print into
r0
(r5
contains the address ofascii_buffer
, whiler4
is our loop indexi
) - Add 1 to
r4
(i
) - Call
putch
- Compare
r4
to 8 (8 is always the value ofascii_idx
) - Branch back to the beginning of the loop if
r4
and 8 aren't equal
There's one quirk to notice in this code. In the C source, the for loop checks whether i < ascii_idx
. However, in the assembly code, the check is effectively whether i == ascii_idx
! This is even easier to glitch - as long as we can break past the brne
instruction once, we'll get to the data buffer.
Attack Script & Results
XMEGA Results
To speed up the tutorial, the script in #Appendix: Setup Script will open the ChipWhisperer Capture software and fill in all of the appropriate settings. Copy this code into a Python script and run it. Then, open the serial terminal and connect to the target, using the ASCII with Hex display mode. If everything is set up correctly, the Capture 1 button should cause the text r0
to appear in the terminal. This is the bootloader's response to a block of ciphertext.
Once this is set up, connect the glitch module's output to the target's clock. Do this by changing the Target HS IO-Out
to Glitch Module
. Try to Capture 1 again and watch the serial terminal. If you're lucky, a large amount of text will appear in this window:
r0 261276720736265747267206762206f686c207a767978210000000000000000000000000000000000000000000000 00000000000000Don't forget to buy milk!000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 ... <many more lines omitted>
In the middle of this output, the plaintext is clearly visible! The data buffer has successfully been printed to the serial port, allowing us to see the decrypted text with no knowledge of the algorithm.
If you can't get this to work, remember that glitching is a very sensitive operation - one glitch timing will probably not work for every board on every day. Try using the glitch explorer to attack different Glitch Widths, Glitch Offsets, and Ext Trigger Offsets. The built-in Glitch Explorer will be very useful here - take a read through Tutorial A2 Introduction to Glitch Attacks (including Glitch Explorer) if you need a refresher.
Arm Results
If you added the same number of newlines as shown (5), you should be able to run the script at the end of this page from inside ChipWhisperer. Then, open the serial terminal and connect to the target, using the ASCII with Hex display mode. If everything is set up correctly, the Capture 1 button should cause the text r0
to appear in the terminal. This is the bootloader's response to a block of ciphertext.
Once this is set up, connect the glitch module's output to the target's clock. Do this by changing the Target HS IO-Out
to Glitch Module
. Try to Capture 1 again and watch the serial terminal. If you're lucky, a very large amount of text will appear in this window:
Near the beginning of this output, the plaintext is clearly visible! The data buffer has successfully been printed to the serial port, allowing us to see the decrypted text with no knowledge of the algorithm.
This might take a few attempts. If the target stops responding, you'll need to reset it manually. You may also want to setup a module to iterate over some glitch parameters (ext_offset should be the most important one) like in Tutorial A2 Introduction to Glitch Attacks (including Glitch Explorer).
Ideas
There's a lot more that can be done with this type of attack...
Safer Assembly Code
You may have been surprised to see that the assembly code uses a brne
instruction to check if the loop is finished - after all, we used a less-than comparison in our C source code! Try changing this line to use a more prohibitive loop. Here's how you might do this:
- Find a copy of the AVR assembler documentation and find a better instruction to use. You should be able to drop in the
brlt
instruction without much hassle. Figure out the new op-code for this instruction. - Open the
bootloader.hex
file and find the instruction you want to change. Swap in your new op-code. Note that each line of the hex file has a checksum at the end, so you'll need to calculate an updated checksum. - Upload your new bootloader onto the target and retry the attack. Does it still work? You might be able to see one extra byte from the ASCII buffer, but it will be very difficult to get to the data buffer. Can you change the glitch settings to complete the attack?
Volatile Variables
The reason why the original assembly code used the brne
instruction is because GCC is an optimizing compiler. The compiler doesn't directly translate the C source code into assembly instructions. Instead, it tries to determine if any of the code can be modified to make it faster or more compact. For instance, consider the loop
for(int i = 0; i < 10; i++) { if(i < 20) printf("%s", "Less"); else printf("%s", "Greater"); }
If you take a careful look at this code, you'll notice that the following loop will produce the same output:
for(int i = 0; i < 10; i++) { printf("%s", "Less"); }
However, this second loop is smaller (less code) and faster (no conditional jumps). This is the kind of optimization a compiler can make.
There are several ways we can stop the compiler from making some of these assumptions. One of these methods uses volatile variables, which look like
volatile int i;
A volatile variable is one that could change at any time. There could be many reasons why the value might change on us:
- Another thread might have access to the same memory location
- Another part of the computer might be able to change the variable's value (example: direct memory access)
- The variable might not actually be stored anywhere - it could be a read-only register in an embedded system
In any case, the volatile
keyword tells the compiler to make no guarantees about this variable.
Try changing the bootloader's source code to use a volatile variable inside the loop. What happens to the disassembly? Is the loop body longer? Connect to the target board and capture a power trace. Does it look different? You'll have to find a new Ext Trigger Offset for the glitch module. Can you still perform the attack? Is it feasible to use this fix to avoid glitching attacks?
Appendix: Setup Script
The following script is used to set up the ChipWhisperer-Lite with all of the necessary settings:
# GUI compatibility try: scope = self.scope target = self.target except NameError: pass scope.glitch.clk_src = 'clkgen' scope.glitch.ext_offset = 68 scope.glitch.width = 3.0 scope.glitch.offset = -5.0 scope.glitch.trigger_src = "ext_single" scope.gain.gain = 45 scope.adc.samples = 500 scope.adc.offset = 0 scope.adc.basic_mode = "rising_edge" scope.clock.clkgen_freq = 7370000 scope.clock.adc_src = "clkgen_x4" scope.trigger.triggers = "tio4" scope.io.tio1 = "serial_rx" scope.io.tio2 = "serial_tx" scope.io.hs2 = "glitch" target.go_cmd = "p516261276720736265747267206762206f686c207a76797821\\n" target.key_cmd = "" target.output_cmd = ""
Appendix: Arm Setup Script
# GUI compatibility try: scope = self.scope target = self.target except NameError: pass scope.glitch.clk_src = 'clkgen' scope.glitch.ext_offset = 16993 scope.glitch.width = -10 scope.glitch.offset = -40 scope.glitch.repeat = 2 scope.glitch.trigger_src = "ext_single" scope.gain.gain = 45 scope.adc.samples = 500 scope.adc.offset = 0 scope.adc.basic_mode = "rising_edge" scope.clock.clkgen_freq = 7370000 scope.clock.adc_src = "clkgen_x4" scope.trigger.triggers = "tio4" scope.io.tio1 = "serial_rx" scope.io.tio2 = "serial_tx" scope.io.hs2 = "glitch" target.go_cmd = "p516261276720736265747267206762206f686c207a76797821\\n" target.key_cmd = "" target.output_cmd = ""