03:05, 6 November 2017

This tutorial discusses a specific type of glitch attack. It shows how a simple printing loop can be abused, causing a target to print some otherwise private information. This attack will be used to recover a plaintext without any knowledge of the encryption scheme being used.
= Background =
This section introduces the attack concept by showing some real world examples of vulnerable firmware. Then, it describes the victim firmware that will be used in this tutorial.
== Real Firmware ==
Typically, one of the slowest parts of an embedded system is its communication lines. It's pretty common to see a processor running in the MHz range with a serial connection running at only 9600 baud. To make these two different speeds work together, embedded firmware usually fills up a buffer with data and lets a serial driver print it in its own time. This setup means we can expect to see code like
<pre>
for(int i = 0; i < number_of_bytes_to_print; i++)
{
    print_one_byte_to_serial(buffer[i]);
}
</pre>
This is a pretty vulnerable piece of C. Imagine that we could sneak into the source code and change it to
<pre>
for(int i = 0; i < really_big_number; i++)
{
    print_one_byte_to_serial(buffer[i]);
}
</pre>
C performs no bounds checking, so the compiler doesn't care that <code>buffer[]</code> has a limited size - this loop will happily print every byte it comes across, which could include other variables, registers, and even program code. Although we probably don't have a good way of changing the source code on the fly, we do have glitches: a well-timed clock or power glitch could let us skip the <code>i < number_of_bytes_to_print</code> check, which would have the same result.
How could this be applied? Imagine that we have an encrypted firmware image that we're going to transmit to a bootloader. A typical communication process might look like:
# We send the encrypted image ciphertexts over a serial connection
# The bootloader decrypts the ciphertexts and stores the result somewhere in memory
# The bootloader sends back a response over the serial port
We have a pretty straightforward attack for this type of bootloader. During the last step, we'll apply a glitch at precisely the right time, causing the bootloader to print all kinds of things to the serial connection. With some luck, we'll be able to find the decrypted plaintext somewhere in this memory dump.
== Bootloader Setup ==
For this tutorial, a very simple bootloader using the SimpleSerial protocol has been set up. The source for this bootloader can be found in <code>chipwhisperer/hardware/victims/firmware/bootloader-glitch</code>. The following commands are used:
* <code>pABCD\n</code>: Send an encrypted ciphertext to the bootloader. For example, this message is made up of the two bytes <code>AB</code> and <code>CD</code>.
* <code>r0\n</code>: The reply from the bootloader. Acknowledges that a message was received. No other responses are used.
* <code>x</code>: Clear the bootloader's received buffer.
* <code>k</code>: See <code>x</code>.
The bootloader uses triple-ROT-13 encryption to encrypt/decrypt the messages. To help you send messages to the target, the script <code>private/encrypt.py</code> prints the SimpleSerial command for a given fixed string. For example, the ciphertext for the string <code>Don't forget to buy milk!</code> is
<pre>
p516261276720736265747267206762206f686c207a76797821\n
</pre>
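The command above can be reproduced by hand. The following is a minimal Python 3 sketch, not the contents of <code>private/encrypt.py</code>; the function name <code>build_command</code> is made up for this example. It relies on the fact that applying ROT-13 three times gives the same result as applying it once, and it assumes that each ciphertext byte is sent as two lowercase hex characters, as in the example command.

```python
import codecs


def build_command(plaintext):
    """Return the SimpleSerial 'p' command for an ASCII plaintext string.

    Hypothetical helper for illustration only.
    """
    # ROT-13 is its own inverse, so triple-ROT-13 equals a single ROT-13.
    ciphertext = codecs.encode(plaintext, "rot_13")
    # Each ciphertext byte becomes two lowercase hex characters.
    hex_payload = ciphertext.encode("ascii").hex()
    return "p" + hex_payload + "\n"


if __name__ == "__main__":
    # repr() makes the trailing newline visible.
    print(repr(build_command("Don't forget to buy milk!")))
```

For the milk reminder, this should produce the same string as the command shown above.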
This folder also contains a Makefile to create a hex file for use with the ChipWhisperer hardware. The build process is the same as the previous tutorials: run <code>make</code> from the command line and make sure that everything built properly. If all goes well, the Makefile should print something like
<pre>
----------------
Device: atxmega128d3
Program: 1706 bytes (1.2% Full)
(.text + .data + .bootloader)
Data: 248 bytes (3.0% Full)
(.data + .bss + .noinit)
Built for platform CW-Lite XMEGA
-------- end --------
</pre>
= The Attack Plan =
Since we have access to the source code, let's take our time and understand how our attack is going to work before we dive in.
== The Sensitive Code ==
Inside <code>bootloader.c</code>, there are two buffers that are used to store most of the important data. The source code shows:
<pre>
#define DATA_BUFLEN 40
#define ASCII_BUFLEN (2 * DATA_BUFLEN)
uint8_t ascii_buffer[ASCII_BUFLEN];
uint8_t data_buffer[DATA_BUFLEN];
</pre>
This tells us that there will be two arrays stored somewhere in the target's memory. The AVR-GCC compiler doesn't usually try too hard to move these around, so we can expect to find them back-to-back in memory; that is, if we can read past the end of the ASCII buffer, we'll probably find the data buffer.
Next, the code used to print a response to the serial port is
<pre>
if(state == RESPOND)
{
    // Send the ascii buffer back
    trigger_high();
    int i;
    for(i = 0; i < ascii_idx; i++)
    {
        putch(ascii_buffer[i]);
    }
    trigger_low();
    state = IDLE;
}
</pre>
This looks very similar to the example code given in the previous section, so it should be vulnerable to a glitching attack. The goal is to cause the loop to continue past its regular limit: <code>data_buffer[0]</code> is the same as <code>ascii_buffer[80]</code>, so a successful glitch should dump the data buffer for us.
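To make the memory layout concrete, here is a small Python 3 model of the two buffers. It is only a sketch: it assumes the buffers really are adjacent, it models the glitch as the loop bound simply being ignored, and the helper names <code>build_memory</code> and <code>respond</code> are invented for this example.

```python
DATA_BUFLEN = 40
ASCII_BUFLEN = 2 * DATA_BUFLEN


def build_memory(ascii_contents, data_contents):
    """Lay out ascii_buffer followed directly by data_buffer (assumed adjacency)."""
    memory = bytearray(ASCII_BUFLEN + DATA_BUFLEN)
    memory[0:len(ascii_contents)] = ascii_contents
    memory[ASCII_BUFLEN:ASCII_BUFLEN + len(data_contents)] = data_contents
    return memory


def respond(memory, ascii_idx, glitched=False):
    """Model of the response loop. A glitch is modeled as the bound being ignored."""
    out = bytearray()
    i = 0
    while i < len(memory):  # the model stops at the end of our fake memory
        if not glitched and i >= ascii_idx:
            break
        out.append(memory[i])
        i += 1
    return bytes(out)


if __name__ == "__main__":
    memory = build_memory(b"r0\n", b"Don't forget to buy milk!")
    print(respond(memory, ascii_idx=3))                 # normal reply only
    print(respond(memory, ascii_idx=3, glitched=True))  # reply plus leaked data
```

In this model, index 80 of the flat memory is the first byte of the data buffer, which is exactly the aliasing the attack depends on.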
== Disassembly ==
As a final step, let's check the assembly code to see exactly what we're trying to glitch through. Run the command
<pre>
avr-objdump -m avr -D bootloader.hex > disassembly.txt
</pre>
and open <code>disassembly.txt</code>. If you know what to look for, you should find a snippet that looks something like:
<pre>
376: 89 91 ld r24, Y+
378: 0e 94 06 02 call 0x40c ; 0x40c
37c: f0 e2 ldi r31, 0x20 ; 32
37e: cf 37 cpi r28, 0x7F ; 127
380: df 07 cpc r29, r31
382: c9 f7 brne .-14 ; 0x376
</pre>
This is our printing loop in assembly. It has the following steps in it:
* Look at the address <code>Y</code> and put the contents into <code>r24</code>. Increase the address stored in <code>Y</code>. (This is the <code>i++</code> in the loop.)
* Call the function in location <code>0x40c</code>. Presumably, this is the location of the <code>putch()</code> function.
* Compare <code>r28</code> and <code>r29</code> to <code>0x7F</code> and <code>0x20</code>. Unless they're equal, go back to the top of the loop.
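There's one quirk to notice in this code. In the C source, the for loop checks whether <code>i < ascii_idx</code>. However, in the assembly code, the loop only stops when the pointer is exactly ''equal'' to the end address (effectively <code>i == ascii_idx</code>)! This is even easier to glitch: once the pointer has moved past the end address, the equality test never succeeds again, so the loop keeps printing. As long as we can get the loop past its final <code>brne</code> check ''once'', we'll get to the data buffer.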
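The difference between the two kinds of exit test can be sketched in Python 3. This is a simplified model, not a simulation of the AVR core: the end address <code>0x207F</code> comes from the disassembly, but the start address assumes a three-byte response, the glitch is modeled as one comparison giving the wrong answer, and the less-than test ignores signed arithmetic. The name <code>bytes_printed</code> is made up for this example.

```python
END = 0x207F       # end address from the cpi/cpc pair in the disassembly
START = END - 3    # assumption: a three-byte response such as "r0\n"


def bytes_printed(should_stop, glitch_at=None, wrap=0x10000):
    """Count how many bytes the loop prints before it exits.

    should_stop(y) is the loop's exit test on the Y pointer.
    glitch_at is the iteration whose exit test is forced to "keep going".
    """
    y = START
    count = 0
    while True:
        y = (y + 1) % wrap      # ld r24, Y+ (16-bit pointer, post-increment)
        count += 1              # call putch
        stop = should_stop(y)
        if count == glitch_at:
            stop = False        # the glitch corrupts this one comparison
        if stop:
            return count


def equality_exit(y):
    return y == END             # brne: keep looping while Y != END


def less_than_exit(y):
    return y >= END             # less-than check: keep looping while Y < END


if __name__ == "__main__":
    print(bytes_printed(equality_exit))                # 3
    print(bytes_printed(equality_exit, glitch_at=3))   # 65539
    print(bytes_printed(less_than_exit, glitch_at=3))  # 4
```

In this model, one bad comparison with the equality test keeps the loop running until the 16-bit pointer wraps all the way around, while the same fault with a less-than test leaks a single extra byte.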
= Attack Script & Results =
To speed up the tutorial, the script in [[#Appendix: Setup Script]] will open the ChipWhisperer Capture software and fill in all of the appropriate settings. Copy this code into a Python script and run it. Then, open the serial terminal and connect to the target, using the ASCII with Hex display mode. If everything is set up correctly, the Capture 1 button should cause the text <code>r0</code> to appear in the terminal. This is the bootloader's response to a block of ciphertext.
Once this is set up, connect the glitch module's output to the target's clock. Do this by changing the <code>Target HS IO-Out</code> setting to <code>Glitch Module</code>. Press Capture 1 again and watch the serial terminal. If you're lucky, a large amount of text will appear in this window:
<pre>
r0
261276720736265747267206762206f686c207a767978210000000000000000000000000000000000000000000000
00000000000000Don't forget to buy milk!000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
...
<many more lines omitted>
</pre>
In the middle of this output, the plaintext is clearly visible! The data buffer has successfully been printed to the serial port, allowing us to see the decrypted text with no knowledge of the algorithm.
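In a longer dump, the interesting text may be harder to spot by eye. One possible way to search for it, sketched here in Python 3, is to pull out runs of printable ASCII characters. The function name <code>printable_runs</code> and the minimum run length are arbitrary choices for this example, and the sample dump is reconstructed from the first two lines of the output above.

```python
def printable_runs(dump, min_len=8):
    """Return every run of at least min_len printable ASCII bytes in dump."""
    runs = []
    current = bytearray()
    for byte in dump:
        if 0x20 <= byte <= 0x7E:
            current.append(byte)
        else:
            if len(current) >= min_len:
                runs.append(current.decode("ascii"))
            current = bytearray()
    if len(current) >= min_len:
        runs.append(current.decode("ascii"))
    return runs


if __name__ == "__main__":
    # Reconstruction of the start of the dump: the reply, the rest of the
    # ASCII buffer, zero padding, then the data buffer.
    dump = (b"r0\n"
            + b"261276720736265747267206762206f686c207a76797821"
            + bytes(30)
            + b"Don't forget to buy milk!"
            + bytes(15))
    for run in printable_runs(dump):
        print(run)
```

Note that the leftover hex characters in the ASCII buffer are printable too, so they show up as a run alongside the plaintext.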
If you can't get this to work, remember that glitching is a very sensitive operation - one glitch timing will probably not work for every board on every day. Try different ''Glitch Width''s, ''Glitch Offset''s, and ''Ext Trigger Offset''s. The built-in Glitch Explorer will be very useful for this search - take a read through [[Tutorial A2 Introduction to Glitch Attacks (including Glitch Explorer)]] if you need a refresher.
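To get a feel for the size of this search, the Python 3 sketch below enumerates a grid of candidate settings in the same list form that the setup script passes to <code>setParameter</code>. It does not talk to the hardware and is not a replacement for the Glitch Explorer. The ranges are assumptions picked around the values in the setup script, and the helper names <code>frange</code> and <code>glitch_candidates</code> are invented for this example.

```python
import itertools


def frange(start, stop, step):
    """Inclusive list of floats from start to stop, rounded to one decimal."""
    n = int(round((stop - start) / step))
    return [round(start + k * step, 1) for k in range(n + 1)]


def glitch_candidates(widths, offsets, ext_offsets):
    """Yield one list of parameter settings per point on the search grid."""
    for width, offset, ext in itertools.product(widths, offsets, ext_offsets):
        yield [
            ['Glitch Module', 'Glitch Width (as % of period)', width],
            ['Glitch Module', 'Glitch Offset (as % of period)', offset],
            ['Glitch Module', 'Ext Trigger Offset', ext],
        ]


if __name__ == "__main__":
    # Assumed search ranges; adjust them for your own board.
    widths = frange(1.0, 5.0, 1.0)
    offsets = frange(-10.0, 0.0, 5.0)
    ext_offsets = range(60, 76)
    candidates = list(glitch_candidates(widths, offsets, ext_offsets))
    print(len(candidates), "settings to try")
    print(candidates[0])
```

Even these small ranges produce a few hundred combinations, which is why an automated search is worth setting up.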
= Ideas =
There's a lot more that can be done with this type of attack...
== Safer Assembly Code ==
You may have been surprised to see that the assembly code uses a <code>brne</code> instruction to check if the loop is finished - after all, we used a less-than comparison in our C source code! Try changing this instruction so that the loop condition is stricter. Here's how you might do this:
# Find a copy of the [http://www.atmel.com/webdoc/avrassembler/index.html AVR assembler documentation] and find a better instruction to use. You should be able to drop in the <code>brlt</code> instruction without much hassle. Figure out the new op-code for this instruction.
# Open the <code>bootloader.hex</code> file and find the instruction you want to change. Swap in your new op-code. Note that each line of the hex file has a checksum at the end, so you'll need to [http://www.planetimming.com/checksum8.html calculate an updated checksum].
# Upload your new bootloader onto the target and retry the attack. Does it still work? You might be able to see one extra byte from the ASCII buffer, but it will be very difficult to get to the data buffer. Can you change the glitch settings to complete the attack?
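The first two steps can be checked with a few lines of Python 3. This is a sketch under stated assumptions: the bit patterns for <code>brne</code> and <code>brlt</code> are written from the AVR instruction set encoding and should be confirmed against the documentation, the offset is given in bytes as <code>avr-objdump</code> prints it, and the function names are made up for this example. The checksum follows the usual Intel HEX rule: the two's complement of the sum of all other bytes in the record.

```python
# Base encodings with a zero offset (assumed from the AVR instruction set):
#   brne k: 1111 01kk kkkk k001    brlt k: 1111 00kk kkkk k100
BRANCH_BASE = {"brne": 0xF401, "brlt": 0xF004}


def encode_branch(mnemonic, byte_offset):
    """Return the two op-code bytes in the order they appear in the hex file."""
    k = byte_offset // 2            # the instruction stores the offset in words
    if not -64 <= k <= 63:
        raise ValueError("offset does not fit in 7 bits")
    word = BRANCH_BASE[mnemonic] | ((k & 0x7F) << 3)
    return bytes([word & 0xFF, word >> 8])   # low byte first


def ihex_checksum(record_hex):
    """Checksum for the hex digits between ':' and the checksum byte."""
    total = sum(bytes.fromhex(record_hex))
    return (-total) & 0xFF


if __name__ == "__main__":
    print(encode_branch("brne", -14).hex())   # should match "c9 f7" in the listing
    print(encode_branch("brlt", -14).hex())
    # Sanity check on the standard end-of-file record ":00000001FF".
    print("%02X" % ihex_checksum("00000001"))
```

Encoding the existing <code>brne .-14</code> first is a useful check: if the result matches the bytes in the disassembly, the same function can be trusted a little more for the replacement instruction.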
== Volatile Variables ==
The original assembly code used the <code>brne</code> instruction because GCC is an ''optimizing compiler''. The compiler doesn't directly translate the C source code into assembly instructions. Instead, it tries to determine if any of the code can be modified to make it faster or more compact. For instance, consider the loop
<pre>
for(int i = 0; i < 10; i++)
{
    if(i < 20)
        printf("%s", "Less");
    else
        printf("%s", "Greater");
}
</pre>
If you take a careful look at this code, you'll notice that the following loop will produce the same output:
<pre>
for(int i = 0; i < 10; i++)
{
    printf("%s", "Less");
}
</pre>
However, this second loop is smaller (less code) and faster (no conditional jumps). This is the kind of optimization a compiler can make.
There are several ways we can stop the compiler from making some of these assumptions. One of these methods uses volatile variables, which look like
<pre>
volatile int i;
</pre>
A volatile variable is one that could change at any time. There could be many reasons why the value might change on us:
* Another thread might have access to the same memory location
* Another part of the computer might be able to change the variable's value (example: direct memory access)
* The variable might not actually be stored anywhere - it could be a read-only register in an embedded system
In any case, the <code>volatile</code> keyword tells the compiler to make no assumptions about this variable's value: every read and write in the source code must actually be performed.
Try changing the bootloader's source code to use a volatile variable inside the loop. What happens to the disassembly? Is the loop body longer? Connect to the target board and capture a power trace. Does it look different? You'll have to find a new ''Ext Trigger Offset'' for the glitch module. Can you still perform the attack? Is it feasible to use this fix to avoid glitching attacks?
= Appendix: Setup Script =
The following script is used to set up the ChipWhisperer-Lite with all of the necessary settings:
<pre>
#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2013-2016, NewAE Technology Inc
# All rights reserved.
#
# Authors: Colin O'Flynn, Greg d'Eon
#
# Find this and more at newae.com - this file is part of the chipwhisperer
# project, http://www.assembla.com/spaces/chipwhisperer
#
# This file is part of chipwhisperer.
#
# chipwhisperer is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# chipwhisperer is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with chipwhisperer. If not, see <http://www.gnu.org/licenses/>.
#=================================================
import sys
import chipwhisperer.capture.ui.CWCaptureGUI as cwc
from chipwhisperer.common.api.CWCoreAPI import CWCoreAPI
from chipwhisperer.common.scripts.base import UserScriptBase
from chipwhisperer.common.utils.parameter import Parameter

# Check for PySide
try:
    from PySide.QtCore import *
    from PySide.QtGui import *
except ImportError:
    print "ERROR: PySide is required for this program"
    sys.exit()


class UserScript(UserScriptBase):
    def __init__(self, api):
        super(UserScript, self).__init__(api)

    def run(self):
        #User commands here
        print "***** Starting User Script *****"

        # Set up board and target
        self.api.setParameter(['Generic Settings', 'Scope Module', 'ChipWhisperer/OpenADC'])
        self.api.setParameter(['Generic Settings', 'Trace Format', 'ChipWhisperer/Native'])
        self.api.setParameter(['Generic Settings', 'Target Module', 'Simple Serial'])
        self.api.connect()

        # Fill in our other settings
        lstexample = [['OpenADC', 'Gain Setting', 'Mode', 'high'],
                      ['OpenADC', 'Gain Setting', 'Setting', 30],
                      ['OpenADC', 'Trigger Setup', 'Mode', 'rising edge'],
                      ['OpenADC', 'Trigger Setup', 'Total Samples', 500],
                      ['OpenADC', 'Trigger Setup', 'Offset', 0],
                      ['OpenADC', 'Clock Setup', 'CLKGEN Settings', 'Divide', 26],
                      ['OpenADC', 'Clock Setup', 'CLKGEN Settings', 'Multiply', 2],
                      ['OpenADC', 'Clock Setup', 'ADC Clock', 'Source', 'CLKGEN x4 via DCM'],
                      ['OpenADC', 'Clock Setup', 'ADC Clock', 'Reset ADC DCM', None],
                      ['CW Extra Settings', 'Target HS IO-Out', 'CLKGEN'],
                      ['CW Extra Settings', 'Target IOn Pins', 'Target IO2', 'Serial TXD'],
                      ['CW Extra Settings', 'Target IOn Pins', 'Target IO1', 'Serial RXD'],
                      ['Glitch Module', 'Clock Source', 'CLKGEN'],
                      ['Glitch Module', 'Glitch Width (as % of period)', 3.0],
                      ['Glitch Module', 'Glitch Offset (as % of period)', -5.0],
                      ['Glitch Module', 'Glitch Trigger', 'Ext Trigger:Single-Shot'],
                      ['Glitch Module', 'Ext Trigger Offset', 68],
                      ['Simple Serial', 'Go Command', u'p516261276720736265747267206762206f686c207a76797821\\n'],
                      ['Simple Serial', 'Output Format', u''],
                      ['Simple Serial', 'Load Key Command', u''],
                      ]
        # NOTE: For IV: offset = 70000

        #Download all hardware setup parameters
        for cmd in lstexample:
            self.api.setParameter(cmd)

        # Try a couple of captures
        self.api.capture1()

        print "***** Ending User Script *****"


if __name__ == '__main__':
    # Run the program
    app = cwc.makeApplication()
    Parameter.usePyQtGraph = True
    api = CWCoreAPI()
    gui = cwc.CWCaptureGUI(api)
    gui.show()

    # Run our program and let the GUI take over
    api.runScriptClass(UserScript)
    sys.exit(app.exec_())
</pre>
{{Template:Tutorials}}
[[Category:Tutorials]]