As of August 2020 the site you are on (wiki.newae.com) is deprecated, and content is now at rtfm.newae.com.

Difference between revisions of "Tutorial A5-Bonus Breaking AES-256 Bootloader"

From ChipWhisperer Wiki
This tutorial is an add-on to [[Tutorial A5 Breaking AES-256 Bootloader]]. It continues working on the same firmware, showing how to obtain the hidden IV and signature in the bootloader. '''It is not possible to do this bonus tutorial without first completing the regular tutorial''', so please finish Tutorial A5 first.

''This tutorial is under construction! Check back in a few days.''

= Background =
== AES in CBC Mode ==
* Repeat of theory from tutorial

== The IV ==
* Suggest some ideas

== The Signature ==
* Timing attack
* Show firmware

= Exploring the Bootloader =
 
In this tutorial, we have the luxury of seeing the source code of the bootloader. This is generally not something we would have access to in the real world, so we'll try not to use it to cheat. (Peeking at <code>supersecret.h</code> counts as cheating.) Instead, we'll use the source to help us identify important parts of the power traces.
 
  
== Bootloader Source Code ==
Inside the bootloader's main loop, it does three tasks that we're interested in:
* it decrypts the incoming ciphertext;
* it applies the IV to the decryption's result; and
* it checks for the signature in the resulting plaintext.
This snippet from <code>bootloader.c</code> shows all three of these tasks:
<pre>
// Continue with decryption
trigger_high();
aes256_decrypt_ecb(&ctx, tmp32);
trigger_low();

// Apply IV (first 16 bytes)
for (i = 0; i < 16; i++){
    tmp32[i] ^= iv[i];
}

//Save IV for next time from original ciphertext
for (i = 0; i < 16; i++){
    iv[i] = tmp32[i+16];
}

// Tell the user that the CRC check was okay
putch(COMM_OK);
putch(COMM_OK);

//Check the signature
if ((tmp32[0] == SIGNATURE1) &&
   (tmp32[1] == SIGNATURE2) &&
   (tmp32[2] == SIGNATURE3) &&
   (tmp32[3] == SIGNATURE4)){

   // Delay to emulate a write to flash memory
   _delay_ms(1);
</pre>
This gives us a pretty good idea of how the microcontroller is going to do its job. However, we can go one step further and find the exact assembly code that the target will execute. If you have Atmel Studio and its toolchain on your computer, you can get the assembly file from the command line with
<pre>
avr-objdump -m avr -D bootloader.hex > disassembly.txt
</pre>
This disassembles the hex file into assembly code, making it more human-readable. The important part of this assembly code is:
<pre>
344: d3 01      movw r26, r6
346: 93 01      movw r18, r6
348: f6 01      movw r30, r12
34a: 80 81      ld r24, Z
34c: f9 01      movw r30, r18
34e: 91 91      ld r25, Z+
350: 9f 01      movw r18, r30
352: 89 27      eor r24, r25
354: f6 01      movw r30, r12
356: 81 93      st Z+, r24
358: 6f 01      movw r12, r30
35a: ee 15      cp r30, r14
35c: ff 05      cpc r31, r15
35e: a1 f7      brne .-24     ;  0x348

360: fe 01      movw r30, r28
362: b1 96      adiw r30, 0x21 ; 33
364: 81 91      ld r24, Z+
366: 8d 93      st X+, r24
368: e4 15      cp r30, r4
36a: f5 05      cpc r31, r5
36c: d9 f7      brne .-10     ;  0x364

36e: 84 ea      ldi r24, 0xA4 ; 164
370: 0e 94 16 02 call 0x42c ;  0x42c
374: 84 ea      ldi r24, 0xA4 ; 164
376: 0e 94 16 02 call 0x42c ;  0x42c

37a: 89 89      ldd r24, Y+17 ; 0x11
37c: 88 23      and r24, r24
37e: 09 f0      breq .+2      ;  0x382
380: 98 cf      rjmp .-208    ;  0x2b2

382: 8a 89      ldd r24, Y+18 ; 0x12
384: 8b 3e      cpi r24, 0xEB ; 235
386: 09 f0      breq .+2      ;  0x38a
388: 94 cf      rjmp .-216    ;  0x2b2

38a: 8b 89      ldd r24, Y+19 ; 0x13
38c: 82 30      cpi r24, 0x02 ; 2
38e: 09 f0      breq .+2      ;  0x392
390: 90 cf      rjmp .-224    ;  0x2b2

392: 8c 89      ldd r24, Y+20 ; 0x14
394: 8d 31      cpi r24, 0x1D ; 29
396: 09 f0      breq .+2      ;  0x39a
398: 8c cf      rjmp .-232    ;  0x2b2

39a: 83 e3      ldi r24, 0x33 ; 51
39c: 97 e0      ldi r25, 0x07 ; 7
39e: 01 97      sbiw r24, 0x01 ; 1
3a0: f1 f7      brne .-4      ;  0x39e
3a2: 87 cf      rjmp .-242    ;  0x2b2
</pre>

We'll use both of the source files throughout the tutorial.

== Power Traces ==
After the bootloader has finished the decryption process, it executes four distinct pieces of code:
* To apply the IV, it uses an XOR operation;
* To store the new IV, it copies the previous ciphertext into the IV array;
* It sends two bytes on the serial port;
* It checks the bytes of the signature one by one.
We should be able to recognize these four parts of the code in the power traces. Let's modify our capture routine to find them.

Re-run the capture script and change a few settings:
<ol>
<li> We'd like to skip over all of the decryption process. The source code around this point is:
<pre>
trigger_high();
aes256_decrypt_ecb(&ctx, tmp32); /* encrypting the data block */
trigger_low();
</pre>
so we can skip straight over the AES-256 function by triggering on a falling edge instead of a rising edge. Change this in the scope settings.
<li> We don't need as many samples now. Change the number of samples to 3000.
<li> If we decrypt multiple ciphertexts in a row, only the first one will use the secret IV; all of the others will use the previous ciphertext instead. To avoid this, we'll have to automatically reset the board.
<ol>
<li> In the ''General Settings'' tab, change the Auxiliary Module to ''Reset AVR/XMEGA via CW-Lite''.
<li> In the ''Aux Settings'' tab, change both delays to around 100 ms.
</ol>
<li> Capture one trace and make sure that everything works.
</ol>
If everything worked out, you should be able to see all of the code's features:

[[File:Tutorial-A5-Bonus-Trace-Notes.PNG]]

With all of these things clearly visible, we have a pretty good idea of how to attack the IV and the signature. We should be able to look at each of the XOR spikes to find each of the IV bytes, because each byte is processed on its own. Then, the signature check uses a short-circuiting comparison: as soon as it finds a byte in error, it stops checking the remaining bytes. This type of check is susceptible to a timing attack.
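
To see why a short-circuiting check leaks timing, here is a minimal Python sketch that models the comparison. It is only an illustration: the function name <code>check_signature</code> and the signature bytes are made up for this example, and the comparison count stands in for execution time on the target.

```python
# Hypothetical signature bytes, chosen only for this illustration.
SIGNATURE = [0x11, 0x22, 0x33, 0x44]

def check_signature(candidate):
    """Short-circuiting check: return (ok, number of byte comparisons made)."""
    comparisons = 0
    for expected, got in zip(SIGNATURE, candidate):
        comparisons += 1
        if expected != got:
            # Stop at the first wrong byte, like the chained && in the C source
            return False, comparisons
    return True, comparisons

# The amount of work grows with the number of leading bytes that are correct
print(check_signature([0x00, 0x00, 0x00, 0x00]))
print(check_signature([0x11, 0x00, 0x00, 0x00]))
print(check_signature([0x11, 0x22, 0x33, 0x44]))
```

An attacker who can measure how long the check runs can therefore guess the signature one byte at a time instead of all four bytes at once.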

Let's grab a lot of traces so that we don't have to come back later. Save the project somewhere memorable, set up the capture routine to record 1000 traces, hit ''Capture Many'', and grab a coffee.

= Attacking the IV =
We need to find the IV before we can look at the signature, so the first half of the attack will look at the IV bytes.

== Attack Theory ==
The bootloader applies the IV to the AES decryption result by calculating

<math>
\text{PT} = \text{DR} \oplus \text{IV}
</math>

where DR is the decrypted ciphertext, IV is the secret vector, and PT is the plaintext that the bootloader will use later. We only have access to one of these: since we know the AES-256 key, we can calculate DR.
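
The XOR relation can be checked with a few lines of Python. This is a small sketch with made-up byte values; the helper name <code>apply_iv</code> is hypothetical and not part of the bootloader or ChipWhisperer.

```python
def apply_iv(dr, iv):
    """XOR a decryption result with the IV, byte by byte (PT = DR xor IV)."""
    return [d ^ v for d, v in zip(dr, iv)]

# Made-up values for illustration only
dr = [0b10110010, 0x5A]
iv = [0b00000001, 0xFF]
pt = apply_iv(dr, iv)

# A 1 bit in the IV flips the matching DR bit; a 0 bit leaves it alone
assert pt[0] == 0b10110011
assert pt[1] == 0xA5

# XOR is its own inverse, so knowing any two of PT, DR and IV gives the third
assert apply_iv(dr, pt) == iv
```

Each IV bit only decides whether the matching DR bit is passed through or inverted, and this is the property that the attack relies on.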

Specifically, the assembly code to calculate the plaintext is the loop
<pre>
344: d3 01      movw r26, r6
346: 93 01      movw r18, r6
348: f6 01      movw r30, r12
34a: 80 81      ld r24, Z
34c: f9 01      movw r30, r18
34e: 91 91      ld r25, Z+
350: 9f 01      movw r18, r30
352: 89 27      eor r24, r25
354: f6 01      movw r30, r12
356: 81 93      st Z+, r24
358: 6f 01      movw r12, r30
35a: ee 15      cp r30, r14
35c: ff 05      cpc r31, r15
35e: a1 f7      brne .-24     ;  0x348
</pre>
This code includes two <code>ld</code> instructions, one <code>eor</code>, and one <code>st</code>: the DR and IV are loaded and XORed to get PT, which is then stored back where DR was. All of these instructions should be visible in the power traces.

This is enough information for us to attack a single bit of the IV. Suppose we only wanted to get the first bit (number 0) of the IV. We could do the following:
* Split all of the traces into two groups: those with DR[0] = 0, and those with DR[0] = 1.
* Calculate the average trace for both groups.
* Find the difference between the two averages. It should include a noticeable spike during the first iteration of the loop.
* Look at the direction of the spike to decide if the IV bit is 0 (<code>PT[0] = DR[0]</code>) or if the IV bit is 1 (<code>PT[0] = ~DR[0]</code>).
This is effectively a DPA attack on a single bit of the IV. We can repeat this attack 128 times to recover the entire IV.
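
The steps above can be tried on synthetic data before touching real captures. The following sketch is a simulation under stated assumptions, not the tutorial's attack script: it assumes the only data-dependent leakage is the Hamming weight of one PT byte at a single sample, adds Gaussian noise everywhere, and uses made-up names (<code>simulate_traces</code>, <code>recover_bit</code>) and a made-up IV byte.

```python
import numpy as np

def hamming_weight(x):
    """Number of 1 bits in an integer."""
    return bin(x).count("1")

def simulate_traces(iv_byte, num_traces=2000, trace_len=50, leak_at=20, seed=0):
    """Synthetic traces: noise everywhere, plus HW(DR ^ IV) at one sample."""
    rng = np.random.default_rng(seed)
    dr = rng.integers(0, 256, size=num_traces)
    traces = rng.normal(0.0, 1.0, size=(num_traces, trace_len))
    pt = dr ^ iv_byte
    traces[:, leak_at] += np.array([hamming_weight(int(p)) for p in pt])
    return dr, traces

def recover_bit(dr, traces, bit):
    """Difference of means on one DR bit; the sign of the spike gives the IV bit."""
    mask = ((dr >> bit) & 1) == 1
    diff = traces[mask].mean(axis=0) - traces[~mask].mean(axis=0)
    spike = diff[np.argmax(np.abs(diff))]
    # Positive spike: PT bit follows the DR bit (IV bit 0).
    # Negative spike: PT bit is the inverted DR bit (IV bit 1).
    return 0 if spike > 0 else 1

secret_iv_byte = 0xC5  # made-up value, unknown to recover_bit
dr, traces = simulate_traces(secret_iv_byte)
recovered = sum(recover_bit(dr, traces, b) << b for b in range(8))
print(hex(recovered))
```

Real traces also leak when DR itself is loaded, and that spike does not depend on the IV, so on captured data the spike to inspect is the one at the <code>eor</code> or <code>st</code> instruction rather than simply the largest one.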

== A 1-Bit Attack ==
Unfortunately, we can't use the ChipWhisperer Analyzer to attack this XOR function. Instead, we'll write our own Python code. One thing that we ''don't'' need to do is write our own AES-256 implementation: there's some perfectly fine code in the PyCrypto library. [https://pypi.python.org/pypi/pycrypto Install PyCrypto] and make sure you can use its functions:
<pre>
python
Python 2.7.10 (default, May 23 2015, 09:40:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from Crypto.Cipher import AES
>>> AES
<module 'Crypto.Cipher.AES' from 'C:\WinPython-32bit-2.7.10.3\python-2.7.10\lib\site-packages\Crypto\Cipher\AES.pyc'>
</pre>

Next, open a new Python script wherever you like and load the data that you recorded earlier. You might want to rename the files to make them easier to work with. It'll also be helpful to know how many traces we have and how long they are:

<pre>
# Load data
import numpy as np

traces = np.load(r'traces\traces.npy')
textin = np.load(r'traces\textin.npy')
numTraces = len(traces)
traceLen = len(traces[0])

print(numTraces)
print(traceLen)
</pre>

It's also a good idea to plot some traces and make sure they look okay:

<pre>
# Plot some traces
import matplotlib.pyplot as plt
for i in range(10):
    plt.plot(traces[i])
plt.show()
</pre>

Since we know the AES-256 key, we can decrypt all of this data and store it in a list of decryption results:

<pre>
# Decrypt ciphertext with the key that we now know
from Crypto.Cipher import AES
knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
            0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]
knownkey = bytes(bytearray(knownkey))
dr = []
aes = AES.new(knownkey, AES.MODE_ECB)
for i in range(numTraces):
    ct = bytes(bytearray(textin[i]))
    d = aes.decrypt(ct)
    # Keep the 16 decrypted bytes as a list of integers
    d = list(bytearray(d)[:16])
    dr.append(d)
print(dr)
</pre>

That's a lot of data to print! Now, let's split the traces into two groups by comparing bit 0 of the DR:

<pre>
groupedTraces = [[] for _ in range(2)]
for i in range(numTraces):
    bit0 = dr[i][0] & 0x01
    groupedTraces[bit0].append(traces[i])
# The two groups usually differ in size, so convert each one separately
groupedTraces = [np.array(group) for group in groupedTraces]
print(len(groupedTraces[0]))
</pre>

If you have 1000 traces, you should expect this to print a number around 500, since roughly half of the traces should fall into each group.

== The Other 127 ==

Steps:
* Making the attack feasible
** Capture a bunch (500?)
** Apply decryption
** Look at one bit
** Find means + plot
** Find differences + plot
* Automating the attack
** Finding the attack points
** Getting a single bit
** Building the IV bytes
* Full script in appendix
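
As a rough idea of what the "Building the IV bytes" step in the outline could look like, the sketch below packs a flat list of recovered bit decisions into bytes. It assumes the bits are ordered byte by byte with the least significant bit first; both that ordering and the helper name <code>bits_to_bytes</code> are assumptions for illustration, and the bit values are made up.

```python
def bits_to_bytes(bits):
    """Pack bit decisions into bytes; bits[8*i + j] is bit j (LSB first) of byte i."""
    if len(bits) % 8 != 0:
        raise ValueError("need a multiple of 8 bits")
    out = []
    for i in range(0, len(bits), 8):
        byte = 0
        for j, b in enumerate(bits[i:i + 8]):
            byte |= (b & 1) << j
        out.append(byte)
    return out

# Made-up bit decisions for two bytes, least significant bit first
example_bits = [1, 0, 1, 0, 0, 0, 1, 1,
                0, 0, 0, 0, 1, 1, 1, 1]
print([hex(b) for b in bits_to_bytes(example_bits)])
```

For the full IV, the same helper would be given all 128 bit decisions and return 16 bytes.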

Example:

<pre>#Imports for IV Attack
from Crypto.Cipher import AES

def initPreprocessing(self):
    self.preProcessingResyncSAD0 = preprocessing.ResyncSAD.ResyncSAD(self.parent)
    self.preProcessingResyncSAD0.setEnabled(True)
    self.preProcessingResyncSAD0.setReference(rtraceno=0, refpoints=(6300,6800), inputwindow=(6000,7200))
    self.preProcessingResyncSAD1 = preprocessing.ResyncSAD.ResyncSAD(self.parent)
    self.preProcessingResyncSAD1.setEnabled(True)
    self.preProcessingResyncSAD1.setReference(rtraceno=0, refpoints=(4800,5100), inputwindow=(4700,5200))
    self.preProcessingList = [self.preProcessingResyncSAD0,self.preProcessingResyncSAD1,]
    return self.preProcessingList

class AESIVAttack(object):
    numSubKeys = 16

    @staticmethod
    def leakage(textin, textout, guess, bnum, setting, state):
        knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
                    0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]
        knownkey = str(bytearray(knownkey))
        ct = str(bytearray(textin))

        aes = AES.new(knownkey, AES.MODE_ECB)
        pt = aes.decrypt(ct)
        return getHW(bytearray(pt)[bnum] ^ guess)</pre>

= Appendix D AES-256 IV Attack Script =

'''NB: This script works for the 0.10 release or later; see the local copy in the doc/html directory of your ChipWhisperer release if you need earlier versions.'''

Full attack script. Copy and paste it into a file, then add it as the active attack script:

<pre>#IV Attack Script
from chipwhisperer.common.autoscript import AutoScriptBase
#Imports from Preprocessing
import chipwhisperer.analyzer.preprocessing as preprocessing
#Imports from Capture
from chipwhisperer.analyzer.attacks.CPA import CPA
from chipwhisperer.analyzer.attacks.CPAProgressive import CPAProgressive
import chipwhisperer.analyzer.attacks.models.AES128_8bit
# Imports from utilList

# Imports for AES256 Attack
from chipwhisperer.analyzer.attacks.models.AES128_8bit import getHW

#Imports for IV Attack
from Crypto.Cipher import AES

class AESIVAttack(object):
    numSubKeys = 16

    @staticmethod
    def leakage(textin, textout, guess, bnum, setting, state):
        knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
                    0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]
        knownkey = str(bytearray(knownkey))
        ct = str(bytearray(textin))

        aes = AES.new(knownkey, AES.MODE_ECB)
        pt = aes.decrypt(ct)
        return getHW(bytearray(pt)[bnum] ^ guess)

class userScript(AutoScriptBase):
    preProcessingList = []
    def initProject(self):
        pass

    def initPreprocessing(self):
        self.preProcessingResyncSAD0 = preprocessing.ResyncSAD.ResyncSAD(self.parent)
        self.preProcessingResyncSAD0.setEnabled(True)
        self.preProcessingResyncSAD0.setReference(rtraceno=0, refpoints=(6300,6800), inputwindow=(6000,7200))
        self.preProcessingResyncSAD1 = preprocessing.ResyncSAD.ResyncSAD(self.parent)
        self.preProcessingResyncSAD1.setEnabled(True)
        self.preProcessingResyncSAD1.setReference(rtraceno=0, refpoints=(4800,5100), inputwindow=(4700,5200))
        self.preProcessingList = [self.preProcessingResyncSAD0,self.preProcessingResyncSAD1,]
        return self.preProcessingList

    def initAnalysis(self):
        self.attack = CPA(self.parent, console=self.console, showScriptParameter=self.showScriptParameter)
        self.attack.setAnalysisAlgorithm(CPAProgressive, AESIVAttack, None)
        self.attack.setTraceStart(0)
        self.attack.setTracesPerAttack(100)
        self.attack.setIterations(1)
        self.attack.setReportingInterval(25)
        self.attack.setTargetBytes([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
        self.attack.setTraceManager(self.traceManager())
        self.attack.setProject(self.project())
        self.attack.setPointRange((4800,6500))
        return self.attack

    def initReporting(self, results):
        results.setAttack(self.attack)
        results.setTraceManager(self.traceManager())
        self.results = results

    def doAnalysis(self):
        self.attack.doAttack()</pre>

= Attacking the Signature =

Latest revision as of 05:36, 29 July 2019

This tutorial has been updated for ChipWhisperer 5 release. If you are using 4.x.x or 3.x.x see the "V4" or "V3" link in the sidebar.

A5: Breaking AES-256 Bootloader
Target Architecture XMEGA/Arm
Hardware Crypto No
Software Release V3 / V4 / V5

This tutorial will introduce you to measuring the power consumption of a device under attack. It will demonstrate how you can view the difference between assembly instructions. In ChipWhisperer 5 Release, the software documentation is now held outside the wiki. See links below.

To see background on the tutorials see the Tutorial Introduction on ReadTheDocs, which explains what the links below mean. These wiki pages (that you are reading right now) only hold the hardware setup required, and you have to run the Tutorial via the Jupyter notebook itself. The links below take you to the expected Jupyter output from each tutorial, so you can compare your results to the expected/known-good results.

Running the tutorial uses the referenced Jupyter notebook file.

  • Jupyter file: PA_Multi_1-Breaking_AES-256_Bootloader.ipynb


XMEGA Target

See the following for using:

  • ChipWhisperer-Lite Classic (XMEGA)
  • ChipWhisperer-Lite Capture + XMEGA Target on UFO Board (including NAE-SCAPACK-L1/L2 users)
  • ChipWhisperer-Pro + XMEGA Target on UFO Board

https://chipwhisperer.readthedocs.io/en/latest/tutorials/pa_multi_1-openadc-cwlitexmega.html#tutorial-pa-multi-1-openadc-cwlitexmega

ChipWhisperer-Lite ARM / STM32F3 Target

See the following for using:

  • ChipWhisperer-Lite 32-bit (STM32F3 Target)
  • ChipWhisperer-Lite Capture + STM32F3 Target on UFO Board (including NAE-SCAPACK-L1/L2 users)
  • ChipWhisperer-Pro + STM32F3 Target on UFO Board

https://chipwhisperer.readthedocs.io/en/latest/tutorials/pa_multi_1-openadc-cwlitearm.html#tutorial-pa-multi-1-openadc-cwlitearm

ChipWhisperer Nano Target

This tutorial is not available for the ChipWhisperer Nano.