As of August 2020 the site you are on (wiki.newae.com) is deprecated, and content is now at rtfm.newae.com.

Tutorial B11 Breaking RSA

3,719 bytes added, 02:22, 16 July 2017
== Hardware Setup ==
 
The hardware setup is the same as in previous tutorials. This tutorial uses the XMEGA example target, so you can complete it on the ChipWhisperer-Lite, the ChipWhisperer-Lite 2-Part Version target, or the UFO Board with the XMEGA target board.
 
You will only need the MEASURE input for performing power analysis; the GLITCH output is not used.
== Building Example ==
 
The example code is present in <code>hardware\victims\firmware\simpleserial-rsa</code>. You can run the standard make command with your applicable platform:
 
make PLATFORM=CW303
 
Here CW303 is the XMEGA target on the ChipWhisperer-Lite / UFO target. Note that an existing binary is present in the repository; if you are using a release, you can simply use that existing .hex file.
 
=== Firmware Description ===
 
The example firmware file (<code>simpleserial-rsa.c</code>) pulls in an RSA implementation from avr-crypto-lib.
 
The main firmware file registers two commands:
 
simpleserial_addcmd('t', 0, real_dec);
simpleserial_addcmd('p', 16, get_pt);
 
The <code>real_dec</code> function performs a real RSA decryption. The input (an encrypted message) and the key are fixed, and are loaded internally by the firmware.
 
<syntaxhighlight lang="c">
/* Perform a real RSA decryption. Be aware this is VERY SLOW on AVR/XMEGA: at 7.37 MHz using the default
   1024 byte key it takes about 687 seconds (over 10 mins). */
uint8_t real_dec(uint8_t * pt)
{
    /* Load encrypted message */
    load_bigint_from_os(&cp, ENCRYPTED, sizeof(ENCRYPTED));

    /* Do a decryption on constant data; the trigger brackets the operation */
    trigger_high();
    if (rsa_dec(&cp, &priv_key)){
        putch('F'); /* rsa_dec returned non-zero */
    }
    trigger_low();
    return 0;
}
</syntaxhighlight>
 
The VERY slow decryption can be seen by simply sending a <code>t</code> command; a series of dots is printed to the console while the RSA algorithm runs. Instead we will use the second function, which performs a much smaller operation on a 16-byte key. The exact same code is used, but it runs on a MUCH smaller key whose trace can be captured easily. The end of the tutorial discusses how you could apply this to the full algorithm.
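Why a short key is enough can be illustrated with a minimal model. The sketch below assumes a textbook left-to-right square-and-multiply exponentiation; it is not the avr-crypto-lib code, and the function name, base, and modulus are hypothetical values chosen for illustration. The exponent <code>0x8140</code> is taken from one of the test keys used later in this tutorial.

```python
def square_and_multiply_ops(base, exponent, modulus, nbits=16):
    """Left-to-right binary exponentiation that also logs each operation.

    Illustrative model only (assumption): 'S' marks a square, 'M' a multiply.
    """
    result = 1
    ops = []
    for bit_index in range(nbits - 1, -1, -1):
        result = (result * result) % modulus       # a square happens for every bit
        ops.append("S")
        if (exponent >> bit_index) & 1:
            result = (result * base) % modulus     # the extra multiply happens only for 1 bits
            ops.append("M")
    return result, "".join(ops)


# Hypothetical base and modulus; the exponent is the low 16 bits of a test key.
value, ops = square_and_multiply_ops(base=7, exponent=0x8140, modulus=1000003)
print(ops)
```

Every bit costs one square, and only the 1 bits add a multiply, so the sequence of operations (and therefore the timing and power profile) depends directly on the key bits regardless of key length.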
== Finding SPA Leakage ==
Assuming you have a working example, the next step is the easiest. We will record a single project with the following data:
* 2x traces with secret key of <code>00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 00</code>
* 2x traces with secret key of <code>00 00 00 00 00 00 00 00 00 00 00 00 00 00 81 40</code>
* 2x traces with secret key of <code>00 00 00 00 00 00 00 00 00 00 00 00 00 00 AB E2</code>
* 2x traces with secret key of <code>00 00 00 00 00 00 00 00 00 00 00 00 00 00 AB E3</code>
We record 2x traces for each sequence to provide us with a 'reference' trace and another 'test' trace (in case we want to confirm a template match is working without using the exact same trace).
<ol>
<li>
Set the number of traces per capture to 2:<br>
[[File:B11_traces2.png|400px]]
</li>
<li>
Save the project file as <code>rsa_test_2bytes.cwp</code>.
</li>
<li>
Set the fixed plaintext to <code>00 00 00 00 00 00 00 00 00 00 00 00 00 00 AB E3</code>, press "Capture M".
</li>
<li>
Save the project.
</li>
<li>
Use the trace manager to check the acquisitions are as expected:<br>
[[File:B11_tracemanager.png]]
</li>
</ol>
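 diffs = []
 for i in range(0, 23499):
     # Difference between the reference pattern and the window starting at sample i
     diff = tm.getTrace(target_trace_number)[i:(i+len(rsa_one))] - rsa_one
     # Sum of absolute differences (SAD) for this window
     diffs.append(sum(abs(diff)))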
This should give you output like the following (the specific numbers will vary):
<pre>1855 1275 1275 1275 1275 1275 1275 1286 1275 1275 1275 1528 1275 1275 1275</pre>
Note that there is a fairly long delay in the first run through, but later runs take roughly constant time. Three distinct delays occur for the later bits; in the sample output above they are 1275, 1286, and 1528.
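The pattern-matching idea in this section can be sketched end to end as follows. This is a minimal sketch under stated assumptions: it uses synthetic data in place of a trace loaded from the project, and the function name, pattern, noise level, and threshold are all hypothetical. With real traces the threshold has to be chosen by inspecting the SAD curve.

```python
import numpy as np


def sad_match_positions(trace, reference, threshold):
    """Slide `reference` over `trace` and return the start index of each match.

    A match is a window whose sum of absolute differences (SAD) against the
    reference falls below `threshold`. Hypothetical helper for illustration.
    """
    n = len(reference)
    sad = np.array([np.sum(np.abs(trace[i:i + n] - reference))
                    for i in range(len(trace) - n + 1)])
    below = sad < threshold
    # Keep only the first sample of each run of consecutive below-threshold windows
    previous = np.concatenate(([False], below[:-1]))
    starts = np.flatnonzero(below & ~previous)
    return starts, sad


# Synthetic stand-in for a power trace: a repeated pattern buried in noise (assumption)
rng = np.random.default_rng(0)
pattern = rng.normal(0.0, 1.0, 50)
trace = rng.normal(0.0, 0.05, 2000)
for s in [100, 400, 650, 1200, 1500]:
    trace[s:s + 50] += pattern

# Use the first occurrence as the reference, as the tutorial does with rsa_one
reference = trace[100:150].copy()
starts, sad = sad_match_positions(trace, reference, threshold=10.0)
delays = np.diff(starts)   # spacing between successive matches
print(starts, delays)
```

The spacing between successive matches is the quantity of interest: it plays the same role as the delay values discussed above.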
== Extending the Tutorial ==
The attack above is a basic attack on the core RSA algorithm, and there are several extensions you can try. As mentioned, you can improve the automatic key recovery algorithm.

You can also try performing this attack on longer key lengths. This is made much easier with the ChipWhisperer-Pro, as it can use "streaming mode" to recover an extremely long key, or even capture the entire RSA algorithm. The ChipWhisperer-Lite and Pro both have a "downsample" capability to fit a longer capture into your buffer; you may need an external low-pass filter in some cases.

The ChipWhisperer-Pro has a unique analog trigger feature. This can also be used to break the RSA algorithm by performing the pattern match in real time and measuring the time delay between trigger locations. This is demonstrated in the next section.
=== Use of SAD Trigger in ChipWhisperer-Pro ===
# Perform an example capture to program the SAD block.
# Configure the trigger out to be routed to the "AUX Out" pin.
# Set the trigger in the ChipWhisperer-Pro as coming from the SAD block (otherwise the external trigger out will only duplicate the trigger-in pin state).
# Configure the SAD reference waveform as some unique sequence with the square/multiply function.
# Using an external device (logic analyzer, scope, etc) record the trigger pattern.
# From the trigger pattern, directly read off the RSA secret key.
 
As an example, running the simplified example above and measuring the timing of the output trigger in the PicoScope software shows the delay pattern. The blue trace is the trigger out from the XMEGA (i.e., it is HIGH while data is being processed), and the yellow trace is the "Trigger Out" from the ChipWhisperer-Pro, which marks where the SAD pattern match block has found a matching analog sequence:
 
[[File:B11_picoscope.png|800px]]
 
 
Note the trigger out is very short (one ADC cycle), so either a reasonably fast capture speed is needed to ensure no edges are lost, or a pulse stretcher should be used on the pulse output. The 1's in the above waveform are indicated by the longer delay between successive edges, as the square-multiply operation takes longer than a square only.
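Reading bits off the recorded trigger pattern can be sketched as below. This is only an illustration: the function name, the edge timestamps, and the threshold are hypothetical, and with a real capture you would pick the threshold from the measured gap lengths and check how the first and last edges line up with the key bits.

```python
def bits_from_edges(edge_times, threshold):
    """Classify each gap between successive trigger edges as 1 (long) or 0 (short).

    Hypothetical helper: a gap longer than `threshold` is read as a 1 bit.
    """
    gaps = [later - earlier for earlier, later in zip(edge_times, edge_times[1:])]
    return [1 if gap > threshold else 0 for gap in gaps]


# Made-up edge timestamps in arbitrary time units (assumption, not measured data)
edges = [0, 100, 200, 350, 450, 600, 750, 850]
bits = bits_from_edges(edges, threshold=125)
print(bits)
```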
 
The ChipWhisperer-Pro trigger also makes it easier to attack the real RSA implementation, as it does not require storing an extremely long analog data trace and then postprocessing it.
 
{{Template:Tutorials}}
[[Category:Tutorials]]