As of August 2020 the site you are on (wiki.newae.com) is deprecated, and content is now at rtfm.newae.com.

Tutorial B2 Viewing Instruction Power Differences

2,479 bytes added, 16:18, 9 October 2018
}}
This tutorial will introduce you to measuring the power consumption of a device under attack. It will demonstrate how you can view the difference between assembly instructions.
== Prerequisites ==
<li>The ''ADC Freq'' should show 4x the clock speed of your device (typically 29.5MHz), and the ''DCM Locked'' checkbox __MUST__ be checked. If the ''DCM Locked'' checkbox is NOT checked, try hitting the ''Reset ADC DCM'' button again.</li>
<li><p>At this point you can hit the ''Capture 1'' button, and see if the system works! You should end up with a window looking like this:</p>
<p>[[File:05_Low_Gain.PNG|image|1083x1083px]]</p>
<p>Whilst there is a waveform, you need to adjust the capture settings. There are two main settings of importance, the analog gain and number of samples to capture.</p></li>
[[File:06_high_gain.PNG|image|1083x1083px]]</ol>
<ol start="16" style="list-style-type: decimal;">
=== Background on Setup (Arm) ===
For the rest of this tutorial, we'll be focusing on the STM32F3, which is the microcontroller on the CW303 Arm target (though other targets should demonstrate the same principles). Since the STM32F3 is an Arm Cortex M4 device, we'll need to refer to the [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0553a/CHDJJGFB.html Cortex M4 Instruction Set ] and the [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0439b/CHDDIGAC.html Cortex M4 Instruction Set Summary].
The first thing we'll do is replace the <code>nop</code> instructions, since from its documentation page we can see the processor may not execute them. Instead, let's add some <code>add.w</code> instructions (<code>add.w</code> is the 32-bit wide version of the add instruction). We'll be doing this since the <code>mul</code> instruction is always 32 bits wide and the 16-bit Thumb instruction has a different power profile than the 32-bit Arm instruction. From the earlier links, we can see that both <code>add</code> and <code>mul</code> take 1 cycle each to complete.
Now we should have 10 <code>add.w</code> instructions and 10 <code>mul</code> instructions:<syntaxhighlight lang="c">
trigger_high();

/* 10 x 32-bit add, 1 cycle each */
asm volatile(
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
::
);

/* 10 x multiply, 1 cycle each */
asm volatile(
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
::
);

trigger_low();
</syntaxhighlight>
Now hit the ''Run 1'' [[File:Capture One Button.PNG|image]] button and capture a single trace. You should now have something that looks like this:

[[File:B2 STM Addmul.PNG|frameless|1155x1155px]]

We can see the <code>add.w</code> and <code>mul</code> instructions near the beginning, starting about 10 samples in and ending about 90 samples in. There's not really any difference that we can see between the two, but we can see that they take up about 80 samples (20 microcontroller clock cycles) as we expect.

Next, let's insert some <code>udiv</code> instructions. From the Cortex M4 Instruction Set Summary, we can see that <code>udiv</code> (unsigned divide) instructions take between 2 and 12 cycles to complete (effectively depending on how big the numbers we're dividing are). We'll be dividing <code>r0</code> by <code>r0</code>, meaning we expect that every instruction after the first should take 2 cycles. It should have higher power consumption too, since dividing is typically a fairly complex operation:<syntaxhighlight lang="c">
trigger_high();

/* 10 x 32-bit add, 1 cycle each */
asm volatile(
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
::
);

/* 10 x multiply, 1 cycle each */
asm volatile(
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
::
);

/* 10 x unsigned divide, 2 to 12 cycles each */
asm volatile(
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
::
);

trigger_low();
</syntaxhighlight>
Capture another trace and you should get something like:
[[File:B2 STM Addmuldiv.PNG|frameless|1155x1155px]]

As we expected, we can see periods of high power consumption measuring about 80 samples in total right after the <code>add.w</code> and <code>mul</code> instructions. Interestingly, the <code>udiv</code> instructions seem to be split into 2 sets of operations. As a final check, we can add some more <code>mul</code> instructions and see the <code>udiv</code> instructions move down (and also break into more sections):

[[File:B2 STM Addmulmuldiv.PNG|frameless|1155x1155px]]
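The sample positions quoted above follow from simple arithmetic, which can be sketched in host-side C (this is not firmware for the target). The sketch assumes the ADC runs at 4 samples per target clock cycle, that the first instruction starts about 10 samples after the trigger, and that <code>add.w</code> and <code>mul</code> take 1 cycle each while <code>udiv</code> takes 2. The struct and function names are hypothetical helpers made up for this illustration; they are not part of the ChipWhisperer software.

```c
#include <stdio.h>

/* Assumption from the capture setup: the ADC samples 4x per target clock cycle. */
#define SAMPLES_PER_CYCLE 4

/* One run of identical instructions between trigger_high() and trigger_low(). */
struct insn_group {
    const char *name;   /* mnemonic, used for printing only */
    int count;          /* number of copies in the asm block */
    int cycles_each;    /* assumed cycles per instruction */
};

/* Number of ADC samples one group is expected to span. */
int group_samples(const struct insn_group *g)
{
    return g->count * g->cycles_each * SAMPLES_PER_CYCLE;
}

/* Sample index where group `index` starts, given the offset of the first group. */
int group_start(const struct insn_group *groups, int index, int first_offset)
{
    int start = first_offset;
    for (int i = 0; i < index; i++)
        start += group_samples(&groups[i]);
    return start;
}

/* Print the expected sample window of every group. */
void print_windows(const struct insn_group *groups, int n, int first_offset)
{
    for (int i = 0; i < n; i++) {
        int start = group_start(groups, i, first_offset);
        printf("%-6s samples %d to %d\n", groups[i].name,
               start, start + group_samples(&groups[i]));
    }
}
```

With groups of 10 <code>add.w</code>, 10 <code>mul</code> and 10 <code>udiv</code> and a first offset of 10, these helpers give windows of roughly 10 to 50, 50 to 90 and 90 to 170 samples, which lines up with the traces described above. Under the same assumptions, inserting a further group of 10 <code>mul</code> instructions would push the start of the <code>udiv</code> window down by 40 samples. Treat these as estimates: the first <code>udiv</code> may take longer than 2 cycles, and the real trace will not have perfectly sharp boundaries.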
== Clock Phase Adjustment ==
== Conclusion ==
In this tutorial you have learned how power analysis can tell you the operations being performed on a microcontroller. In future work we will move towards using this for breaking various forms of security on devices. In particular, [[Tutorial B3-1 Timing Analysis with Power for Password Bypass]] will examine how we can use this information to exploit a password check.
== Links ==