As of August 2020 the site you are on (wiki.newae.com) is deprecated, and content is now at rtfm.newae.com.

Changes


Tutorial B2 Viewing Instruction Power Differences

2,271 bytes added, 16:18, 9 October 2018
no edit summary
}}
This tutorial will introduce you to measuring the power consumption of a device under attack. It will demonstrate how you can view the difference between assembly instructions.
== Prerequisites ==
<li>The ''ADC Freq'' should show 4x the clock speed of your device (typically 29.5MHz), and the ''DCM Locked'' checkbox __MUST__ be checked. If the ''DCM Locked'' checkbox is NOT checked, try hitting the ''Reset ADC DCM'' button again.</li>
<li><p>At this point you can hit the ''Capture 1'' button, and see if the system works! You should end up with a window looking like this:</p>
<p>[[File:05_Low_Gain.PNG|image|1083x1083px]]</p>
<p>Whilst there is a waveform, you need to adjust the capture settings. There are two main settings of importance, the analog gain and number of samples to capture.</p></li>
[[File:06_high_gain.PNG|image|1083x1083px]]</ol>
<ol start="16" style="list-style-type: decimal;">
=== Background on Setup (Arm) ===
For the rest of this tutorial, we'll be focusing on the STM32F3, which is the microcontroller on the CW303 Arm target (though other targets should demonstrate the same principles). Since the STM32F3 is an Arm Cortex M4 device, we'll need to refer to the [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0553a/CHDJJGFB.html Cortex M4 Instruction Set] and the [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0439b/CHDDIGAC.html Cortex M4 Instruction Set Summary].
The first thing we'll do is replace the <code>nop</code> instructions, since its documentation page tells us the processor may not execute them. Instead, let's add some <code>add.w</code> instructions (<code>add.w</code> is the 32-bit-wide version of the add instruction). We're doing this because the <code>mul</code> instruction is always 32 bits wide, and the 16-bit Thumb instruction has a different power profile than the 32-bit instruction. From the earlier links, we can see that both add and mul take 1 cycle each to complete.
Now we should have 10 <code>add.w</code> instructions and 10 <code>mul</code> instructions:<syntaxhighlight lang="c">trigger_high();
/* Ten 32-bit add instructions */
asm volatile(
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
::
);
/* Ten multiply instructions */
asm volatile(
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
::
);
trigger_low();</syntaxhighlight>Now hit the ''Run 1'' [[File:Capture One Button.PNG|image]] button and capture a single trace. You should now have something that looks like this:
[[File:B2 STM Addmul.PNG|frameless|1155x1155px]]
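The ten repeated instruction strings in the firmware above rely on C's concatenation of adjacent string literals, so the same text can be generated by the preprocessor instead of being typed out. The following is a minimal sketch of that idea. The names <code>REPEAT2</code>, <code>REPEAT5</code>, <code>REPEAT10</code>, <code>ADD_BLOCK</code>, <code>MUL_BLOCK</code> and <code>count_occurrences</code> are hypothetical helpers invented for this illustration and are not part of the ChipWhisperer firmware. The counting function exists only so the result can be checked on a desktop compiler.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper macros (not part of the ChipWhisperer firmware).
 * Adjacent string literals are concatenated by the compiler, so pasting
 * the same literal ten times yields one long assembly string. */
#define REPEAT2(s)  s s
#define REPEAT5(s)  s s s s s
#define REPEAT10(s) REPEAT2(REPEAT5(s))

/* Each instruction ends in "\n\t", matching the tutorial's asm blocks. */
#define ADD_BLOCK REPEAT10("add.w r0, r0\n\t")
#define MUL_BLOCK REPEAT10("mul r0, r0\n\t")

/* Count non-overlapping occurrences of needle in haystack. */
int count_occurrences(const char *haystack, const char *needle)
{
    int count = 0;
    size_t step = strlen(needle);
    const char *p = strstr(haystack, needle);
    while (p != NULL) {
        count++;
        p = strstr(p + step, needle);
    }
    return count;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to print the generated strings on a desktop. */
int main(void)
{
    printf("add.w count: %d\n", count_occurrences(ADD_BLOCK, "add.w r0, r0"));
    printf("mul count:   %d\n", count_occurrences(MUL_BLOCK, "mul r0, r0"));
    fputs(MUL_BLOCK, stdout);
    return 0;
}
#endif
```

On the target, the generated strings would be used in place of the hand-written ones, for example <code>asm volatile(ADD_BLOCK ::);</code>, assuming a GCC-based toolchain as in the tutorial's own code. Changing the instruction count then means editing one macro rather than a long list of strings.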
We can see the <code>add.w</code> and <code>mul</code> instructions near the beginning, starting about 10 samples in and ending about 90 samples in. There's not really any difference that we can see between the two, but we can see that they take up about 80 samples (20 microcontroller clock cycles), as we expect.

Next, let's insert some <code>udiv</code> instructions. From the Cortex M4 Instruction Set Summary, we can see that <code>udiv</code> (unsigned divide) instructions take between 2 and 12 cycles to complete (effectively depending on how big the numbers we're dividing are). We'll be dividing <code>r0</code> by <code>r0</code>, meaning we expect that every instruction after the first should take 2 cycles. It should have higher power consumption too, since dividing is typically a fairly complex operation:<syntaxhighlight lang="c">trigger_high();
/* Ten 32-bit add instructions */
asm volatile(
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
"add.w r0, r0" "\n\t"
::
);
/* Ten multiply instructions */
asm volatile(
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
"mul r0, r0" "\n\t"
::
);
/* Ten unsigned divide instructions */
asm volatile(
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
"udiv r0, r0" "\n\t"
::
);
trigger_low();</syntaxhighlight>Capture another trace and you should get something like:
[[File:B2 STM Addmuldiv.PNG|frameless|1155x1155px]]
As we expected, we can see periods of high power consumption measuring about 80 samples in total right after the <code>add.w</code> and <code>mul</code> instructions. Interestingly, the <code>udiv</code> instructions seem to be split into 2 sets of operations. As a final check, we can add some more <code>mul</code> instructions and see the <code>udiv</code> instructions move later in the trace (and also break into more sections):
[[File:B2 STM Addmulmuldiv.PNG|frameless|1155x1155px]]
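The sample counts quoted in this section follow from the ADC running at 4x the target clock, which gives 4 samples per clock cycle. The sketch below works through that arithmetic for the instruction blocks used here. The function names <code>block_cycles</code> and <code>samples_for_cycles</code> are hypothetical and exist only for this illustration. It also assumes each instruction takes exactly its documented cycle count and ignores pipeline effects, so treat the results as rough estimates rather than exact predictions.

```c
#include <stdio.h>

/* Assumption from the capture setup: ADC clock is 4x the target clock,
 * so every target clock cycle spans 4 captured samples. */
#define SAMPLES_PER_CYCLE 4u

/* Cycles for `count` back-to-back instructions of `cycles_each` cycles.
 * Hypothetical helper; ignores pipeline effects. */
unsigned block_cycles(unsigned count, unsigned cycles_each)
{
    return count * cycles_each;
}

/* Convert target clock cycles into captured samples. */
unsigned samples_for_cycles(unsigned cycles)
{
    return cycles * SAMPLES_PER_CYCLE;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to print the estimates on a desktop. */
int main(void)
{
    /* 10 add.w + 10 mul, 1 cycle each */
    unsigned addmul = block_cycles(10, 1) + block_cycles(10, 1);
    /* 10 udiv at the 2-cycle minimum and the 12-cycle maximum */
    unsigned udiv_min = block_cycles(10, 2);
    unsigned udiv_max = block_cycles(10, 12);

    printf("add.w + mul: %u cycles, %u samples\n",
           addmul, samples_for_cycles(addmul));
    printf("udiv (min):  %u cycles, %u samples\n",
           udiv_min, samples_for_cycles(udiv_min));
    printf("udiv (max):  %u cycles, %u samples\n",
           udiv_max, samples_for_cycles(udiv_max));
    return 0;
}
#endif
```

Under these assumptions, the 20 single-cycle instructions and ten 2-cycle <code>udiv</code> instructions each come out to about 80 samples, which is in line with the widths described for the traces above.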
== Clock Phase Adjustment ==
== Conclusion ==
In this tutorial you have learned how power analysis can tell you the operations being performed on a microcontroller. In future work we will move towards using this for breaking various forms of security on devices. In particular, [[Tutorial B3-1 Timing Analysis with Power for Password Bypass]] will examine how we can use this information to exploit a password check.
== Links ==