Changes

Template Attacks

6,064 bytes added, 13:44, 1 May 2018

Change header levels

# Apply the template to the attack traces. For each subkey, track which value is most likely to be the correct subkey. Continue until the key has been recovered.

== Signals, Noise, and Statistics ==

Before looking at the details of the template attack, it is important to understand the statistics concepts that are involved. A template is effectively a multivariate distribution that describes several key samples in the power traces. This section will describe what a multivariate distribution is and how it can be used in this context.

=== Noise Distributions ===

Electrical signals are inherently noisy. Any time we take a voltage measurement, we don't expect to see a perfect, constant level. For example, if we attached a multimeter to a 5 V source and took 4 measurements, we might expect to see a data set like (4.95, 5.01, 5.06, 4.98). One way of modelling this voltage source is

so we're very unlikely to see a reading of 7 V. We'll use this to our advantage in this attack: if <math>f(x)</math> is very small for one of our subkey guesses, it's probably a wrong guess.

=== Multivariate Statistics ===

The 1-variable Gaussian distribution works well for one measurement. What if we're working with more than one random variable?

Don't worry if this looks crazy - the SciPy package in Python will do all the heavy lifting for us. As with the single-variable distributions, we're going to use this to find how likely a certain observation is. In other words, if we put <math>k</math> points of our power trace into <math>\mathbf{x}</math> and we find that <math>f(\mathbf{x})</math> is very high, then we've probably found a good guess.

== Creating the Template ==

A template is a set of probability distributions that describe what the power traces look like for many different keys. Effectively, a template says: "If you're going to use key <math>k</math>, your power trace will look like the distribution <math>f_k(\mathbf{x})</math>". We can use this information to find subtle differences between power traces and to make very good key guesses for a single power trace.

=== Number of Traces ===

One of the downsides of template attacks is that they require a great number of traces to be preprocessed before the attack can begin. This is mainly for statistical reasons. In order to come up with a good distribution to model the power traces for ''every key'', we need a large number of traces for ''every key''. For example, if we're going to attack a single subkey of AES-128, then we need to create 256 power consumption models (one for every number from 0 to 255). In order to get enough data to make good models, we need tens of thousands of traces.

Note that we don't have to model every single key. One good alternative is to model a sensitive part of the algorithm, like the substitution box in AES. We can get away with a much smaller number of traces here; if we make a model for every possible Hamming weight, then we would end up with 17 9 models, which is an order of magnitude smaller. However, then we can't recover the key from a single attack trace - we need more information to recover the secret key.

=== Points of Interest ===

Our goal is to create a multivariate probability describing the power traces for every possible key. If we modeled the entire power trace this way (with, say, 3000 samples), then we would need a 3000-dimension distribution. This is insane, so we'll find an alternative.

These two points mean that we can usually live with a handful (3-5) of ''points of interest''. If we can pick out good points and write down a model using these samples, then we can use a 3D or 5D distribution - a great improvement over the original 3000D model.

==== Picking POIs ====There are several ways to pick the most important points in each of the traces. Generally, the aim is to find points that vary strongly between different operations (subkeys or Hamming weights). The simplest method -- the one that we'll use here -- is the '~~TODO; want to write tutorial first~~'sum of differences''method.

~~== Analyzing~~ The algorithm for the ~~Data ==~~sum of difference method is:~~How to~~ * For every operation <math>k</math> and every sample <math>i</math>, find the ~~covariance matrices~~average power <math>M_{k, i}</math>. For instance, if there are <math>T_k</math> traces where we performed operation <math>k</math>, then this average is

<math>M_{k, i} = \frac{1}{T_k} \sum_{j=1}^{T_k} t_{j, i}</math> * After finding all of the means, calculate all of their absolute pairwise differences. Add these up. This will give one "trace" which has peaks where the samples are usually different. The calculation looks like <math>D_{i} = \sum_{k_1, k_2} |M_{k_1, i} - M_{k_2, i}|</math> An example of this sum of differences is: [[File:Template-Sum-Of-Difference.png|600px]] * The peaks of <math>D_i</math> show the most important points, but we need to satisfy point 1 from above - we need to pick some peaks that aren't too close. One algorithm to do this is:# Pick the highest point in <math>D_i</math> and save this value of <math>i</math> as a point of interest. (ie: <math>i = argmax(D_i)</math>)# Throw out the nearest <math>N</math> points (where <math>N</math> is the minimum spacing between POIs).# Repeat until enough POIs have been selected. === Analyzing the Data ===Suppose that we've picked <math>I</math> points of interest, which are at samples <math>s_i</math> (<math>0 \le i < I</math>). Then, our goal is to find a mean and covariance matrix for every operation (every choice of subkey or intermediate Hamming weight). Let's say that there are <math>K</math> of these operations (maybe 256 subkeys or 9 possible Hamming weights). For now, we'll look at a single operation <math>k</math> (<math>0 \le k < K</math>). The steps are:* Find every power trace <math>t</math> that falls under the category of "operation <math>k</math>". (ex: find every power trace where we used a subkey of 0x01.) We'll say that there are <math>T_k</math> of these, so <math>t_{j, s_i}</math> means the value at trace <math>j</math> and POI <math>i</math>.* Find the average power <math>\mu_i</math> at every point of interest. This calculation will look like: <math>\mu_i = \frac{1}{T_k} \sum_{j=1}^{T_k} t_{j, s_i}</math> * Find the variance <math>v_i</math> of the power at each point of interest. One way of calculating this is: <math>v_i = \frac{1}{T_k} \sum_{j=1}^{T_k} (t_{j, s_i} - \mu_i)^2</math> * Find the covariance <math>c_{i, i^*}</math> between the power at every pair of POIs (<math>i</math> and <math>i^*</math>). One way of calculating this is: <math>c_{i, i^*} = \frac{1}{T_k} \sum_{j=1}^{T_k} (t_{j, s_i} - \mu_i) (t_{j, s_{i^*}} - \mu_{i^*})</math> * Put together the mean and covariance matrices as: <math>\mathbf{\mu} = \begin{bmatrix}\mu_1 \\\mu_2 \\\mu_3 \\\vdots\end{bmatrix}</math> <math>\mathbf{\Sigma} = \begin{bmatrix}v_1 & c_{1,2} & c_{1,3} & \dots \\c_{2,1} & v_2 & c_{2,3} & \dots \\c_{3,1} & c_{3,2} & v_3 & \dots \\\vdots & \vdots & \vdots & \ddots \end{bmatrix}</math> These steps must be done for every operation <math>k</math>. At the end of this preprocessing, we'll have <math>K</math> mean and covariance matrices, modelling each of the <math>K</math> different operations that the target can do. == Using the Template ==~~Apply~~ With a template in hand, we can finish our attack. For the attack, we need a smaller number of traces - we'll say that we have <math>A</math> traces. The sample values will be labeled <math>a_{j, s_i}</math> (<math>1 \le j \le A</math>). === Applying the Template ===First, let's apply the template ~~by measuring~~ to a single trace. Our job is to decide how likely all of our key guesses are. We need to do the following:* Put our trace values at the POIs into a vector. This vector will be <math>\mathbf{a_j} = \begin{bmatrix}a_{j,1} \\a_{j,2} \\a_{j,3} \\\vdots\end{bmatrix}</math> * Calculate the PDF for every key guess and save these for later. This might look like: <math>p_{k, j} = f_k(\mathbf{a_j})</math> * Repeat these two steps for all of the attack traces. This process gives us an array of <math>p_{k, j}</math>, which says: "Looking at trace <math>j</math>, how likely is it that key <math>k</math> is the correct one?" === Combining the Results ===The very last step is to combine our <math>p_{k, j}</math> values to decide which key is the best fit. The easiest way to do this is to combine them as <math>P_k = \prod_{j=1}^{A} p_{k,j}</math> For example, if we guessed that a subkey was equal to 0x00 and our PDF results in 3 traces were (0.9, 0.8, 0.95), then our overall result would be 0.684. Having one trace that doesn't match the template can cause this number to drop quickly, helping us eliminate the wrong guesses. Finally, we can pick the highest value of <math>P_k</math>, which tells us which guess fits the templates the best, and we're done! This method of combining our per-trace results can suffer from precision issues. After multiplying many large or small numbers together, we could end up with numbers that are too large or small to fit into a floating point variable. An easy fix is to work with logarithms. Instead of using <math>P_k</math> directly, we can calculate <math>\log P_k = \sum_{j=1}^A \log p_{k,j}</math> Comparing these logarithms will give us the same results without the precision issues. === Attack Time ===The number of attack traces required to complete a template attack can vary wildly depending on the quality of the preprocessing. As described in the original papers on template attacks (Chari, Rao, and Rohatgi), when using high-quality templates made from many traces, it is possible to attack a system with a single trace. In the ChipWhisperer tutorials, we'll save ourselves a bit of processing time and only record a few thousand traces for generating the templates, meaning that we'll need more attack tracesto recover the key from the victim.

~~== Finding the Key ==~~It may be interesting to experiment with different amounts of processing. Can you break AES in one trace? How ~~to use the~~ many template ~~to narrow down the possibilities~~traces does it take?

~~== Attack Time ==Shouldn't take long to attack~~[[Category:Theory]]