Attacks on White Box Crypto - Hands On Multi Bit Attack

After performing a single-bit attack in a White Box scenario, the natural second step is to think about a multi-bit attack, similar to the Differential Power Analysis case.

To start with, let's use the same AES encryption program we used in the single-bit case.

To gather traces, we use another modification to the Tracer program. We need to modify the Tracer.cpp source code to allow a new option for printing out a bit-by-bit representation of the execution trace. Now the -t parameter takes in human, sqlite, and bits. There are a number of changes necessary, but the part printing out the trace is this:

static VOID RecordMemBytes(ADDRINT ip, CHAR r, ADDRINT addr, UINT8* memdump, INT32 size, BOOL isPrefetch)
{
    for (INT32 i = 0; i < size; i++)
      {
      TraceFile << " " << setfill('0') << setw(2) << static_cast<UINT32>(memdump[i]);
      }
}

So that a trace looks like this:

$ Tracer -o ls.log -t bytes -- ls
[*] Trace file ls.log opened for writing...

ls.log  Makefile  obj-ia32  obj-intel64  README.md  Tracer  Tracer.cpp tracer.log  version.mk version.sh
$ more ls.log
 01 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00 152 222 255 255 255 127 00 00 128 154 84 229 255 127 00 00 90 97 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 160 222 255 255 255 127 00 
00 48 97 00 00 01 00 00 00 224 107 01 00 01 00 00 00 00 00 00 00 00 00 00 00 13 108 01 00 01 00 00 00 00 00 00 00 00 00 00 00 13 108 01 00 01 00 00 00 16 98 00 00 01 00 00 00 37 108 01 00 01 00 00 00 37 108 
01 00 01 00 00 00 00 00 00 00 00 00 00 00 224 107 01 00 01 00 00 00 48 97 00 00 01 00 00 00 160 222 255 255 255 127 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 250 154 84 229 255 127 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 160 222 255 255 255 127 00 00 48 97 00 00 01 00 00
[ ... ]

So all memory reads and writes are now available as a bytestream of sorts, though the bytestream is written as space separated ASCII values between 0 and 255. We can load this back into Matlab/Octave with a script similar to the single bit case by first creating a shell script:

% print shell script to make trace files
fileID = fopen('tracescript.sh','w');
for i=0:199
  clearTextVector = (0:15) + mod(i,241); %create a vector of clear text bytes to encrypt
  commandString = sprintf('Tracer -o tracebytes%d.bin -t bytes -- ./aes-128-ecb.elf ',i);
  commandString = strcat(commandString, sprintf(' %x',clearTextVector)); % run whitebox with vector
  commandString = strcat(commandString, sprintf('\n')) % newline between commands
  fprintf(fileID, '%s', commandString);
end
fclose(fileID);

followed by executing the script:

$ tail -n4 tracescript.sh 
Tracer -o tracebits196.bin -t bytes -- ./aes-128-ecb.elf c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3
Tracer -o tracebits197.bin -t bytes -- ./aes-128-ecb.elf c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4
Tracer -o tracebits198.bin -t bytes -- ./aes-128-ecb.elf c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 d5
Tracer -o tracebits199.bin -t bytes -- ./aes-128-ecb.elf c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 d5 d6

$ ./tracescript.sh 
[*] Trace file tracebytes0.bin opened for writing...


Testing AES128

ECB encrypt verbose:

plain text:
000102030405060708090a0b0c0d0e0f

ciphertext:
279fb74a7572135e8f9b8ef6d1eee003

[*] Trace file tracebytes1.bin opened for writing...


[ ... ]

followed by reading the results back:

for i=0:199
  fileNameString = sprintf('tracebytes%d.bin',i);
  traces(i+1,:) = importdata(fileNameString,' ');
end

To start with, the most straight forward approach is to closely follow the multi-bit DCA example and calculate a Hamming weight and Pearson correlation:


SBOX=[099 124 119 123 242 107 111 197 048 001 103 043 254 215 171 118 ... 202 130 201 125 250 089 071 240 173 212 162 175 156 164 114 192 ... 183 253 147 038 054 063 247 204 052 165 229 241 113 216 049 021 ... 004 199 035 195 024 150 005 154 007 018 128 226 235 039 178 117 ... 009 131 044 026 027 110 090 160 082 059 214 179 041 227 047 132 ... 083 209 000 237 032 252 177 091 106 203 190 057 074 076 088 207 ... 208 239 170 251 067 077 051 133 069 249 002 127 080 060 159 168 ... 081 163 064 143 146 157 056 245 188 182 218 033 016 255 243 210 ... 205 012 019 236 095 151 068 023 196 167 126 061 100 093 025 115 ... 096 129 079 220 034 042 144 136 070 238 184 020 222 094 011 219 ... 224 050 058 010 073 006 036 092 194 211 172 098 145 149 228 121 ... 231 200 055 109 141 213 078 169 108 086 244 234 101 122 174 008 ... 186 120 037 046 028 166 180 198 232 221 116 031 075 189 139 138 ... 112 062 181 102 072 003 246 014 097 053 087 185 134 193 029 158 ... 225 248 152 017 105 217 142 148 155 030 135 233 206 085 040 223 ... 140 161 137 013 191 230 066 104 065 153 045 015 176 084 187 022];
numberOfTraces = 200;traceLength = max(size(traces(1,:))); for i=0:(numberOfTraces-1) plaintext(i+1,:) = (0:15) + mod(i,241);end
for currentKeyByte=1:16 % for every byte in the key currentKeyByte for currentKeyByteGuess =0:255 % iterate through all candidate key bytes, 0x00 to 0xff currentKeyByteGuess xorResult = bitxor(plaintext(:,currentKeyByte),currentKeyByteGuess); % first operation is an XOR between the cleartext and the guessed key sboxLookup = SBOX(xorResult+1); prediction = sum( dec2bin(sboxLookup).' == '1' );% calculate Hamming Weight prediction for currentTimeIndex=1:traceLength dataPoints = sum( dec2bin(traces(:,currentTimeIndex)).' == '1' );% the data points for all the traces at the current time index % lets see how well the data points match with the predicted hamming weights % by calculating the Pearson Correlation Coefficient for the current data point % we want the maximum correlation, positive or negative, so we take the absolute value varDataPoints=var(dataPoints); if (varDataPoints != 0) pearsonCorrelation(currentKeyByte,currentKeyByteGuess+1,currentTimeIndex) = abs(cov(dataPoints,prediction)/(sqrt(varDataPoints) * sqrt(var(prediction)))); else pearsonCorrelation(currentKeyByte,currentKeyByteGuess+1,currentTimeIndex) = 0; end; end; F = squeeze(pearsonCorrelation(currentKeyByte,:,:)); [X, Y] = find(F==max(F(:))); printf('Best match so far for byte %d: %d 0x%x\n',currentKeyByte, X(1)-1,X(1)-1) end; F = squeeze(pearsonCorrelation(currentKeyByte,:,:)); % fix dimensionality of array [bestMatch, bestMatchTimeIndex] = find(F==max(F(:))); % find highest correlation solvedKey(currentKeyByte) = bestMatch(1) - 1; solvedKey fprintf('%x ', solvedKey); fprintf('\n');end;solvedKeyfprintf('%x ', solvedKey);fprintf('\n');

with the result being a greatly reduced number of required traces as compared to the multi-bit attack. It is worth pointing out that calculating the correlation makes progressively less sense as the number of data points goes towards zero....

Multi-bit attack results ( calculating Hamming value and Pearson Correlation):

Correct Key:             00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16

Predicted(200 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 50 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 20 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 10 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted(  9 traces ):  80 11 22 33 B0 55 F7 C4 88 99 AA BB CC A7 EE FF  Correct: 11
Predicted(  8 traces ):  80 11 22 33 FB D2 F7 C4 88 99 24 BB CC DD A7 FF  Correct: 09
Predicted(  7 traces ):  1C A4 A3 33 FB F8 3C AA 14 AC BD CE B8 A7 CB FF  Correct: 02
Predicted(  6 traces ):  CA C9 C8 A0 B3 F8 BC A7 C2 C1 C0 C7 52 EB F6 81  Correct: 00
Predicted(  5 traces ):  49 C2 4F C0 11 CF 1F 58 41 CA 47 C8 4D 82 D7 04  Correct: 00
Predicted(  4 traces ):  18 46 BD 81 1C C8 41 3B 10 4E B5 89 14 E3 66 1D  Correct: 00
Predicted(  3 traces ):  47 09 4B 3A 43 0D 52 00 45 01 43 32 41 05 11 10  Correct: 00
Predicted(  2 traces ):  1A 20 18 08 1E 24 1C 21 12 28 10 00 16 2C 14 01  Correct: 00
Predicted(  1 traces ):  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  Correct: 01

Note that with the single trace we get one correct value by chance, so that should be taken into account when looking at the number of correct bytes.

But the execution trace is a panopticon! We can do much better when we realize that we don't have to calculate a Hamming weight at all. We can get a much better model when comparing the actual predicted value with the actual computed value. Instead of calculating the Hamming weight of the values

prediction = sum( dec2bin(sboxLookup).' == '1' );

we just use the values

prediction = sboxLookup;

and instead of calculating the Hamming weight of the traced values,

dataPoints = sum( dec2bin(traces(:,currentTimeIndex)).' == '1' );

we just use the values

dataPoints = traces(:,currentTimeIndex);

This is better because in the Hamming weight model, numbers such as 0x01 and 0x04 have the same weight and so would be a match for one another. However, with an execution trace we can perform an exact match, and only exact values correspond to one another.

Multi-bit attack results (calculating direct match (not Hamming) value and Pearson Correlation):

Correct Key:             00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16

Predicted(200 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 50 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 20 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 10 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted(  6 traces ):  
Predicted(  5 traces ):  00 11 22 33 44 55 66 77 08 19 2A 3B CC DD EE FF  Correct: 12
Predicted(  4 traces ):  00 91 22 B3 04 55 66 77 08 99 2A BB 0C DD EE FF  Correct: 10
Predicted(  3 traces ):  82 B5 94 7E 86 B1 A6 77 8A BD 9C 76 8E B9 E4 FF  Correct: 02
Predicted(  2 traces ):  08 01 0A 00 08 05 08 04 00 04 02 08 00 00 00 00  Correct: 00
Predicted(  1 trace  ):  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  Correct: 01

Note that with the single trace we get one correct value by chance, so that should be taken into account when looking at the number of correct bytes.

We can do even better, if we realize that the Pearson Correlation was only a means to deal with the need for the Hamming weights. With an execution trace, we can just sum up the errors instead.

Multi-bit attack results (calculating direct match (not Hamming) value and sum of errors):

Correct Key:             00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16

Predicted(200 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 50 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 20 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted( 10 traces ):  00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16
Predicted(  5 traces ):  00 11 22 33 44 55 66 77 08 19 2A 3B CC DD EE FF  Correct: 12
Predicted(  4 traces ):  00 11 22 33 04 55 66 77 08 19 2A 3B 0C DD EE FF  Correct: 10
Predicted(  3 traces ):  00 11 22 33 04 15 66 77 08 19 2A 3B 0C 1D EE FF  Correct: 08      
Predicted(  2 traces ):  00 11 02 33 04 15 06 77 08 19 0A 3B 0C 1D 0E 91  Correct: 04
Predicted(  1 trace  ):  E3 E2 E1 E0 E7 E6 E5 E4 EB EA E9 E8 EF EE ED EC  Correct: 00

Also, there is a large proportion of the execution traces that are identical in every run. This does not provide any useful information and can be removed to decrease the computational time:

% code to remove data that is identical in all traces
s = size(traces)
counter = 1;
for i=1:s(2)
    if (range(traces( :,  i))!= 0)
      traces2( :,  counter) = traces( :,  i);
      counter = counter + 1;
    endif
endfor
traces = traces2;

Finally, let's use the multi-bit attack on the same white box challenge that we used in the single-bit attack, the wbs_aes_kryptologik whitebox challenge available at https://github.com/SideChannelMarvels/Deadpool

Correct Key:             00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF  Correct: 16

Predicted(  200 Traces): B0 B4 C6 B0 D7 A7 A6 A5 D1 A3 E6 A1 D0 D2 D2 D0  Correct: 00
Predicted( 2000 Traces): CF CB C5 B5 D6 7F 7F 2E D1 A1 E6 29 28 3C 34 38  Correct: 00
Predicted(20000 Traces): CF C4 7D 7C 16 7F 7F 7E D1 7D C9 29 28 3C 34 38  Correct: 00
Predicted(40000 Traces): CF C4 7D 7C 16 7F 7F 7E D1 7D C9 29 28 3C 34 38  Correct: 00

Clearly, the executable is hardened against byte-wise attacks, but still vulnerable to the single-bit attack. The byte-wise attack is not always superior!

To fight that, a higher-order attack is necessary, but that is a topic for another day.......

All the files used here are available at https://github.com/janbbeck/SideChannelMatlab

References:

1] Differential Computation Analysis: Hiding your White-Box Designs is Not Enough. https://eprint.iacr.org/2015/753.pdf

[2] Attacks and Countermeasures for White-box Designs. https://eprint.iacr.org/2018/049.pdf

[3] Computational Aspects of Correlation Power Analysis. https://eprint.iacr.org/2015/260.pdf

[4] Unboxing the White-Box. https://www.blackhat.com/docs/eu-15/materials/eu-15-Sanfelix-Unboxing-The-White-Box-Practical-Attacks-Against-Obfuscated-Ciphers-wp.pdf

[5] Hiding your White-Box Designs is Not Enough. https://www.troopers.de/media/filer_public/b8/4f/b84f0051-3992-4b34-8b7d-7f0be5f209e0/troopers16_teuwen_hiding_your_wb_designs.pdf

[6] Differential Power Analysis of HMAC SHA-2in the Hamming Weight Model. https://www.cryptoexperts.com/sbelaid/articleHMAC.pdf

[7] NSA Suite B Crypto, Keys, and Side Channel Attacks. https://www.rambus.com/wp-content/uploads/2015/08/2013-JunMarson-SuiteBAndSideChannel.pdf

[8] Introduction to Side-Channel Power Analysis (SCA, DPA). https://www.youtube.com/watch?v=OlX-p4AGhWs&t=2336s

[9] T.S. Messerges. Using second-order power analysis to attack DPA resistant soft-ware. InCryptographic Hardware and Embedded Systems

[10] https://github.com/SideChannelMarvels

[11] https://github.com/kokke/tiny-AES-c

[12] https://github.com/janbbeck/SideChannelMatlab