1
00:00:04,550 --> 00:00:07,649
>> Hello, I'm Catalin Marinas,

2
00:00:07,649 --> 00:00:11,130
Fellow Engineer at Linux
Kernel Lead at ARM Limited.

3
00:00:11,130 --> 00:00:13,050
While my day job is mostly

4
00:00:13,050 --> 00:00:15,465
Linux Kernel development
and maintenance,

5
00:00:15,465 --> 00:00:18,165
I have an interest in
formal methods and

6
00:00:18,165 --> 00:00:21,630
use them occasionally
to help with my tasks.

7
00:00:21,630 --> 00:00:24,540
This may be modeling the
spin lock implementations,

8
00:00:24,540 --> 00:00:26,580
some tricky concurrent algorithm,

9
00:00:26,580 --> 00:00:29,445
or just after hardware interaction.

10
00:00:29,445 --> 00:00:33,035
This doc however, is not about Linux,

11
00:00:33,035 --> 00:00:36,335
but something both hardware
and software people have

12
00:00:36,335 --> 00:00:39,545
been concerned with for
the past three years.

13
00:00:39,545 --> 00:00:43,470
Cache speculation side-channels
from a formal angle.

14
00:00:44,150 --> 00:00:46,620
From the Wikipedia definition,

15
00:00:46,620 --> 00:00:49,880
a side-channel attack is an
attack that gets information from

16
00:00:49,880 --> 00:00:52,310
the actual hardware
implementation rather

17
00:00:52,310 --> 00:00:55,160
than relying on a weakenising
the algorithm itself.

18
00:00:55,160 --> 00:00:57,920
For example, in the cache
side-channels case,

19
00:00:57,920 --> 00:01:00,560
the attacker can probe
the contents of the cache

20
00:01:00,560 --> 00:01:03,320
by timing the access to
certain memory locations.

21
00:01:03,320 --> 00:01:05,420
With cache speculation side-channels,

22
00:01:05,420 --> 00:01:08,390
the attacker uses the
imprint left in the cache

23
00:01:08,390 --> 00:01:12,520
by instructions executed
speculatively in the victim.

24
00:01:12,520 --> 00:01:17,720
The aim is to create a significantly
simpler abstract model of

25
00:01:17,720 --> 00:01:20,510
CPU speculative execution and

26
00:01:20,510 --> 00:01:23,990
define its confidentiality
and integrity properties.

27
00:01:23,990 --> 00:01:25,460
We then want to check whether

28
00:01:25,460 --> 00:01:27,680
the security properties are upheld by

29
00:01:27,680 --> 00:01:31,925
the CPU specification for any
reasonable code sequence.

30
00:01:31,925 --> 00:01:35,120
Such model allows us to explore

31
00:01:35,120 --> 00:01:37,280
related vulnerabilities
and workarounds at

32
00:01:37,280 --> 00:01:40,055
an abstract level and
in a unified way.

33
00:01:40,055 --> 00:01:43,055
As we'll see, the model can explain

34
00:01:43,055 --> 00:01:46,810
the current Spectre and
Meltdown vulnerabilities.

35
00:01:46,810 --> 00:01:50,279
Let's start with the
Spectre variant 1 example

36
00:01:50,279 --> 00:01:53,505
taken from our speculation
side-channels whitepaper.

37
00:01:53,505 --> 00:01:55,160
This is a code sequence of

38
00:01:55,160 --> 00:01:58,190
64-bit ARM assembly running
in a high privilege mode,

39
00:01:58,190 --> 00:02:00,835
for example, an
operating system kernel.

40
00:02:00,835 --> 00:02:05,420
The input provided by a low
privilege user has heard in the X0

41
00:02:05,420 --> 00:02:10,385
register and represents an untrusted
offsetting to a kernel array.

42
00:02:10,385 --> 00:02:13,610
If such offset is off the bonds,

43
00:02:13,610 --> 00:02:16,700
the algorithm skips the
kernel array access

44
00:02:16,700 --> 00:02:19,310
with a conditional
branch instruction.

45
00:02:19,310 --> 00:02:21,860
If the decision on the branch

46
00:02:21,860 --> 00:02:24,425
inside the CPU if
delayed, for example,

47
00:02:24,425 --> 00:02:28,025
because the extra upper-bound
hasn't been loaded yet,

48
00:02:28,025 --> 00:02:31,340
the CPU may speculatively continue

49
00:02:31,340 --> 00:02:36,540
the execution of the upper
instructions in this sequence.

50
00:02:36,540 --> 00:02:39,135
With X0 or the bonds,

51
00:02:39,135 --> 00:02:41,980
the first LDRB
instruction will access

52
00:02:41,980 --> 00:02:45,190
potentially secret information
belonging to the kernel.

53
00:02:45,190 --> 00:02:50,380
Subsequent arithmetic operations
on this value and the load from

54
00:02:50,380 --> 00:02:53,140
a user address at an
offset derived from

55
00:02:53,140 --> 00:02:57,050
this secret data will leave
a trace in the cache.

56
00:02:57,050 --> 00:03:01,915
While the register state under
speculation is discarded,

57
00:03:01,915 --> 00:03:04,200
the attacker can still probably cast

58
00:03:04,200 --> 00:03:07,400
data by timing the memory access.

59
00:03:07,850 --> 00:03:09,910
Let's now dive into

60
00:03:09,910 --> 00:03:12,630
the abstract CPU model and
define the system state.

61
00:03:12,630 --> 00:03:15,070
Rather than using 64-bit values and

62
00:03:15,070 --> 00:03:18,415
the arithmetics for which the
model checker won't scale,

63
00:03:18,415 --> 00:03:21,115
we go for sets of model values,

64
00:03:21,115 --> 00:03:23,945
we define symbols for
low and high addresses,

65
00:03:23,945 --> 00:03:28,200
data values, and the set of
names for the CPU registers.

66
00:03:28,200 --> 00:03:31,730
Since we don't have actual
numbers and arithmetics,

67
00:03:31,730 --> 00:03:34,700
we need to define a set of
operations on these values.

68
00:03:34,700 --> 00:03:36,560
An operation is a function that

69
00:03:36,560 --> 00:03:39,005
takes two values and
gives back a new one.

70
00:03:39,005 --> 00:03:41,090
Optables here is the set of

71
00:03:41,090 --> 00:03:45,480
all such functions over
the values we defined.

72
00:03:45,680 --> 00:03:49,880
The system state consists
of a privilege mode,

73
00:03:49,880 --> 00:03:53,300
registers mapping names
to values we defined,

74
00:03:53,300 --> 00:03:56,165
memory mapping addresses to values,

75
00:03:56,165 --> 00:03:59,200
and the cached stateable addresses.

76
00:03:59,200 --> 00:04:01,490
We don't need to model the caches and

77
00:04:01,490 --> 00:04:05,375
associative array for
our security properties,

78
00:04:05,375 --> 00:04:07,580
we are only interested in whether

79
00:04:07,580 --> 00:04:11,080
a memory location is cached or not.

80
00:04:11,080 --> 00:04:14,330
Not that we don't include
the program counter,

81
00:04:14,330 --> 00:04:19,200
although it can be part of
the regs set of model values.

82
00:04:20,380 --> 00:04:23,180
With the system variables defined,

83
00:04:23,180 --> 00:04:26,000
we need a set of actions to
transition between states.

84
00:04:26,000 --> 00:04:27,680
We will use tuples of

85
00:04:27,680 --> 00:04:29,570
instruction mnemonics and arguments

86
00:04:29,570 --> 00:04:31,700
for modeling the program execution.

87
00:04:31,700 --> 00:04:34,580
We have a Havoc subset
of instructions that

88
00:04:34,580 --> 00:04:37,630
set the register state
to a specific value.

89
00:04:37,630 --> 00:04:42,085
The move instructions copy a
value from a register to another.

90
00:04:42,085 --> 00:04:45,860
The memory accesses
are only permitted if

91
00:04:45,860 --> 00:04:47,720
the current privilege mode allows

92
00:04:47,720 --> 00:04:50,525
loads and stores to
specific memory locations.

93
00:04:50,525 --> 00:04:53,690
For example, the AccessOK
operator will not

94
00:04:53,690 --> 00:04:58,180
allow a low privilege mode to
read or write high addresses.

95
00:04:58,180 --> 00:05:04,385
The OP subset contains all possible
instructions that take values

96
00:05:04,385 --> 00:05:07,100
from two source registers
and combine them

97
00:05:07,100 --> 00:05:10,525
into a new value stored in
a destination register.

98
00:05:10,525 --> 00:05:12,650
To switch the system state between

99
00:05:12,650 --> 00:05:15,500
low and high privilege
modes, for example,

100
00:05:15,500 --> 00:05:18,010
our system call and the
corresponding return,

101
00:05:18,010 --> 00:05:24,610
we introduce a set of two
instructions, HCALL and LRET.

102
00:05:24,610 --> 00:05:30,755
The set of all valid instructions
sees the union of these subsets.

103
00:05:30,755 --> 00:05:34,100
The execute instruction
dispatcher is responsible for

104
00:05:34,100 --> 00:05:36,620
interpreting the commands and

105
00:05:36,620 --> 00:05:39,570
changing the system's
date accordingly.

106
00:05:40,420 --> 00:05:43,370
The program execution is modeled as

107
00:05:43,370 --> 00:05:46,410
a succession of steps starting
from an initial state.

108
00:05:46,410 --> 00:05:49,700
To reduce the number of initial
states for the model checker,

109
00:05:49,700 --> 00:05:54,635
we chose registers initialized
to zero and cache is empty.

110
00:05:54,635 --> 00:05:57,590
The next step chooses an instruction

111
00:05:57,590 --> 00:05:59,990
non-deterministically
from the valid set

112
00:05:59,990 --> 00:06:01,430
and invokes the dispatcher,

113
00:06:01,430 --> 00:06:04,985
which is responsible for
changing the system state.

114
00:06:04,985 --> 00:06:08,765
Since we pick instructions
non-deterministically,

115
00:06:08,765 --> 00:06:10,895
there isn't a need to

116
00:06:10,895 --> 00:06:14,830
model a program counter
for the instruction flow.

117
00:06:14,830 --> 00:06:19,370
As you can see in the example
at the bottom of the slide,

118
00:06:19,370 --> 00:06:23,180
the register's state starts with

119
00:06:23,180 --> 00:06:25,860
an initial zero values and

120
00:06:25,860 --> 00:06:28,969
transitions according to the
instructions being executed.

121
00:06:28,969 --> 00:06:30,920
In the final step,

122
00:06:30,920 --> 00:06:36,170
we have an operation that takes
the values d1 and they tron into

123
00:06:36,170 --> 00:06:38,570
registers and computes a new value

124
00:06:38,570 --> 00:06:43,140
h2 that is stored in the
destination register.

125
00:06:43,430 --> 00:06:46,940
So far the behavior
of our system only

126
00:06:46,940 --> 00:06:51,020
modeled the algorithmic state that
is visible to the programmer.

127
00:06:51,020 --> 00:06:53,000
Under speculation, the CPU may

128
00:06:53,000 --> 00:06:55,250
execute additional
instructions starting from

129
00:06:55,250 --> 00:06:58,160
a programmer visible state either

130
00:06:58,160 --> 00:07:00,200
discarding the resulting
registered values

131
00:07:00,200 --> 00:07:01,975
or committing the status value,

132
00:07:01,975 --> 00:07:03,860
depending on how, for example,

133
00:07:03,860 --> 00:07:07,490
a previously delayed
condition check was resolved.

134
00:07:07,490 --> 00:07:09,590
In our model, we only consider

135
00:07:09,590 --> 00:07:12,970
the discarding of the
speculative register state.

136
00:07:12,970 --> 00:07:15,934
Since the programmer's
view of the registers

137
00:07:15,934 --> 00:07:18,440
remains unchanged during speculation,

138
00:07:18,440 --> 00:07:20,840
we model it as another instance of

139
00:07:20,840 --> 00:07:23,255
the previously defined SimpleCPU

140
00:07:23,255 --> 00:07:25,960
but with a separate register state.

141
00:07:25,960 --> 00:07:29,180
The new slightly more
complex system consists of

142
00:07:29,180 --> 00:07:33,440
two simple CPUs that share
the memory and the cache,

143
00:07:33,440 --> 00:07:36,460
but with separate register backs.

144
00:07:36,460 --> 00:07:40,385
Know that not all instructions
are available on the speculation,

145
00:07:40,385 --> 00:07:43,610
especially those that
can modify memory,

146
00:07:43,610 --> 00:07:45,985
a such state is programmer visible.

147
00:07:45,985 --> 00:07:48,450
From now on I will refer to

148
00:07:48,450 --> 00:07:52,295
executed instructions as those
steps visible to the programmer

149
00:07:52,295 --> 00:07:56,165
affecting the algorithmic state
and speculated instructions

150
00:07:56,165 --> 00:07:59,000
as those where the
speculative register state

151
00:07:59,000 --> 00:08:01,175
will eventually be discarded.

152
00:08:01,175 --> 00:08:06,335
You can see the example
figure where states 0 and

153
00:08:06,335 --> 00:08:09,800
1 of the executing CPU registers

154
00:08:09,800 --> 00:08:13,325
are kept in sync with
a speculating one.

155
00:08:13,325 --> 00:08:15,670
If the executed instructions are

156
00:08:15,670 --> 00:08:19,625
delayed but the
speculation continuous,

157
00:08:19,625 --> 00:08:21,500
the instructions only affect

158
00:08:21,500 --> 00:08:25,940
the speculative register path
and are eventually discarded.

159
00:08:25,940 --> 00:08:27,990
Actually plus specification of

160
00:08:27,990 --> 00:08:32,345
our new system behavior with
two instances of a simple CPU.

161
00:08:32,345 --> 00:08:36,910
The SpecCPU instance has
its own regs variable.

162
00:08:36,910 --> 00:08:39,740
We also need two
instruction dispatchers,

163
00:08:39,740 --> 00:08:41,555
one for the executing CPU,

164
00:08:41,555 --> 00:08:44,930
which keeps the speculative
registers in sync,

165
00:08:44,930 --> 00:08:47,585
and another for the speculating CPU,

166
00:08:47,585 --> 00:08:51,745
which leaves the programmer
visible state unchanged.

167
00:08:51,745 --> 00:08:53,855
As I mentioned earlier,

168
00:08:53,855 --> 00:08:58,820
only a subset of instructions
are available under speculation.

169
00:08:58,820 --> 00:09:03,935
We limited it here to memory
loads and operations on values,

170
00:09:03,935 --> 00:09:10,100
but other instructions like the
Havoc set could be included.

171
00:09:10,100 --> 00:09:13,190
The next step is defined as

172
00:09:13,190 --> 00:09:15,200
a non-deterministic choice of either

173
00:09:15,200 --> 00:09:18,650
executed or speculated instructions.

174
00:09:18,650 --> 00:09:22,099
Since this doc is about
security properties,

175
00:09:22,099 --> 00:09:24,230
before stating them,
we need to define

176
00:09:24,230 --> 00:09:27,175
the roles and projections
of the system state.

177
00:09:27,175 --> 00:09:30,230
We consider the victim to
be a high-privileged mode,

178
00:09:30,230 --> 00:09:32,530
for example, an
operating system kernel.

179
00:09:32,530 --> 00:09:36,785
It has its own state consisting
of registers and high memory.

180
00:09:36,785 --> 00:09:41,045
Such high memory may or may not
contain secret information.

181
00:09:41,045 --> 00:09:44,750
The high mode can accept
input and provide output,

182
00:09:44,750 --> 00:09:47,510
for example, system poll arguments

183
00:09:47,510 --> 00:09:49,790
and results returned to the caller.

184
00:09:49,790 --> 00:09:53,615
In our model, these consist
of all the low memory

185
00:09:53,615 --> 00:09:57,955
and the registers or when entering
or exiting the high mode.

186
00:09:57,955 --> 00:10:01,140
The attacker is a low-privilege
mode, for example,

187
00:10:01,140 --> 00:10:03,725
a user application with a low state

188
00:10:03,725 --> 00:10:07,280
consisting of registers
and low memory.

189
00:10:07,280 --> 00:10:11,585
Without timing side-channel,
the attacker can only observe

190
00:10:11,585 --> 00:10:15,870
the content of the low
memory and some registers.

191
00:10:15,870 --> 00:10:18,100
With cache side-channels, however,

192
00:10:18,100 --> 00:10:22,610
the attacker can probe whether the
memory location is cached using

193
00:10:22,610 --> 00:10:24,980
the difference in the execution time

194
00:10:24,980 --> 00:10:27,845
or the corresponding memory access.

195
00:10:27,845 --> 00:10:30,920
We have not explicitly included time

196
00:10:30,920 --> 00:10:34,205
as a state in this model, however,

197
00:10:34,205 --> 00:10:38,989
we define the attacker observable
state as a function mapping

198
00:10:38,989 --> 00:10:44,765
low addresses to both the values
in memory and the cached state.

199
00:10:44,765 --> 00:10:48,875
Because of the AccessOK
operator defined earlier,

200
00:10:48,875 --> 00:10:51,350
the model does not
permit the attacker to

201
00:10:51,350 --> 00:10:55,770
directly observe any
high memory locations.

202
00:10:56,070 --> 00:11:00,640
Confidentiality is a non-interference
property stating that

203
00:11:00,640 --> 00:11:03,025
the attacker cannot observe anything

204
00:11:03,025 --> 00:11:05,860
other than what the victim
allows in its output.

205
00:11:05,860 --> 00:11:08,680
In other words, the
attacker's observation is

206
00:11:08,680 --> 00:11:10,165
a deterministic function of

207
00:11:10,165 --> 00:11:14,110
only the initial state victim's
output and its own actions.

208
00:11:14,110 --> 00:11:17,529
We modeled this as a property over

209
00:11:17,529 --> 00:11:20,470
two LowState-identical behaviors that

210
00:11:20,470 --> 00:11:23,920
must have the same low
observation functions.

211
00:11:23,920 --> 00:11:26,500
We assume that the
individual commands

212
00:11:26,500 --> 00:11:28,660
or instructions are deterministic.

213
00:11:28,660 --> 00:11:31,540
This means that for a given
state and instruction,

214
00:11:31,540 --> 00:11:35,330
always leads to the same
next state in all behaviors.

215
00:11:35,330 --> 00:11:38,115
We'll define integrity since it

216
00:11:38,115 --> 00:11:40,110
compliments the
confidentiality property,

217
00:11:40,110 --> 00:11:43,895
although this is not included
in the TLAplus model.

218
00:11:43,895 --> 00:11:47,230
This property holds when the
victim's execution is not

219
00:11:47,230 --> 00:11:50,995
affected by the attacker
beyond the input provided.

220
00:11:50,995 --> 00:11:55,390
This can be modeled similarly
with a hybrid property over

221
00:11:55,390 --> 00:11:59,815
two behaviors that start from
the same initial high state,

222
00:11:59,815 --> 00:12:01,450
have the same input,

223
00:12:01,450 --> 00:12:05,930
and must always lead to the
same high state and output.

224
00:12:06,270 --> 00:12:09,930
TLAplus does not directly
support hybrid properties.

225
00:12:09,930 --> 00:12:12,120
However, we can create
a new specification

226
00:12:12,120 --> 00:12:14,910
encompassing two
behaviors: the start from

227
00:12:14,910 --> 00:12:17,700
the same LowState and
low observation and

228
00:12:17,700 --> 00:12:21,320
perform a similar execution
or speculation next step.

229
00:12:21,320 --> 00:12:26,065
I highlighted the requirement
that following an executing step,

230
00:12:26,065 --> 00:12:29,185
the registers are the
same in both behaviors.

231
00:12:29,185 --> 00:12:31,120
What we actually want here is that

232
00:12:31,120 --> 00:12:33,190
the low observations are the same for

233
00:12:33,190 --> 00:12:36,010
an executing step under
the assumption that

234
00:12:36,010 --> 00:12:37,810
the algorithm itself does not

235
00:12:37,810 --> 00:12:40,180
leak information
through side channels.

236
00:12:40,180 --> 00:12:44,530
Using regs equality may be
a bit more restrictive,

237
00:12:44,530 --> 00:12:49,735
but I don't think TLAplus
supports LowObs prime notation.

238
00:12:49,735 --> 00:12:52,240
For the confidentiality property,

239
00:12:52,240 --> 00:12:54,640
the new hyperspec must imply

240
00:12:54,640 --> 00:12:58,070
that the observation
functions are identical.

241
00:12:58,200 --> 00:13:02,380
Now that we defined our
specification and its properties,

242
00:13:02,380 --> 00:13:04,240
we can run it through TLC,

243
00:13:04,240 --> 00:13:06,400
the TLAplus model checker probably

244
00:13:06,400 --> 00:13:09,655
expected the confidentiality
property does not hold.

245
00:13:09,655 --> 00:13:13,630
This figure represents
the reported trace

246
00:13:13,630 --> 00:13:16,750
of the two behaviors
of our specification.

247
00:13:16,750 --> 00:13:19,690
For space reasons, the trace has been

248
00:13:19,690 --> 00:13:23,515
simplified and the low
mode execution removed.

249
00:13:23,515 --> 00:13:26,770
As you can see, we have two
behaviors starting from

250
00:13:26,770 --> 00:13:30,535
the same LowState
and low observation,

251
00:13:30,535 --> 00:13:31,780
but with

252
00:13:31,780 --> 00:13:35,934
potentially different high-stake
holding secret information.

253
00:13:35,934 --> 00:13:39,070
The r1 register contains
an untrusted value,

254
00:13:39,070 --> 00:13:42,025
d1, provided by the attacker.

255
00:13:42,025 --> 00:13:47,440
The victim sets the r3
register to a higher address,

256
00:13:47,440 --> 00:13:52,600
h1 that is not expected
to contain any secret.

257
00:13:52,600 --> 00:13:56,020
Subsequently, the algorithm
decides that the d1

258
00:13:56,020 --> 00:13:59,800
is out of bound and changes the
executed instructions flow.

259
00:13:59,800 --> 00:14:02,050
However, the CPU continues to

260
00:14:02,050 --> 00:14:04,840
use this untrusted value
under speculation.

261
00:14:04,840 --> 00:14:09,235
An operation combining
the safe address,

262
00:14:09,235 --> 00:14:11,935
h1, with the untrusted value, d1,

263
00:14:11,935 --> 00:14:13,930
yields a new address, h2,

264
00:14:13,930 --> 00:14:16,360
which contains secret information

265
00:14:16,360 --> 00:14:19,330
and therefore is different
in the two behaviors.

266
00:14:19,330 --> 00:14:21,490
The load from h2 sets

267
00:14:21,490 --> 00:14:24,415
the speculative register r1

268
00:14:24,415 --> 00:14:29,770
to two different values
in the two behaviors.

269
00:14:29,770 --> 00:14:31,570
These are used to compute

270
00:14:31,570 --> 00:14:35,410
two different low addresses
and the speculative loads from

271
00:14:35,410 --> 00:14:38,350
those addresses leave
two different imprints

272
00:14:38,350 --> 00:14:40,570
in the low address cache,

273
00:14:40,570 --> 00:14:43,420
the confidentiality
property is not violated

274
00:14:43,420 --> 00:14:45,370
since the low observation function is

275
00:14:45,370 --> 00:14:48,260
different in the two behaviors.

276
00:14:48,450 --> 00:14:53,110
This trace resembles
at an abstract level,

277
00:14:53,110 --> 00:14:56,000
a Spectre Variant 1 exploit.

278
00:14:57,120 --> 00:15:00,640
What do the speculative
execution vulnerabilities

279
00:15:00,640 --> 00:15:03,250
map onto our abstract model?

280
00:15:03,250 --> 00:15:08,380
We just seen a Spectre Variant
1 example on the previous slide

281
00:15:08,380 --> 00:15:13,375
with a confidentiality property
violated by the speculating CPU.

282
00:15:13,375 --> 00:15:16,810
The softer workarounds
usually try to make

283
00:15:16,810 --> 00:15:20,440
speculative instructions flow
not depend on untrusted data.

284
00:15:20,440 --> 00:15:23,335
For example, behind the
conditional branch,

285
00:15:23,335 --> 00:15:30,435
the register holding such data
would be zero if deemed unsafe.

286
00:15:30,435 --> 00:15:33,270
Spectre Variant 2 is

287
00:15:33,270 --> 00:15:36,780
an integrity violation
in the speculating CPU.

288
00:15:36,780 --> 00:15:40,170
If we consider a
program counter as part

289
00:15:40,170 --> 00:15:43,750
of the register set on
certain CPU implementation,

290
00:15:43,750 --> 00:15:46,300
its values can be driven

291
00:15:46,300 --> 00:15:49,840
from the attacker by training
the branch predictor.

292
00:15:49,840 --> 00:15:53,140
Know that this is only
for the speculating CPU.

293
00:15:53,140 --> 00:15:57,175
This would make it much easier
to employ a Spectre v1 attack.

294
00:15:57,175 --> 00:15:59,170
The workarounds limit the influence

295
00:15:59,170 --> 00:16:02,530
one mode has over the branch
prediction in another mode,

296
00:16:02,530 --> 00:16:05,770
either by explicit
flushing or by tagging

297
00:16:05,770 --> 00:16:09,219
the branch predictor entries with
additional context information.

298
00:16:09,219 --> 00:16:14,470
Meltdown or Variant 3 doesn't
even need the victim to run,

299
00:16:14,470 --> 00:16:17,185
a speculative load in the low mode

300
00:16:17,185 --> 00:16:20,830
is not checked against the
high access permissions.

301
00:16:20,830 --> 00:16:24,175
While the value loaded
cannot be used directly,

302
00:16:24,175 --> 00:16:28,240
it can be made through a set
of operations to leave a trace

303
00:16:28,240 --> 00:16:32,515
in the cache state that's probed
with executed instructions.

304
00:16:32,515 --> 00:16:36,040
The workaround on the
affected CPUs is to enforce

305
00:16:36,040 --> 00:16:38,095
such permission check by other means

306
00:16:38,095 --> 00:16:41,035
like kernel page-table isolation.

307
00:16:41,035 --> 00:16:46,315
Spectre Variant 4 is in some
ways similar to Variant 1,

308
00:16:46,315 --> 00:16:49,615
only that the secret
high memory location

309
00:16:49,615 --> 00:16:51,670
is one written by the victim

310
00:16:51,670 --> 00:16:56,950
with an non-private value and
considered algorithmically safe.

311
00:16:56,950 --> 00:16:59,980
Under speculation, a
load instruction from

312
00:16:59,980 --> 00:17:03,220
this address is allowed to
read data prior to the store,

313
00:17:03,220 --> 00:17:06,895
therefore revealing potentially
secret information.

314
00:17:06,895 --> 00:17:11,230
The model can be adopted
for this vulnerability by

315
00:17:11,230 --> 00:17:15,685
using separate memory state in
the speculating CPU instance.

316
00:17:15,685 --> 00:17:18,550
The workaround is
either a hardware knob

317
00:17:18,550 --> 00:17:21,940
disabling this behavior with
some performance penalty,

318
00:17:21,940 --> 00:17:26,270
or explicit software use
of specific barriers.

319
00:17:26,640 --> 00:17:28,750
Only the disclosure of

320
00:17:28,750 --> 00:17:31,735
the speculations by
channel vulnerabilities,

321
00:17:31,735 --> 00:17:34,000
the ARM architecture introduce

322
00:17:34,000 --> 00:17:37,195
new barriers to help with
the software workarounds.

323
00:17:37,195 --> 00:17:40,750
In this model, we haven't

324
00:17:40,750 --> 00:17:45,580
included data value prediction
in the speculating CPU,

325
00:17:45,580 --> 00:17:47,830
but that's equivalent to allowing

326
00:17:47,830 --> 00:17:50,575
the Havoc instruction
set on the speculation.

327
00:17:50,575 --> 00:17:54,190
CSDB prevents those instructions.

328
00:17:54,190 --> 00:17:57,520
SSBB does not allow

329
00:17:57,520 --> 00:18:01,765
a speculative load from
reading values prior to store.

330
00:18:01,765 --> 00:18:08,065
We just mentioned it in the Spectre
example on the previous slide.

331
00:18:08,065 --> 00:18:11,560
SB disallows the set of

332
00:18:11,560 --> 00:18:13,690
instructions on the speculation

333
00:18:13,690 --> 00:18:17,030
that may have an
observable side effect.

334
00:18:17,550 --> 00:18:19,780
There are several papers on

335
00:18:19,780 --> 00:18:22,555
noninterference and
observational determinism,

336
00:18:22,555 --> 00:18:25,315
but not enough space
to be written here.

337
00:18:25,315 --> 00:18:29,919
I'll mention the ARM whitepaper
on speculative execution.

338
00:18:29,919 --> 00:18:31,945
This is not the both for more models,

339
00:18:31,945 --> 00:18:35,710
but rather a description of
the Spectre and Meltdown

340
00:18:35,710 --> 00:18:40,225
variants as they apply
to the ARM processors.

341
00:18:40,225 --> 00:18:43,060
Of course, this model
wouldn't have been possible

342
00:18:43,060 --> 00:18:47,060
without the Specifying Systems book.

343
00:18:47,280 --> 00:18:53,320
Source of inspiration on
modeling the security properties

344
00:18:53,320 --> 00:19:00,670
in this presentation was Subramanyan's
paper on secure enclaves.

345
00:19:00,670 --> 00:19:04,330
This paper does not cover
speculative execution,

346
00:19:04,330 --> 00:19:07,795
but it does cover slide channels.

347
00:19:07,795 --> 00:19:11,259
Guarnieri's SPECTECTOR paper proposes

348
00:19:11,259 --> 00:19:14,695
a slightly different approach
to modeling speculation,

349
00:19:14,695 --> 00:19:17,710
but is focused mostly on analyzing

350
00:19:17,710 --> 00:19:21,460
specific code gadgets
for vulnerabilities.

351
00:19:21,460 --> 00:19:25,195
If you are interested in the
TLAplus model presented here,

352
00:19:25,195 --> 00:19:27,710
please follow the link.

353
00:19:28,140 --> 00:19:31,450
Thank you for watching
and I hope to see you at

354
00:19:31,450 --> 00:19:35,480
the TLAplus community
event in October, 2020.