Some Thoughts on Serial Numbers in Intel CPUs
---------------------------------------------

Ronald L. Rivest
MIT Laboratory for Computer Science
1/26/1999 (with slight revisions 8/23/99)

Today's New York Times contains an article, ``Intel Alters Plan Said to Undermine PC Users' Privacy'' (NYT, 1/26/1999, page 1) [1]. The article explains that EPIC and other groups are calling for a boycott of the new Intel CPU because each CPU will contain a unique serial number that can be read by any program, unless this feature is turned off by the user. The concern is that this feature might contribute to the loss of privacy by users, even as it contributes to electronic commerce and guards against software piracy.

I must admit that I was a little surprised by this reaction to the Intel announcement, which was made at the annual RSA Data Security Conference in San Jose last week. It hadn't occurred to me that someone might see such a feature as a threat to privacy.

It is worth noting that many computers on the Internet already have unique identifying numbers: the IP addresses used to route information to them. Each computer on the Internet is uniquely identified by its IP address. (Some computers have more than one IP address.) Furthermore, it is not hard for a typical application program to determine the IP address of the computer it is running on. Thus, a CPU serial number would not in these cases add anything new, since the computer is already uniquely identified by its IP address. However, many users who have dial-up connections to the Internet have IP addresses dynamically assigned by their Internet Service Provider (ISP), so in these cases the IP address only identifies the user's computer temporarily. Nonetheless, the presence of a new identifying number is not something dramatically different from what already exists for many users.

There are other ways in which a computer can be uniquely identified by software running on that computer. For example, there is normally a unique number on each board connecting a computer to the Ethernet. This could also be used as a unique identifying number for the computer.

The Intel proposal would give every CPU a unique identifying serial number that could be easily read by a program in a standard way. While Intel asserts that this feature could be turned off by the user, they don't say how this would be implemented. For example, if the feature is under program control, then a program could turn on the feature, read the number, and then turn it off. On the other hand, if the feature is under manual control (e.g. a new switch on the keyboard), then how is the user to know that only the program he wishes will be able to read the serial number? A modern computer can be running many processes at once, and a corrupted process running alongside the normal ones could sample the serial number and save it away. Without further details from Intel, it is hard to see how they can make this feature controllable in a secure way. Probably they have some thoughts on how to do this.

But a real concern is that the user will be forced to leave the serial number feature turned on in order to be able to execute programs that he has purchased or downloaded off the Internet. If it becomes standard for a program to refuse to run unless the feature is turned on, then the user will eventually give up and leave the serial number feature always enabled. I think this is a likely evolution of the state of affairs.
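To make concrete the observation above that ordinary software can already read identifying numbers such as the host's IP address or the Ethernet board's unique number, here is a minimal sketch, assuming a Python environment and using only its standard library (the particular calls are one illustration, not the only way to obtain these values):

    # Illustrative sketch: two identifiers an ordinary program can
    # already read today, using only the Python standard library.

    import socket
    import uuid

    # An IP address associated with this host's name (on some
    # configurations this may be a loopback address).
    ip_address = socket.gethostbyname(socket.gethostname())

    # The 48-bit hardware (Ethernet) address of one of the machine's
    # network interfaces, as reported by uuid.getnode(); if no hardware
    # address can be found, the library substitutes a random number.
    ethernet_number = uuid.getnode()

    print("IP address:       ", ip_address)
    print("Ethernet address: ", hex(ethernet_number))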
How damaging to a user's privacy is the serial number feature? Well, one risk is that Internet applets could leak this number in their communications back to their home server. This is not in and of itself a privacy problem. The risk is that servers could get together and correlate (link) their information on users, using the CPU serial number as a common identifying tag. Website A would know that some user with CPU serial number 4136795 was browsing sites about some nasty disease, and Website B would find out that a user named Mary Smith with credit-card number 41556792346601 was connecting from a computer with CPU serial number 4136795. Putting two and two together, they discover that Mary Smith is interested in some nasty disease.

While this is a possible concern, I still find it a bit surprising that this sort of issue is raised in a country where credit cards are so prevalent, and where everyone's buying habits are minutely detailed and correlated by the credit card companies. I guess the concern may be one of control; people are happy to give up their privacy when using their credit cards, because they know that they could in principle not use the cards, whereas CPU serial numbers are bothersome because users may not have such discretionary control over their use. (This seems a bit weak as an argument, since there is no easy way to make purchases over the Internet except by using a credit card.) I don't really see the difference between the option (?) not to use a credit card number and the option (?) to turn off the CPU serial number feature. And credit cards are perhaps a more insidious problem because they are already linked to your name, whereas the Intel CPUs would be sold without any record-keeping that would let anyone know who has the CPU with which serial number.

Nonetheless, the privacy issue, once raised, prompts the question of whether the benefits gained are worth the privacy risks, whatever you assess those to be, and whether there might not be better ways to achieve those benefits without incurring the risks. At the end of this paper, I sketch a proposal for replacing serial numbers with a functionality that may accomplish these goals.

First, we must ask: what are the benefits of serial numbers on a CPU? To my mind, the benefits of a serial number scheme are that it might help fight the battle against software piracy, and that it might assist more generally in protecting intellectual property rights. Distributors of software and music might be able to (albeit weakly) guarantee that the software and music they distribute would be runnable or playable only on designated CPUs. A software program (e.g. Microsoft Office) would check that it was running on an authorized CPU, by checking the serial number before (and even during) execution. If not, it would halt execution. Similarly, a music player could check that the music that was downloaded was specifically intended to be played on that CPU. If not, the music wouldn't play.

Such schemes have been around for a long time. Some manufacturers provide "dongles" to attach to your PC that provide the PC with a unique serial number, where one was previously lacking, allowing software that checks for the dongle number to run only when the dongle is present. (The dongle has an advantage over the CPU serial number in that it can be moved to a new machine when the user upgrades, whereas the same is not true of the CPU serial number.)
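A sketch of what such a serial-number (or dongle-number) check might look like follows, assuming Python; the function read_cpu_serial() is a hypothetical stand-in for whatever machine instruction or driver call would actually return the identifying number:

    # Sketch of a naive per-CPU licensing check. read_cpu_serial() is
    # hypothetical; it stands in for the real serial-number instruction.

    AUTHORIZED_SERIAL = 4136795   # serial number embedded by the vendor

    def read_cpu_serial():
        # Placeholder: a real implementation would execute the CPU's
        # serial-number instruction. Here we simply return a fixed value.
        return 4136795

    def run_protected_program():
        if read_cpu_serial() != AUTHORIZED_SERIAL:
            raise SystemExit("This copy is not licensed for this CPU.")
        print("Serial number matches; running the program.")

    run_protected_program()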
It is well recognized that such simple schemes are often not hard to defeat, by spoofing the dongle-checking routine into believing that it has queried the dongle when it has not, or by modifying the software so that it no longer checks for the dongle. Similarly, it would be possible in principle to modify software that checks the CPU serial number so that it no longer checks for this number. (It would, however, presumably be hard to spoof the checking routine, since the CPU serial number is directly available by executing a certain machine instruction.)

Extensions to the basic idea involve incorporating "essential functionality" into the dongle, rather than having it contain just a serial number. For example, the dongle could contain a key subroutine for the program. (But then this dongle is only usable for that one program.) In another variant, the dongle contains a secret key that can be used to decrypt portions of the code so that they can be executed by the PC. The dongle might even contain a CPU itself, so that an encrypted subroutine could be loaded into the dongle and executed there. Steve Kent's Master's thesis [2] gives a discussion of some of these variants.

I note that there is an issue of "key management" or "serial number management" involved in these schemes. That is, the user (the purchaser of the software) must somehow let the manufacturer (or distributor) know the serial number or secret key of the CPU or dongle, so that the manufacturer can prepare a version of the software that runs only on that CPU (or in the presence of that dongle). This is explicitly an "identification" procedure. The user needs to identify himself (or at least identify his CPU) so that the manufacturer can prepare the software. Thus, the user is clearly giving up his privacy in such a scenario.

Is there some way in which you could get the benefits of protection against software piracy without having such an explicit identification scenario as a necessary part of the process? I think it is fair to say that a manufacturer is only likely to be concerned about piracy when he is being paid for the software (or music or whatever) that he wishes to distribute. Who cares about piracy of free software? But this implies that schemes for software protection are always going to violate the user's privacy (or at least reveal his identity), unless an anonymous payment scheme is used to pay for the software. By paying for the software in the first place, the user has already given away who he is. While schemes for anonymous payment are certainly possible in principle, they have not caught on in practice. Perhaps it is best to assume that this is likely to remain true, at least for a while.

On the other hand, even if one were to grant that one must reveal one's identity in order to purchase intellectual property like software (and this is not really a given, since some corporations purchase software en masse with a site license, for example), it would still be potentially bothersome to have a mechanism that is designed to prevent software piracy (for paid transactions) turn out to be usable to further compromise a user's privacy in other situations (e.g. for free transactions). The CPU serial number risks being bothersome in exactly this way, since the CPU can't really tell whether it is being queried in order to facilitate electronic commerce (by preventing piracy) or to facilitate snooping on individuals (by giving away an identifying tag on their free transactions).
Here is a simple proposal for a variant scheme that might satisfy the desiderata for the current situation: it facilitates electronic commerce without providing unique identifiers. I'm sure that my crypto colleagues can invent many further elaborations of this simple idea; further improvements are certainly possible.

First of all, we eliminate the serial number from the CPU. There is no serial number, and so it can't be queried, or used as an identifier for the user of the CPU.

Second, we give each CPU a unique secret key Ki. These secret keys might be 128-bit AES (Advanced Encryption Standard) keys, for example. No two chips have the same key Ki. The keys might be randomly generated by Intel as it manufactures the CPUs. We trust Intel not to keep copies of these keys. (This is a soft spot in the design, which can presumably be addressed by having the chip itself generate Ki and store it in nonvolatile memory without revealing it, or by some variation on the rest of the scheme.) There is no way for a user of the CPU to determine Ki; it can't be "read out" like a serial number.

Third, we give the CPU two new instructions: a "challenge" instruction and a "decrypt and compare" instruction. The "challenge" instruction causes the CPU to do a randomized encryption of a supplied challenge and return the resulting ciphertext. The "decrypt and compare" instruction causes the chip to determine whether two such ciphertexts could have been produced on the current CPU from the same challenge. Details in a moment.

Note that the Intel proposal also proposes that the Intel Pentium III architecture will allow the chip to generate random numbers from thermal noise. Presumably there is a new instruction that causes the chip to return a register (or several registers) full of random bits. Generating random numbers is an essential requirement for the proposal here, so it is convenient that Intel has proposed this capability.

The "challenge" instruction works as follows: the chip takes in a (say) 64-bit challenge c. It then generates a (say) 64-bit random number r, using the random number generation circuitry already announced. It then returns as the result of the challenge instruction the ciphertext

    C(c,r) = AES(Ki, c || r)

That is, it returns the encryption, using the AES algorithm under control of the key Ki, of the plaintext consisting of the concatenation of the challenge c with the random value r. (The first 64-bit half of the plaintext is c; the second 64-bit half is r.) The resulting 128-bit ciphertext C(c,r) is returned by the chip in an appropriate register or set of registers. The AES algorithm (not yet chosen) takes in 128-bit plaintext values and returns 128-bit ciphertext values, under control of a 128 (or 192 or 256)-bit key.

The "decrypt and compare" instruction takes in two values C1 and C2 and decrypts them using the chip's secret key Ki, to obtain (c1,r1) and (c2,r2), where C1 = C(c1,r1) and C2 = C(c2,r2). That is, C1 was produced (or could have been produced) by the challenge instruction on input challenge c1, and C2 was produced (or could have been produced) by the challenge instruction on input challenge c2. The chip returns "true" if c1 = c2, and "false" otherwise.

Note again that the challenge instruction is randomized---it returns (with very high probability) a different result every time it is invoked, even if it is invoked with the same challenge. Thus, it is not usable as a way of producing a unique "serial number" for the chip.
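To make the two instructions concrete, here is a toy software simulation, assuming Python 3 with the pycryptodome package for the block cipher; the key size, block layout, and function names are illustrative stand-ins for what would really be fixed in hardware:

    # Toy simulation of the proposed "challenge" and "decrypt and compare"
    # instructions. Assumes Python 3 and pycryptodome (pip install
    # pycryptodome); all names and sizes here are illustrative.

    import os
    from Crypto.Cipher import AES

    K_i = os.urandom(16)              # the chip's unique 128-bit secret key

    def challenge(c):
        """Return C(c,r) = AES(Ki, c || r) for a fresh 64-bit random r."""
        assert len(c) == 8            # 64-bit challenge
        r = os.urandom(8)             # 64-bit random value from the chip
        return AES.new(K_i, AES.MODE_ECB).encrypt(c + r)

    def decrypt_and_compare(C1, C2):
        """Return True iff C1 and C2 encrypt the same challenge under Ki."""
        p1 = AES.new(K_i, AES.MODE_ECB).decrypt(C1)
        p2 = AES.new(K_i, AES.MODE_ECB).decrypt(C2)
        return p1[:8] == p2[:8]       # compare only the challenge halves

    # The same challenge yields a different ciphertext on each invocation,
    # yet the chip can still recognize that the two ciphertexts match:
    c = b"\x00" * 8
    C1, C2 = challenge(c), challenge(c)
    print(C1 != C2)                                      # True (w.h.p.)
    print(decrypt_and_compare(C1, C2))                   # True
    print(decrypt_and_compare(C1, challenge(b"\x01" * 8)))  # False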
For example, the result of running the challenge instruction on input challenge "0" is always changing, so it can't be used to identify the chip.

I also note that although the scheme proposed here involves an encryption operation, it is not possible to use the chip to "get at" the underlying AES encryption and thus perform encryption efficiently. This is important if one must live (as Intel currently must) with the current set of (defective, in my mind) export control laws on encryption. Chips with this scheme on them could presumably be exported without difficulty.

Now: how does a manufacturer use these instructions to provide software that can only be run on a particular CPU? The "serial number management" or "key management" process that we had before for dongles now becomes the following three-step process. First, the user runs the challenge instruction on some challenge c on his CPU. The challenge might be supplied by the manufacturer, or chosen randomly by the user. Second, the user informs the manufacturer of the challenge c and the response C1 = C(c,r) that he obtained from the chip. Third, the manufacturer supplies the user with custom software that has embedded within it the ciphertext C1 and a test of the form: give the challenge c to the chip, apply the "challenge" instruction, and then use the "decrypt and compare" instruction to compare the result of the challenge instruction with C1. If the "decrypt and compare" instruction returns "true", proceed to execute the software. Otherwise, abort. (A sketch of this exchange, continuing the toy simulation above, appears at the end of this section.)

Software produced this way will run only on the CPU that produced the original response C1. This allows one to protect against software piracy, in that a manufacturer can produce software that runs on only one CPU. (The scheme extends easily to handle the case where the user owns multiple CPUs, by embedding multiple ciphertexts in the software and seeing whether any of them compare successfully.) Note that manufacturers cannot get together off-line to compare what they know, since all they have are ciphertexts produced under unknown keys from plaintexts of which they know only half. There is no way to "link" together different results of the "challenge" instruction without using the very same chip on which those results were produced. Of course, this scheme has the problem that a user cannot upgrade his hardware easily; all of his purchased and protected software also needs to be upgraded to run on the new CPU.

This finishes my description of the scheme. I suppose that every scheme needs a name, so why don't we call this the "C/DAC" scheme (challenge / decrypt and compare).

This note needs substantial elaboration to include additional pointers to other relevant work (e.g. Canetti's work on randomized hash functions, other schemes for preventing software piracy, etc.). This note also needs the usual caveats that even a scheme like this is not so hard to defeat, since one can presumably modify the purchased software to remove the checks it makes, just as one could modify software to remove checks on serial numbers. But it is presumably somewhat worthwhile nonetheless to have a software piracy protection scheme that provides protection against naive but malicious users who would copy code if they could, but who don't have the skill needed to hack the code. (It is arguable whether this is a huge benefit, since one skilled hacker can then provide the code to all of his friends...)
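As promised, here is a sketch of the three-step exchange, continuing the toy simulation given earlier (it reuses the challenge and decrypt_and_compare functions from that sketch, and every name here is illustrative rather than part of the proposal itself):

    # Sketch of the three-step "key management" exchange, reusing the
    # challenge() and decrypt_and_compare() functions from the earlier
    # toy simulation.

    import os

    # Step 1: the user runs the challenge instruction on his own CPU.
    c = os.urandom(8)                 # challenge, chosen randomly by the user
    C1 = challenge(c)

    # Step 2: the user sends (c, C1) to the manufacturer.
    #         (Nothing else about the CPU is revealed.)

    # Step 3: the manufacturer embeds (c, C1) in the software it ships,
    #         together with a startup test like the following.
    def protected_main(embedded_c, embedded_C1):
        if not decrypt_and_compare(challenge(embedded_c), embedded_C1):
            raise SystemExit("Not authorized for this CPU.")
        print("Authorized CPU; running the program.")

    protected_main(c, C1)             # succeeds only on the original chip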
However, it is interesting to see that the benefits that seem to accrue to the serial number scheme can be obtained without providing a means for violating a user's privacy by facilitating the linking of various transactions using the CPU serial number as an identifier. (If there are other applications of the serial number scheme that cannot be handled by the C/DAC scheme, I would be interested in hearing about them...)

[1] Jeri Clausing, ``Intel Alters Plan Said to Undermine PC Users' Privacy,'' New York Times, January 26, 1999, page 1.

[2] Stephen Kent, ``Protecting Externally Supplied Software in Small Computers,'' MIT Laboratory for Computer Science Technical Report TR-255, 1981.