Storing passwords, or the risk of a no-salt diet

A while back, I wrote about passwords and promised a later post on salting.  This is it: a deeper look at how servers should accept and store passwords.  This is a complement to the usual articles on passwords, which focus on the user (you know the ones: “pick strong passwords”); here, I’ll be looking at the server side, and in particular how to store passwords for web sites.
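To make the idea concrete, here is a minimal sketch of salted password storage. The function names and parameters are illustrative, not any particular site's implementation; it uses Python's standard-library PBKDF2 with a per-user random salt, so two users with the same password end up with different stored hashes.

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    # A fresh random salt for every user: identical passwords
    # no longer produce identical stored hashes.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(candidate, stored)
```

The server stores only the salt and the digest, never the password itself; at login it recomputes the digest from the submitted password and compares.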

Shipping security

It’s a ritual we’ve all grown accustomed to: something needs a software update to repair security flaws.  Traditionally, it’s been our computer; increasingly, it’s our smartphones or their apps.  In the not very distant future (possibly now, for some of us), it will be our printers, our thermostats, our cars, our “anything that uses software”—and that will be more or less everything.  WiFi-controlled light bulbs are already on sale in some countries; if it’s WiFi-controlled, it must be running software.

COPPA and signaling

As has been widely reported, the FTC recently amended its COPPA Rule enforcing the Children’s Online Privacy Protection Act. There’s a lot to be said about the new amendments to the Rule—indeed, a lot is being said—but as this is the FTC Tech Blog, I’m going to restrict my comments to technical aspects. Today, I’m going to talk about signaling—the way that a website can signal its COPPA status to the operators of other sites who provide it with some of the content that users see.

Complexity and scams

All of us use gadgets—cars, phones, computers, what have you—that we don’t really understand.  We can use them, and use them very effectively, but only because a great deal of effort has been put into making them seem simple.  That’s all well and good—until suddenly you have to deal with the complexity.  Sometimes, that happens because something has broken and needs to be fixed; other times, though, scammers try to exploit the complexity.  The complaints released today by the FTC illustrate this nicely (the press release is here).

Password compromises

There are many problems with passwords; Ed Felten explained the issues very well a few months ago.  One big problem is that they can be stolen: the bad guys can somehow get hold of your password and use it themselves.  But how can this happen?  It turns out that there are many different ways, each of which requires a different protective strategy.


I’m delighted to succeed Ed Felten as Chief Technologist of the Federal Trade Commission. He’s a hard act to follow! But what does the FTC do, and what is the role of a technologist?

Reasoning about information: an example

One of the reasons it's hard to think carefully about privacy is that privacy is fundamentally about information, and our (uneducated) intuition about information is often unreliable.

As a teacher, I have tried different approaches to helping students get over this barrier.  It's not too hard to teach the theory, so that students learn how to manipulate logical formulas to answer contrived story problems about information and inference.  What is more difficult is augmenting the formal theory with a more accurate intuition that is useful outside the classroom.

Privacy by design: frequency capping

One of the principles of Privacy by Design, as advocated in the FTC Privacy Report, is that when you design a business process, it's a best practice to think carefully about how to minimize the information you collect, retain, and use in that process.  Often, you can implement the feature you want, with a smaller privacy footprint, if you think carefully about your design alternatives.

The Problem with Passwords

We use passwords all the time.  Sometimes they're called "PINs" or "access codes" or "lock combinations" but they amount to the same thing: a sequence of symbols that must be provided in order to get access to something.  Passwords have one big advantage: ease of use.  But this comes with several disadvantages.

Is aggregate data always private?

I have been writing recently about data and privacy.  Today I want to continue by talking about aggregate data.  A common intuition is that aggregate data—information averaged or summed over a large population—is inherently free of privacy implications.  As we'll see, that isn't always right.
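One way aggregates can leak individual information is a differencing attack: two harmless-looking aggregate queries, subtracted from each other, can reveal one person's value. The names and figures below are invented for illustration:

```python
# Hypothetical payroll data; every name and number here is made up.
salaries = {"alice": 95_000, "bob": 81_000, "carol": 72_000}

# Two queries that each look like innocent aggregates:
total_everyone = sum(salaries.values())
total_except_alice = sum(v for k, v in salaries.items() if k != "alice")

# Subtracting one aggregate from the other isolates an individual value.
alices_salary = total_everyone - total_except_alice
```

Neither query on its own discloses a salary, yet together they pinpoint Alice's exactly; this is why simply releasing sums and averages does not automatically protect privacy.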

What does it mean to preserve privacy?

In recent posts, I explained why hashing and pseudonyms often fail to provide anonymity.  These problems, and the well-known examples of people re-identifying supposedly anonymized data sets, might tempt you into believing that any data set can be re-identified given enough effort, or that there is just no way to provide access to data in a privacy-preserving way.  But those conclusions would be incorrect.

Syncing and the FTC’s Myspace settlement

Today the FTC announced a proposed settlement with Myspace, on charges that the company broke its privacy promises to consumers. I want to focus today on one of the FTC's charges, relating to possible syncing of identifiers.

Myspace, a popular social network, assigns each of its users a numeric identifier called a "Friend ID". If you know someone's Friend ID, you can use it to get their public information, by accessing a URL that contains the Friend ID.

Are pseudonyms "anonymous"?

Let's continue our discussion of "anonymous" data by talking about pseudonyms.

A pseudonym is any kind of identifier, other than a name, that is associated with a person or (what often amounts to the same thing) a device.  Pseudonyms are very common.   Examples include the random ID value in a tracking cookie; a device ID such as a WiFi MAC address or a phone's UDID; a synthetic identifier such as an "OpenUDID"; a mobile phone number; or a Twitter handle.
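The tracking-cookie case shows why a pseudonym is not the same as anonymity. Here is a small illustrative sketch (the page names and structure are invented): the ID is a random value containing no name at all, yet because the same value accompanies every request from one browser, all of that browser's activity can be linked into a single profile.

```python
import secrets

def new_tracking_id() -> str:
    # A random 128-bit value: it contains no name or other
    # personal detail, but it is stable for one browser.
    return secrets.token_hex(16)

# Every visit from the same browser carries the same cookie value,
# so the site can link the visits together under that pseudonym.
profile: dict[str, list[str]] = {}
cookie_id = new_tracking_id()
for page in ["/home", "/shoes", "/checkout"]:
    profile.setdefault(cookie_id, []).append(page)
```

The profile keyed by the pseudonym grows just as it would under a real name, and if the pseudonym is ever tied to an identity, the whole history comes with it.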

Does Hashing Make Data “Anonymous”?

One of the most misunderstood topics in privacy is what it means to provide “anonymous” access to data.  One often hears references to “hashing” as a way of rendering data anonymous.   As it turns out, hashing is vastly overrated as an “anonymization” technique.   In this post, I’ll talk about what hashing is, and why it often fails to provide effective anonymity.
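A quick sketch shows the core weakness: when the space of possible inputs is small, an attacker can simply hash every candidate and compare. The phone number below is a made-up example using a standard fictional exchange:

```python
import hashlib

def sha256_hex(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()

# A phone number "anonymized" by publishing only its hash.
published = sha256_hex("2025551234")

# An attacker who knows the input space is small -- here, the 10,000
# possible line numbers in one exchange -- just hashes every candidate
# and checks for a match.
recovered = None
for i in range(10_000):
    candidate = f"202555{i:04d}"
    if sha256_hex(candidate) == published:
        recovered = candidate
        break
```

Hashing is deterministic and phone numbers, SSNs, and device IDs all come from small, enumerable spaces, so the "anonymized" value is recovered in a fraction of a second.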