How Cookies and Histories Work

This Guide explains how Internet cookies work, and what their risks and benefits are. It also provides a list of files where programs keep logs of your activity on your system, and talks a little about the risks and benefits of such logs and how to examine or erase them if you want to.

Cookies are pieces of information that a Web server you visit may store on your computer. This can be useful. For example, a site that you visit often can be more helpful if it recognizes you when you visit: when I go to Amazon, it knows who I am because it left a cookie on my computer earlier, and when it sees that cookie, it can pull up my preferences, wish list, and shopping cart without having to ask me about them.

On the other hand, the very notion of any old server writing information onto your disk should make you stop and think for a moment. Imagine if any old business had the ability to drop an ad onto your desk (oh, wait, they do: we call it junk mail). Fortunately, it isn't too bad, because cookies have strong limitations:

Cookies cannot somehow poke around on your computer, pick out some information, and send it anywhere. They go the opposite way. So then, what is the risk?

The risk comes from collaboration between web sites. When you load a Web page, you are often loading different parts of it from different servers. For example, if you go to www.site.com, you may get back a page that contains all these things:

  1. A lot of general text and pictures from site.com
  2. Some images they refer to from other sites, like a clip-art library site
  3. Some banner ads they refer to from yet other sites, like doubleclick.net

Being able to include content from other sites in a Web page is very useful -- that way sites don't have to duplicate popular content as much. But it has a bad effect too:

Because your browser actually issues the request for the pictures or banner ads (based on the reference it found in the HTML document you thought was what you were viewing), you have initiated contact with those other servers. Thus, when they send you the picture or ad, they get to put a cookie on your machine.

A reference to a banner ad could be simple and innocuous, like:

   http://www.ad-example.com/ugly-ad.jpeg

They could then store a cookie on your machine that said something like "customer-number=16180338". If that's all the ad company got, they couldn't learn much about you. They could know how many times customer 16180338 has visited any site that requests the ad, and at what times, but not who customer 16180338 is, or which sites those were.
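To make that concrete, here is a minimal sketch in Python of what such a cookie looks like on the wire. The header value, including the Domain and Path attributes, is made up for illustration; only the "customer-number=16180338" part comes from the example above.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie value the ad server might send along with the image.
# Only "customer-number=16180338" comes from the text; the rest is assumed.
set_cookie_header = "customer-number=16180338; Domain=www.ad-example.com; Path=/"

jar = SimpleCookie()
jar.load(set_cookie_header)

morsel = jar["customer-number"]
print(morsel.value)       # 16180338
print(morsel["domain"])   # www.ad-example.com

# On every later request to that domain, the browser hands the value back
# in a Cookie header that looks roughly like this:
cookie_header = f"{morsel.key}={morsel.value}"
print(cookie_header)      # customer-number=16180338
```

Notice that nothing in the cookie says who you are or where you have been; it is just a number the server can recognize later.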

More likely, however, the request for the ad will have the name of the site you were at included, more like:

   http://www.ad-example.com/ugly-ad.jpeg?vendor=jmart

Now, the folks at ad-example.com know just what other site you were coming from. Or perhaps there's a little more in there:

   http://www.ad-example.com/ugly-ad.jpeg?vendor=jmart,vendorpage=2497

and now ad-example.com knows exactly what you were looking at, and when. Frequently, the cookies are not in plain text that you can read, so you never really know what information the sender may be stashing there.
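Pulling those details out of the request takes the ad server only a few lines. The sketch below is my own illustration, not anything a particular ad company is known to run; the function name ad_request_details is hypothetical. Real query strings usually separate pairs with "&", while the example above uses a comma, so the sketch accepts either.

```python
import re
from urllib.parse import urlsplit

def ad_request_details(url):
    """Return the key=value pairs carried in an ad request URL."""
    query = urlsplit(url).query
    details = {}
    # Accept "&" (the usual separator) or "," (as in the example URL above).
    for pair in re.split(r"[&,]", query):
        if "=" in pair:
            key, value = pair.split("=", 1)
            details[key] = value
    return details

url = "http://www.ad-example.com/ugly-ad.jpeg?vendor=jmart,vendorpage=2497"
print(ad_request_details(url))
# {'vendor': 'jmart', 'vendorpage': '2497'}
```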

The next step is that ad-example.com probably provides banner ads not just to site.com, but to a thousand other sites. Every time you visit any page at any of those vendors that includes a banner ad from ad-example.com, ad-example.com checks its cookie and learns that you're user 16180338. So it slowly accumulates a list of every such page you've been to, and when.
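The bookkeeping on the ad server's side could be as simple as the following sketch. Everything here (the AdServer class, its method names, the vendors, pages, and timestamps) is hypothetical and only meant to show how one cookie ties separate visits together.

```python
from collections import defaultdict
from datetime import datetime

class AdServer:
    """Toy model of an ad server that recognizes browsers by cookie."""

    def __init__(self):
        self.next_id = 16180338            # first customer number to hand out
        self.history = defaultdict(list)   # cookie value -> list of visits

    def serve_ad(self, cookie, vendor, page, when):
        """Log one ad request and return the cookie the browser should keep."""
        if cookie is None:                 # first contact: assign a new number
            cookie = str(self.next_id)
            self.next_id += 1
        self.history[cookie].append((when, vendor, page))
        return cookie

server = AdServer()
cookie = None  # the browser starts out with no cookie from the ad server

# Made-up visits to two unrelated sites that both carry the same ad.
cookie = server.serve_ad(cookie, "jmart", "2497", datetime(2003, 5, 1, 20, 15))
cookie = server.serve_ad(cookie, "site.com", "home", datetime(2003, 5, 1, 20, 22))

for when, vendor, page in server.history[cookie]:
    print(cookie, when.isoformat(), vendor, page)
```

Neither vendor told the ad server anything about the other; the shared cookie is what links the two visits into one history.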

As this list grows, they can discover which sites you visit soon after which other ones, and target their advertising to pull you along in that direction, or push you in a different direction. They can also discover what times you tend to be on the Web, for how long, and how long you spend at certain sites. This information can be sold; so far as I know there is nothing illegal about collecting and selling it.

Still, they have no certain way of determining who you are. That is, none of this gets them your name, home address, social security number, credit card numbers, phone numbers, email address, etc.

Then, one day, you decide to buy something from one of these vendors. You type personal information into a secure form. All of a sudden, that vendor can buy the statistical information from ad-example.com and connect it to you personally. Here's where many people start to get nervous, though not all.

Nothing prevents that vendor from passing on your personal information to ad-example.com, who can then pass it on to other vendors. If someone passes your credit card number along and it's misused, they'll likely get caught quickly and shut down. But if they're passing along less sensitive information, like your home address, you may see nothing more than a big increase in junk mail.

Your only protection against this kind of identity snooping is to never give information to vendors who will share it. Read the privacy policies of the sites you visit.

There is another way that ad-example (or any other site you visit) can connect information back to you. When you request a page from some server, your Internet Protocol (IP) address must go along with the request, so the server knows where to send the response. If you're connected by modem, your IP address changes every time you disconnect and reconnect, so without a detailed time log from your ISP showing who was dialed in on which modem when, the IP address is useless. Such logging is easy to do, but I tend to think most ISPs would not do it, and that those who do would be reluctant to hand it over without a legal order; but who knows?

If, however, you're connected by cable, DSL, or a corporate high-speed link, your machine likely has a fixed IP address, which uniquely identifies you. If that number once gets associated with your personal information, then all that other information can be hooked up to you -- and again the flood of junk mail (carefully targeted junk mail, too) begins.

What you do about this is up to you. You can refuse all cookies, but a few sites won't work without them. You can have your browser ask you before you accept any cookie, but that can be tedious. Even having it ask "by site", as you can do in IE, is weak, because ad sites use many slightly different sub-site servers, and because IE doesn't let you manage your cookies hierarchically by domain. For example, you can't exclude all cookies from ad-example.com and all sub-sites like machine94271.ad-example.com: you have to do each one individually.

Other than that, you can clear all cookies every so often (though you'll then have to re-set the useful ones).
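For what it's worth, the hierarchical by-domain blocking I wish for above is not hard to express. The sketch below is only an illustration of the idea, not how any particular browser does it; the function name is_blocked and the block list are hypothetical.

```python
def is_blocked(host, blocked_domains):
    """Return True if host is a blocked domain or any sub-site of one."""
    host = host.lower().rstrip(".")
    for domain in blocked_domains:
        domain = domain.lower().lstrip(".")
        # Match the domain itself, or anything ending in ".<domain>".
        # Requiring the leading dot keeps "bad-example.com" from matching
        # a block on "ad-example.com".
        if host == domain or host.endswith("." + domain):
            return True
    return False

blocked = {"ad-example.com"}
print(is_blocked("machine94271.ad-example.com", blocked))  # True
print(is_blocked("ad-example.com", blocked))               # True
print(is_blocked("bad-example.com", blocked))              # False
print(is_blocked("www.site.com", blocked))                 # False
```

One entry in the block list then covers every sub-site, instead of one entry per machine name.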

Records on your computer

Cookies and other log information are scattered all over your machine. I've seen no real list of where various programs store them. Below I list some kinds of personal histories/information that I know of, and (when I know) where they're stored on Mac OS X. If you want to erase them, also remember that tossing them in the trash doesn't do much; emptying the trash and even formatting the disk don't do much more. Only re-writing the disk space repeatedly truly erases information. There are many programs that claim to do this for you (and probably really do, though I can't swear to it):