Tuesday, June 2, 2009

Search Engines: The Invader to Privacy

HOW SEARCH ENGINES MAY TRACK USERS AND AFFECT THEIR PRIVACY

People often view search engines as benign blank boxes to which they can pose any question without suffering consequences. In practice, search engines large and small typically keep logs of users' search terms, and some go further by matching those terms to a user's computer address, name, and other details, depending on how much information the user has shared with the search engine. When a web surfer first visits a search engine, the site will most likely log the IP address of the computer being used, the date and time of the access, and probably the browser configuration. Referrer information may also be logged when it is available: if the user arrived at the search engine page by clicking a link on some other web page, the referrer data will contain the URL of that previously visited page. Analysing these logs can yield important information for site operators, such as which sites send them the most traffic and the approximate regional location of their visitors.
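As a rough illustration of the kind of record described above, the following Python sketch builds a log entry from the details a browser sends with every request. The function name build_log_entry, the field names, and the sample values are assumptions made for illustration; they do not reproduce any real search engine's log format.

```python
from datetime import datetime, timezone

def build_log_entry(client_ip, headers, when=None):
    """Collect what a server can see on a first visit (hypothetical field names)."""
    when = when or datetime.now(timezone.utc)
    return {
        "ip": client_ip,
        "timestamp": when.isoformat(),
        "browser": headers.get("User-Agent", "unknown"),
        # The Referer header is only present if the visitor followed a link.
        "referrer": headers.get("Referer"),
    }

# Made-up request details, using documentation-only addresses and domains.
sample_headers = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0",
    "Referer": "http://example.org/links.html",
}
entry = build_log_entry(
    "192.0.2.10",
    sample_headers,
    datetime(2009, 6, 2, 12, 0, tzinfo=timezone.utc),
)
print(entry)
```

Even this small record ties together an address, a time, a browser, and the page the visitor came from, without the visitor typing anything at all.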

However, the major difference between regular websites and search engines is that the latter can keep track of every search a user makes during a visit to the site. All of the entered keywords can be traced, and even the links in the results page that the user ultimately clicked can be logged. This type of information collection seems natural, since search engines are merely keeping track of the service they offer, but it is one of the first search engine practices to raise privacy concerns.

Cookies appear simple enough on the surface. They are nothing more than small text files used to keep some information on the client computer. The main purpose of a search engine in placing persistent cookies (which remain stored on the user's hard drive until erased or expired) is not to trace users' search habits per se, but to individualize the preferences of those accessing the site and so provide a more personalized experience to returning users. However, cookies can be used in ways that might eventually compromise users' privacy. Since all information stored in cookies is handled under the hood by the site that created them, it can be hard to tell exactly what they are being used for. Cookies therefore give search engine sites great power to record information about their users, and even to keep that tracking opaque through careful encoding and encryption.
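To make the cookie mechanism a little more concrete, here is a minimal Python sketch using the standard http.cookies module. It shows a site setting one persistent cookie for a harmless preference and another holding a unique identifier, and then reading both back as a server would on the next visit. The cookie names lang and visitor_id and their values are invented for this example and are not taken from any real search engine.

```python
from http.cookies import SimpleCookie

ONE_YEAR = 365 * 24 * 3600  # lifetime in seconds; this makes the cookie persistent

# What the site sends to the browser.
outgoing = SimpleCookie()
outgoing["lang"] = "en"                  # an ordinary preference
outgoing["lang"]["max-age"] = ONE_YEAR
outgoing["visitor_id"] = "a1b2c3d4"      # an opaque identifier (hypothetical)
outgoing["visitor_id"]["max-age"] = ONE_YEAR
set_cookie_headers = outgoing.output()
print(set_cookie_headers)

# What the browser sends back on every later request to the same site.
incoming = SimpleCookie()
incoming.load("lang=en; visitor_id=a1b2c3d4")
returning_visitor = incoming["visitor_id"].value
print("Recognized returning visitor:", returning_visitor)
```

From the user's side the two cookies look alike, which is why it is hard to tell a preference cookie from a tracking identifier.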

The first serious privacy breach by a search engine arose in August 2006, when AOL accidentally published on its website the search queries that more than half a million users had made over a three-month period. By the time AOL realised its mistake, the information had been copied to a number of other Internet sites, where it is still available today. Earlier, in 2005, the U.S. Department of Justice had subpoenaed Google, Yahoo, MSN, and AOL for tens of millions of users' search queries. Google successfully fought the request and was able to limit its disclosure, but it is unknown how much data the other companies may have turned over.

Like AOL, Google also stores users' search terms. This is done in the form of a server log. Each search creates a new entry in the log.

[Figure: Structure of Google's search logs]

The server log information has greater potential to identify the individual than the information disclosed by AOL. It contains not only search terms but also traffic data, such as the IP address of the user and the user's Google cookie. This information could, in theory, be used to actually identify the user, rather than just guess at his or her identity. Google collects additional information about users if they use other services such as Google Mail, Talk and Calendar. Of these, Google Mail has attracted the most controversy because of its use of “content extraction”, which analyses incoming and outgoing e-mail in order to target advertising at users while they are using the Google Mail service. Google continues to use this technique on Google Mail accounts despite numerous privacy complaints from organizations such as the Electronic Privacy Information Center.
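The following Python sketch suggests why a cookie stored alongside each search makes a log so revealing. It groups a handful of invented log entries by cookie value, which links searches together even when the IP address changes. The field names and every value below are hypothetical and do not reflect Google's actual log layout.

```python
from collections import defaultdict

# Invented log entries: one per search, as the text describes.
search_log = [
    {"ip": "192.0.2.10", "cookie": "c-1001", "query": "flu symptoms"},
    {"ip": "192.0.2.10", "cookie": "c-1001", "query": "pharmacy near main street"},
    {"ip": "198.51.100.7", "cookie": "c-2002", "query": "cheap flights"},
    {"ip": "203.0.113.5", "cookie": "c-1001", "query": "doctor appointment"},
]

def queries_by_cookie(entries):
    """Group search terms by cookie value, preserving the order of the log."""
    profiles = defaultdict(list)
    for entry in entries:
        profiles[entry["cookie"]].append(entry["query"])
    return dict(profiles)

profiles = queries_by_cookie(search_log)
for cookie_id, queries in profiles.items():
    print(cookie_id, "->", queries)
```

Note that the third search by c-1001 came from a different IP address, yet the cookie still ties it to the first two.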

DoubleClick
The privacy concerns surrounding Google have been heightened recently by its plans to acquire DoubleClick, Inc., one of the leading providers of Internet advertising. To help its clients target their advertising, DoubleClick also allows users to be tracked as they visit different websites across the Internet. This is done by placing a DoubleClick cookie on the user's computer the first time they visit a site with a DoubleClick advert. This cookie is then read each time the user visits another website containing a DoubleClick advert, thus building up a picture of the user's surfing habits.
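The cross-site tracking just described can be imitated with a toy model. In the Python sketch below, a made-up AdServer class hands out an identifier the first time a browser loads one of its adverts and records every later page on which the same identifier turns up. This is only an assumption-laden simulation of the general third-party cookie idea, not a description of DoubleClick's actual system; the class, the cookie name ad_id, and the site URLs are all invented.

```python
import itertools

class AdServer:
    """Toy third-party ad server (hypothetical, for illustration only)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.history = {}  # cookie id -> list of pages where an advert was shown

    def serve_ad(self, browser_cookies, page_url):
        cookie_id = browser_cookies.get("ad_id")
        if cookie_id is None:
            # First advert seen by this browser: place the cookie.
            cookie_id = f"user-{next(self._next_id)}"
            browser_cookies["ad_id"] = cookie_id
        # Every later advert reads the cookie and records the page.
        self.history.setdefault(cookie_id, []).append(page_url)
        return cookie_id

ads = AdServer()
browser = {}  # stands in for the cookies stored on one user's computer
for url in ["http://news.example/", "http://shop.example/", "http://travel.example/"]:
    ads.serve_ad(browser, url)

print(ads.history)
```

Three unrelated sites end up in a single browsing record because each of them carried an advert from the same server.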

Google’s My Search History
A new feature launched by Google allows users to see all of their past searches. The service, called My Search History, is similar to, but more comprehensive than, the feature Amazon.com, Ask, and America Online have offered for some time. It is intended to help people who use Google locate the information they sought during earlier searches so they can avoid repeating past queries. Once a user has set up an account, he or she will be able to see the search words previously entered as well as the previously visited sites that contain information on each search term. This may sound interesting and useful, but computer experts have said the technology creates new risks to privacy. By signing up, a user allows Google to store his or her search history, and as long as Google holds up its end of the privacy policy, that information should remain safely on its servers. Even so, all the major search engines, including Yahoo, Google and MSN, retain their users' search data, which can easily give a clue about a person's identity and a glimpse into his or her mind and online activity.

SOME REMEDIAL MEASURES FOR SEARCH ENGINE PRIVACY
Individuals concerned with privacy but wishing to use the valuable services of search engines can adopt a number of measures to safeguard their personal confidentiality. Before typing search terms into a search engine box or registering for extra services at a search engine, it is necessary to be aware of the potential consequences. Searches can come back to haunt users, especially if they are problematic and can be tied directly to the users in some way. Here are some tips to help enhance Web searching privacy, ranging from high-protection steps to simple steps we can start to take right away.

Watch what you search for: Avoid using search terms that attach your full legal name to any other information. For example, searching for your full name and ID card number in one query is not a good idea. If this type of search is conducted, your name and ID will appear together in the search string and may be stored for a long time by the search engine.

Special note about passwords and user names: Avoid typing passwords or user names into search engines. If a security breach allows users' data to be released to others, these passwords and user names can potentially be used to identify the surfer, or even to cause some mischief. It is therefore a good idea to change any password or user name that has been entered into a search engine.

Consider using an anonymizing tool or a proxy: The simplest way to disassociate yourself from your search terms is to use an anonymizing tool. There are free services that allow you to use the Web without revealing your computer's address, and there are also paid services. We may not realize it, but a computer discloses a lot of information as we traverse the Web. For example, by visiting IP ID at http://ipid.shat.net, we can check our computer's IP address and the kind of information the computer is disclosing.
TOR Onion Router and Privoxy: TOR (http://tor.eff.org) is a free tool that can be installed in combination with a tool called Privoxy (http://www.privoxy.org/), which helps to mask your computer's address, among other things. TOR and Privoxy make a good tool set and are well worth considering. These two tools should be used together.
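For readers who script their own web requests, the idea of going through a local proxy can be sketched in Python with the standard urllib.request module. The sketch assumes that Privoxy is already running on this computer at 127.0.0.1 port 8118, which is its usual default listening address, and that it has separately been configured to forward traffic to TOR; both are assumptions about the local setup. The function name make_proxied_opener is made up for this example.

```python
import urllib.request

# Assumption: Privoxy listens locally on its usual default address.
PRIVOXY_URL = "http://127.0.0.1:8118"

def make_proxied_opener(proxy_url=PRIVOXY_URL):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

opener = make_proxied_opener()
# opener.open("http://example.org/") would now travel through the proxy.
# The call is left out here because it needs a running proxy and a network.
```

A website contacted through such an opener would see the address of the proxy chain's exit point rather than the address of the computer making the request.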

Use Scandoo - Scandoo is a wrapper written around search engines that warns you of malicious websites in search results. Scandoo can help you search Google, Yahoo or MSN without disclosing your actual geographic location or IP address to the search engine. The Scandoo interface remains invisible to the end user.

Download HideMyIP software - Your IP address is one big link between your search queries. Even if you are using a static IP, you can still hide it with the HideMyIP software. HideMyIP conceals your real IP address and shows a fake IP with a hostname to the sites that you visit. You can set HideMyIP to change your IP address every minute.

Download CustomizeGoogle for Firefox - If you use Google in Firefox, this is a highly recommended extension that greatly enhances your Googling experience. It can help remove Google Ads, anonymize your Google user ID, remove click tracking, or filter Google search results.

Disallow Google to Store Cookies - The important thing is that it is not enough to block cookies from the google.com domain alone; you must also block cookies from the Google site for your country. For example, in India, one would block both google.com and google.co.in. This is because Google redirects you to your local country page when you type google.com in the browser address bar. To block cookies, open the cookie blocking dialog in your browser, type the site URL and click disallow or block.
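The same per-domain blocking can be reproduced in a script with Python's standard http.cookiejar module, which is shown here only as a sketch of the idea; your browser's own dialog is what matters for everyday surfing. Each domain is listed twice, once bare and once with a leading dot, because in this module an entry without a dot matches only that exact host, while a dotted entry matches its subdomains such as www.google.com. The example uses the two domains mentioned above; example.org is just a placeholder for an unblocked site.

```python
from http.cookiejar import CookieJar, DefaultCookiePolicy

# Block the global domain and the country domain, including subdomains.
blocked = ["google.com", ".google.com", "google.co.in", ".google.co.in"]
policy = DefaultCookiePolicy(blocked_domains=blocked)

# Any opener built around this jar will refuse cookies from the blocked domains.
jar = CookieJar(policy=policy)

for host in ["www.google.com", "google.co.in", "example.org"]:
    status = "blocked" if policy.is_blocked(host) else "allowed"
    print(host, status)
```

Leaving out the country domain would let cookies through after the redirect to the local page, which is the pitfall described above.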


SOME GENERAL TIPS FOR USING SEARCH ENGINES

The following tips are small steps that will not completely protect us from all search engine privacy issues, but they can help to make incremental improvements.

• Do not accept search engine cookies. If you already have some on your computer, delete them. Cookies can be used to correlate a variety of information.

• Do not sign up for email at the same search engine where you regularly search. If you do so, then your email address can potentially be tied to your search terms. Whether or not a search engine does this is usually disclosed in the search engine's privacy policy.

• If you surf using a cable modem, or a static Internet connection, ask your service provider to give you a new IP address. Changing IP addresses every once in a while can be helpful for people who primarily surf the Web from one computer in one location over a long period of time.
