Search engine privacy
Search engine privacy is a subset of internet privacy that deals with user data being collected by search engines. Both types of privacy fall under the umbrella of information privacy. Privacy concerns regarding search engines can take many forms, such as the ability of search engines to log individual search queries, browsing history, IP addresses, and cookies of users, and to conduct user profiling in general. The collection of personally identifiable information of users by search engines is referred to as tracking.
This is controversial because search engines often claim to collect a user's data in order to better tailor results to that specific user and to provide the user with a better searching experience. However, search engines can also abuse and compromise their users' privacy by selling user data to advertisers for profit. In the absence of regulations, users must decide what is more important to their search engine experience, relevance and speed of results or their privacy, and choose a search engine accordingly.
The legal framework in the United States for protecting user privacy is not robust. The most popular search engines collect personal information, but other search engines focused on privacy have emerged more recently. There have been several well-publicized breaches of search engine user privacy involving companies such as AOL and Yahoo. Individuals interested in preserving their privacy have options available to them, such as using software like Tor, which makes the user's location and personal information anonymous, or using a privacy-focused search engine.
Privacy policies
Search engines generally publish privacy policies to inform users about what data of theirs may be collected and what purposes it may be used for. While these policies may be an attempt at transparency by search engines, many people never read them and are therefore unaware of how much of their private information, like passwords and saved files, is collected from cookies and may be logged and kept by the search engine. This ties in with the phenomenon of notice and consent, which is how many privacy policies are structured.
Notice and consent policies essentially consist of a site showing the user a privacy policy and having them click to agree. This is intended to let the user freely decide whether or not to go ahead and use the website. This decision, however, may not actually be made so freely because the costs of opting out can be very high. Another significant issue with putting the privacy policy in front of users and having them accept quickly is that such policies are often very hard to understand, even in the unlikely case that a user decides to read them. Privacy-minded search engines, such as DuckDuckGo, state in their privacy policies that they collect much less data than search engines such as Google or Yahoo, and may not collect any. As of 2008, search engines were not in the business of selling user data to third parties, though they did note in their privacy policies that they complied with government subpoenas.
Google and Yahoo
Google, founded in 1998, is the most widely used search engine, receiving billions of search queries every month. Google logs all search terms in a database along with the date and time of search, browser and operating system, IP address of user, the Google cookie, and the URL that shows the search engine and search query. The privacy policy of Google states that they pass user data on to various affiliates, subsidiaries, and "trusted" business partners.
Yahoo, founded in 1994, also collects user data. Users commonly do not read privacy policies, even for services that they use daily, such as Yahoo! Mail and Gmail. This persistent failure of consumers to read privacy policies can be disadvantageous to them because, while they may not pick up on differences in the language of privacy policies, judges in court cases certainly do. This means that search engine and email companies like Google and Yahoo are technically able to keep up the practice of targeting advertisements based on email content, since they declare that they do so in their privacy policies. One study examined how much consumers cared about the privacy policies of Google, specifically Gmail, and their level of detail. It found that users often thought Google's practices were somewhat intrusive, but that they would not often be willing to counteract this by paying a premium for their privacy.
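The log fields listed above for Google can be pictured as a single record per search. The following is a minimal illustrative sketch, not Google's actual schema: the class name, field names, the `search.example` host, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlencode


@dataclass(frozen=True)
class SearchLogRecord:
    """Hypothetical record holding the kinds of fields the article lists."""
    search_terms: str
    timestamp: datetime
    browser: str
    operating_system: str
    ip_address: str
    cookie_id: str
    url: str  # identifies both the search engine and the query


def build_record(terms, browser, operating_system, ip_address, cookie_id):
    # The URL embeds the query string, so the URL alone reveals the search.
    # "search.example" is a placeholder host, not a real service.
    url = "https://search.example/search?" + urlencode({"q": terms})
    return SearchLogRecord(
        search_terms=terms,
        timestamp=datetime.now(timezone.utc),
        browser=browser,
        operating_system=operating_system,
        ip_address=ip_address,
        cookie_id=cookie_id,
        url=url,
    )


record = build_record(
    "privacy tools", "Firefox", "Linux", "203.0.113.7", "cookie-1234"
)
print(record.url)
```

Even in this toy form, the record shows why such logs raise concern: the IP address and cookie tie the query to a device, and the URL repeats the query itself.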
DuckDuckGo
DuckDuckGo, founded in 2008, claims to be privacy focused. DuckDuckGo does not collect or share any personal information of users, such as IP addresses or cookies, which other search engines usually do log and keep for some time. It also does not have spam, and protects user privacy further by anonymizing search queries from the website the user chooses and using encryption. Similarly privacy-oriented search engines include Startpage, Ecosia, Qwant, MetaGer and Disconnect. Mojeek and Brave Search are privacy-focused search engines that build their own indexes.
Types of data collected by search engines
Most search engines can, and do, collect personal information about their users according to their own privacy policies. This user data could be anything from location information to cookies, IP addresses, search query histories, click-through history, and online fingerprints. This data is often stored in large databases, and users may be assigned numbers in an attempt to provide them with anonymity.
Data can be stored for an extended period of time. For example, the data collected by Google on its users is retained for up to 9 months. Some studies state that this number is actually 18 months. This data is used for various reasons such as optimizing and personalizing search results for users, targeting advertising, and trying to protect users from scams and phishing attacks. Such data can be collected even when a user is not logged in to their account or when using a different IP address by using cookies.
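The two ideas described above, replacing a user's identity with an assigned number and keeping data only for a limited retention period, can be sketched as follows. This is a simplified illustration under stated assumptions, not any search engine's real system: the class names are hypothetical, the cookie is assumed to be the identifier being replaced, and the 270-day window is only a rough stand-in for the 9-month figure mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import count

# Assumption: about 9 months, approximated as 270 days for the example.
RETENTION = timedelta(days=270)


@dataclass
class LogEntry:
    user_number: int  # assigned number stored in place of the cookie
    query: str
    timestamp: datetime


class PseudonymousLog:
    """Hypothetical query log keyed by assigned user numbers."""

    def __init__(self):
        self._next_number = count(1)
        self._numbers = {}  # cookie id -> assigned number
        self.entries = []

    def record(self, cookie_id, query, when):
        # The same cookie always maps to the same number, so queries
        # from one user remain linkable even without a name attached.
        if cookie_id not in self._numbers:
            self._numbers[cookie_id] = next(self._next_number)
        entry = LogEntry(self._numbers[cookie_id], query, when)
        self.entries.append(entry)
        return entry

    def purge(self, now):
        # Drop every entry older than the retention window.
        cutoff = now - RETENTION
        self.entries = [e for e in self.entries if e.timestamp >= cutoff]


log = PseudonymousLog()
now = datetime(2024, 1, 1)
first = log.record("cookie-a", "old query", now - timedelta(days=300))
second = log.record("cookie-a", "recent query", now - timedelta(days=10))
third = log.record("cookie-b", "other user", now)
log.purge(now)
print([e.query for e in log.entries])
```

Note that an assigned number of this kind is a pseudonym rather than true anonymity: all queries by one user still share a number, which is why the article describes it only as an attempt to provide anonymity.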
Uses
User profiling and personalization
Once search engines have collected information about a user's habits, they often create a profile of that user, which helps the search engine decide which links to show for different search queries submitted by that user or which ads to target them with. An important development in this field is automated learning, also known as machine learning. Using this, search engines can refine their profiling models to more accurately predict what any given user may want to click on by doing A/B testing of results offered to users and measuring the reactions of users.
Companies like Google, Netflix, YouTube, and Amazon have all started personalizing results more and more. One notable example is how Google Scholar takes into account the publication history of a user in order to produce results it deems relevant. Personalization also occurs when Amazon recommends books or when IMDb suggests movies by using previously collected information about a user to predict their tastes. For personalization to occur, a user need not even be logged into their account.
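The A/B testing mentioned above can be illustrated with a small sketch: each user is consistently shown one of two result variants, and the share of users who click is compared between them. The function and class names here are hypothetical, and hashing the user identifier to pick a variant is one common approach assumed for the example, not a description of any particular company's method.

```python
import hashlib
from collections import defaultdict


def assign_variant(user_id: str) -> str:
    """Deterministically place a user in variant "A" or "B"."""
    # Hashing keeps the assignment stable across visits without
    # having to store which variant each user was given.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


class ABTest:
    """Hypothetical tally of impressions and clicks per variant."""

    def __init__(self):
        self.impressions = defaultdict(int)
        self.clicks = defaultdict(int)

    def log(self, user_id: str, clicked: bool) -> str:
        variant = assign_variant(user_id)
        self.impressions[variant] += 1
        if clicked:
            self.clicks[variant] += 1
        return variant

    def click_through_rate(self, variant: str) -> float:
        shown = self.impressions[variant]
        return self.clicks[variant] / shown if shown else 0.0


test = ABTest()
for user, clicked in [("u1", True), ("u2", False), ("u3", True), ("u4", False)]:
    test.log(user, clicked)
for variant in ("A", "B"):
    print(variant, test.click_through_rate(variant))
```

In practice, the variant with the higher click-through rate would inform which ranking or presentation is shown more widely; a real experiment would also need enough traffic and a statistical test before drawing conclusions.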