Cluster hypothesis
In machine learning and information retrieval, the cluster hypothesis is an assumption, taking various forms, about the nature of the data handled in those fields. In information retrieval, it states that documents that are clustered together "behave similarly with respect to relevance to information needs". In terms of classification, it states that if points are in the same cluster, they are likely to be of the same class. A single class may nevertheless be made up of multiple clusters.
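The classification form of the hypothesis can be illustrated with a small sketch. The Python code below measures how often points share the majority class of their cluster, a quantity often called purity. The function name `cluster_purity`, the cluster assignments, and the labels are hypothetical and chosen purely for illustration; they do not come from any particular dataset or library.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, labels):
    """Fraction of points whose label equals the majority label of their cluster."""
    by_cluster = defaultdict(list)
    for cluster, label in zip(cluster_ids, labels):
        by_cluster[cluster].append(label)
    # For each cluster, count the points that carry its most common label.
    agreeing = sum(Counter(ys).most_common(1)[0][1] for ys in by_cluster.values())
    return agreeing / len(labels)


# Hypothetical toy data: three clusters, two classes.
# Clusters 0 and 2 both belong to class "a" (several clusters, one class);
# cluster 1 is mostly class "b" with one exception.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
labels = ["a", "a", "a", "b", "b", "a", "a", "a", "a"]

purity = cluster_purity(cluster_ids, labels)
print(f"purity = {purity:.3f}")
```

A purity close to 1 suggests that, for this data and this clustering, the hypothesis holds reasonably well; a value near the share of the largest class suggests that the clusters carry little class information.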
Information retrieval
The cluster hypothesis was first formulated by van Rijsbergen: "closely associated documents tend to be relevant to the same requests". In theory, a search engine could therefore locate only the appropriate cluster for a query and then let users browse through that cluster. Although experiments showed that the cluster hypothesis as such holds, exploiting it for retrieval did not lead to satisfying results.
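The idea of locating a single cluster for a query can be sketched as follows. This is a minimal illustration, not a description of any actual search engine: it assumes that documents are already grouped into clusters, represents text as simple word counts, and picks the cluster whose summed word-count vector has the highest cosine similarity to the query. The helper names (`vec`, `cosine`, `centroid`, `retrieve_cluster`) and the example documents are hypothetical.

```python
import math
from collections import Counter


def vec(text):
    """Bag-of-words vector: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum of the vectors; cosine similarity ignores scale, so no averaging is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def retrieve_cluster(query, clusters):
    """Return the name and documents of the cluster closest to the query."""
    q = vec(query)
    best = max(
        clusters,
        key=lambda name: cosine(q, centroid([vec(d) for d in clusters[name]])),
    )
    return best, clusters[best]


# Hypothetical, pre-clustered toy collection.
clusters = {
    "astronomy": [
        "the telescope observed a distant galaxy",
        "stars and galaxy formation",
    ],
    "cooking": [
        "bake the bread in a hot oven",
        "a recipe for bread and soup",
    ],
}

name, docs = retrieve_cluster("galaxy telescope images", clusters)
print(name)
for doc in docs:
    print(" -", doc)
```

In this sketch the user is handed the whole matching cluster to browse, rather than a ranked list of individual documents.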