In settings where there is a generality-ordering on hypotheses, it is possible to represent the version space by two sets of hypotheses: the most specific consistent hypotheses and the most general consistent hypotheses, where "consistent" indicates agreement with the observed data.

The most specific hypotheses cover the observed positive training examples and as little of the remaining feature space as possible. If reduced any further, these hypotheses exclude a positive training example and hence become inconsistent. These minimal hypotheses essentially constitute a claim that the true concept is defined just by the positive data already observed: thus, if a novel data point is observed, it should be assumed to be negative.

The most general hypotheses cover the observed positive training examples, but also cover as much of the remaining feature space as possible without including any negative training examples. If enlarged any further, these hypotheses include a negative training example and hence become inconsistent. These maximal hypotheses essentially constitute a claim that the true concept is defined just by the negative data already observed: thus, if a novel data point is observed, it should be assumed to be positive.

Thus, during learning, the version space can be represented by just its lower and upper bounds, and learning operations can be performed just on these representative sets. After learning, classification can be performed on unseen examples by testing the hypotheses learned by the algorithm. If an example is consistent with multiple hypotheses, a majority vote rule can be applied.
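The following is a minimal, simplified sketch of this boundary-set representation for conjunctive hypotheses over discrete attributes, in the style of candidate elimination. The function names, the example data, and the simplification of keeping a single most-specific hypothesis (and not pruning redundant general hypotheses) are assumptions for illustration, not a definitive implementation.

    # Sketch: version space represented by its boundary sets S (most specific)
    # and G (most general). A hypothesis is a tuple whose slots are either a
    # specific attribute value or '?' (matches any value).

    def matches(hypothesis, example):
        """True if the hypothesis covers the example."""
        return all(h == '?' or h == e for h, e in zip(hypothesis, example))

    def min_generalize(hypothesis, example):
        """Smallest generalization of `hypothesis` that also covers `example`."""
        return tuple(h if h == e else '?' for h, e in zip(hypothesis, example))

    def candidate_elimination(examples, n_attrs):
        """Simplified candidate elimination: S kept as one conjunctive
        hypothesis, G kept as a set of maximally general hypotheses."""
        S = None                      # initialized from the first positive example
        G = [('?',) * n_attrs]        # most general hypothesis: covers everything
        for x, label in examples:
            if label:                                       # positive example
                S = x if S is None else min_generalize(S, x)
                G = [g for g in G if matches(g, x)]         # drop inconsistent g
            else:                                           # negative example
                new_G = []
                for g in G:
                    if not matches(g, x):
                        new_G.append(g)
                        continue
                    # minimally specialize g so it excludes x but still covers S
                    for i, s_i in enumerate(S or ('?',) * n_attrs):
                        if g[i] == '?' and s_i != '?' and s_i != x[i]:
                            new_G.append(g[:i] + (s_i,) + g[i + 1:])
                G = new_G
        return S, G

    # Hypothetical training data: (attribute tuple, is_positive)
    data = [
        (('sunny', 'warm', 'normal'), True),
        (('sunny', 'warm', 'high'),   True),
        (('rainy', 'cold', 'high'),   False),
    ]
    S, G = candidate_elimination(data, n_attrs=3)
    print("S:", S)   # e.g. ('sunny', 'warm', '?')
    print("G:", G)   # e.g. [('sunny', '?', '?'), ('?', 'warm', '?')]

    # Classify an unseen example by majority vote over the boundary hypotheses.
    unseen = ('sunny', 'cold', 'normal')
    votes = [matches(h, unseen) for h in [S] + G]
    print("positive" if sum(votes) > len(votes) / 2 else "negative")

In this sketch, positive examples generalize S just enough to cover them and prune members of G that miss them, while negative examples specialize members of G just enough to exclude them; the unseen example is then labeled by how the boundary hypotheses vote.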
Historical background
The notion of version spaces was introduced by Mitchell in the early 1980s as a framework for understanding the basic problem of supervised learning within the context of solution search. Although the basic "candidate elimination" search method that accompanies the version space framework is not a popular learning algorithm, some practical implementations have been developed. A major drawback of version space learning is its inability to deal with noise: any pair of inconsistent examples can cause the version space to collapse, i.e., become empty, so that classification becomes impossible. One solution to this problem was proposed by Dubois and Quafafou, who introduced the Rough Version Space, in which rough-set-based approximations are used to learn certain and possible hypotheses in the presence of inconsistent data.