Textual entailment
In natural language processing, textual entailment, also known as natural language inference, is a directional relation between text fragments. The relation holds whenever the truth of one text fragment follows from another text.
Definition
In the textual entailment (TE) framework, the entailing and entailed texts are termed text (t) and hypothesis (h), respectively. Textual entailment is not the same as pure logical entailment; it has a more relaxed definition: "t entails h" if, typically, a human reading t would infer that h is most likely true. The relation is directional because even if "t entails h", the reverse "h entails t" is much less certain.
Determining whether this relationship holds is an informal task, one which sometimes overlaps with the formal tasks of formal semantics; additionally, textual entailment partially subsumes word entailment.
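The directionality described above can be illustrated with a deliberately crude sketch. It assumes a toy word-coverage score in place of human judgement: the helper names `tokens` and `coverage` and the two example sentences are illustrative assumptions, not part of any real entailment system.

```python
import re

def tokens(sentence: str) -> set:
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def coverage(text: str, hypothesis: str) -> float:
    """Fraction of hypothesis words that also occur in the text.

    Toy stand-in for an entailment score (an assumption for illustration);
    it ignores word order, negation, and meaning.
    """
    h_words = tokens(hypothesis)
    if not h_words:
        return 0.0
    return len(h_words & tokens(text)) / len(h_words)

# Hypothetical example pair
t = "A man is playing a guitar on stage"
h = "A man is playing"

forward = coverage(t, h)   # every word of h appears in t
backward = coverage(h, t)  # t mentions things that h does not
print(forward, backward)
```

Swapping the two arguments changes the score, which mirrors the asymmetry of the relation: the longer sentence supports the shorter one, but not the other way round. A score like this says nothing about meaning and would be fooled by a single negation.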
Examples
Textual entailment can be illustrated with examples of three different relations:
An example of a positive TE (text entails hypothesis) is:
- text: If you help the needy, God will reward you.
- text: If you help the needy, God will reward you.
- text: If you help the needy, God will reward you.
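The three relations can also be written down as a small labelled data structure, which is roughly how entailment pairs are stored for training and evaluation. This is a minimal sketch: the class names `Relation` and `Pair` are hypothetical, the label names follow common natural language inference usage, and the three hypotheses are illustrative assumptions paired with the text above.

```python
from dataclasses import dataclass
from enum import Enum

class Relation(Enum):
    ENTAILMENT = "positive TE"     # text entails hypothesis
    CONTRADICTION = "negative TE"  # text contradicts hypothesis
    NEUTRAL = "non-TE"             # text neither entails nor contradicts

@dataclass(frozen=True)
class Pair:
    text: str
    hypothesis: str
    relation: Relation

TEXT = "If you help the needy, God will reward you."

# Illustrative hypotheses (assumptions), one per relation
pairs = [
    Pair(TEXT, "Giving money to a poor man has good consequences.",
         Relation.ENTAILMENT),
    Pair(TEXT, "Giving money to a poor man has no consequences.",
         Relation.CONTRADICTION),
    Pair(TEXT, "Giving money to a poor man will make you a better person.",
         Relation.NEUTRAL),
]

for pair in pairs:
    print(f"{pair.relation.value}: {pair.hypothesis}")
```

The same text appears in all three pairs; only the hypothesis and the label change, which is why the text line repeats across the examples.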
Ambiguity of natural language
Approaches
Textual entailment measures natural language understanding, since it asks for a semantic interpretation of the text, and because of its generality it remains an active area of research. Many approaches and refinements of approaches have been considered, such as word embedding, logical models, graphical models, rule systems, contextual focusing, and machine learning. Practical or large-scale solutions avoid these complex methods and instead use only surface syntax or lexical relationships, but are correspondingly less accurate. State-of-the-art systems are far from human performance; a study found humans to agree on the dataset 95.25% of the time. Algorithms from 2016 had not yet achieved 90%.
Applications
Many natural language processing applications, like question answering, information extraction, summarization, multi-document summarization, and evaluation of machine translation systems, need to recognize that a particular target meaning can be inferred from different text variants. Typically entailment is used as part of a larger system, for example in a prediction system to filter out trivial or obvious predictions. Textual entailment also has applications in adversarial stylometry, which has the objective of removing textual style without changing the overall meaning of communication.
Datasets
Some of the available English NLI datasets include:
*
*
*
*
*
*
In addition, there are several non-English NLI datasets, as follows:
*
- for French
- for Farsi
- for Chinese
- for Dutch
- for Indonesian