This recent NLP paper from the Stanford NLP group presents an original approach to the Natural Language Inference (NLI) problem, i.e., the task of determining whether a “hypothesis” is true (entailment), false (contradiction), or undetermined (neutral) given a “premise”.
The paper introduces a new dataset called SCI (“Stanford Corpus of Implicatives”), in which each sentence can be associated with a piece of metadata referred to as the “signature”. The signature of an implicative indicates the relation between the main clause and the complement clause. It is often determined by a single verb in the sentence, such as manage or fail, and at other times by a phrasal construction such as meet one’s duty or waste a chance.
A signature is composed of two symbols, each of which is +, -, or o depending on the entailment relation (entailment, contradiction, or neutral, respectively). The first symbol gives the entailment of the complement clause when the sentence appears in a positive environment, whereas the second gives it when the sentence appears in a negative (negated) environment. Here are two examples:
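To make the two-symbol reading concrete, here is a minimal Python sketch. It assumes the signatures +|- for manage and -|+ for fail, which follow the usual reading of these verbs (managing to do X implies X; failing to do X implies not X). The table SIGNATURES, the enum Entailment, and the function complement_label are hypothetical names introduced for illustration; they are not part of the SCI dataset or of the paper's code.

```python
from enum import Enum


class Entailment(Enum):
    """The three symbols a signature position can take."""
    ENTAILMENT = "+"
    CONTRADICTION = "-"
    NEUTRAL = "o"


# Hypothetical lookup table: (symbol in positive environment,
#                             symbol in negative environment).
# The two entries are assumptions made for illustration.
SIGNATURES = {
    "manage": ("+", "-"),  # "managed to X" -> X; "did not manage to X" -> not X
    "fail": ("-", "+"),    # "failed to X" -> not X; "did not fail to X" -> X
}


def complement_label(verb: str, negated: bool) -> Entailment:
    """Return the relation between the sentence and its complement clause."""
    positive, negative = SIGNATURES[verb]
    return Entailment(negative if negated else positive)


if __name__ == "__main__":
    for verb in SIGNATURES:
        for negated in (False, True):
            env = "negative" if negated else "positive"
            print(verb, env, complement_label(verb, negated).name)
```

Under these assumptions, a premise such as "She did not manage to open the door" paired with the hypothesis "She opened the door" would be read off the second symbol of the signature for manage and labelled a contradiction.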
The SCI dataset complements the existing MultiNLI corpus, a crowd-sourced collection of 433k sentence pairs annotated with textual entailment labels, which the paper also uses in its experiments.
The authors claim that the newly introduced Recursive Routing Network (RRN) models are more modular and adaptable, in the sense that they can learn a variety of specialized modules that play different, or even orthogonal, roles in solving the inference task.
Thursday, September 26th at 9:30am PST