A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text
A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text
Kenneth Ward Church
TLDR
A program that tags each word in an input sentence with the most likely part of speech has been written and performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.
Samenvatting
A program that tags each word in an input sentence with the most likely part of speech has been written. The program uses a linear-time dynamic programming algorithm to find an assignment of parts of speech to words that optimizes the product of (a) lexical probabilities (probability of observing part of speech i given word i) and (b) contextual probabilities (probability of observing part of speech i given n following parts of speech). Program performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.<
