Modeling rapid language learning by distilling Bayesian priors into artificial neural networks

TLDR

Learning from limited naturalistic data is possible with an approach that bridges two popular modeling traditions, Bayesian models and neural networks, yielding a single system that can both learn rapidly and handle naturalistic data.

Abstract

Humans can learn languages from remarkably little experience. Developing computational models that explain this ability has been a major challenge in cognitive science. Existing approaches have been successful at explaining how humans generalize rapidly in controlled settings but are usually too restrictive to tractably handle naturalistic data. We show that learning from limited naturalistic data is possible with an approach that bridges the divide between two popular modeling traditions: Bayesian models and neural networks. This approach distills a Bayesian model's inductive biases (the factors that guide generalization) into a neural network that has flexible representations. Like a Bayesian model, the resulting system can learn formal linguistic patterns from limited data. Like a neural network, it can also learn aspects of English syntax from naturally occurring sentences. Thus, this model provides a single system that can learn rapidly and can handle naturalistic data.

Children can learn language from very little experience, and explaining this ability has been a major challenge in cognitive science. Here, the authors combine the flexible representations of neural networks with the structured learning biases of Bayesian models to help explain rapid language learning.