Inductive biases for learning natural language

The thesis was published by Abnar, S. in January 2023 at the University of Amsterdam.

Abstract:

A fascinating, long-standing question about human intelligence is which learning biases enable humans to process natural language. At the same time, a significant challenge in building machine learning models for language is identifying inductive biases that are necessary or useful for efficient and generalizable learning: we need both to identify these inductive biases and to incorporate them into the models. To advance toward this goal, we introduce methods and design experiments to study the inductive biases of neural network language models, exploring various techniques to illustrate how inductive biases shape the solutions these models converge to. Our findings include: (1) Different choices in designing neural networks lead to solutions with different characteristics. Some factors, such as training objectives and connectivity patterns, can lead to more divergent solutions, while other factors, such as scaling model size, may have less impact on the final solutions. (2) Recurrence plays a significant role in facilitating the learning of structures needed to solve language tasks in a way more similar to the human brain. We identify and empirically evaluate different sources of inductive bias in recurrent neural networks, including sequentiality, a memory bottleneck, and parameter sharing over time. (3) Distilling knowledge from one model to another sheds light on differences in the models' inductive biases and expressivity, and knowledge distillation can transfer some of the effects of the teacher model's inductive biases to the student model.
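
To make finding (2) concrete, here is a minimal sketch (ours, not from the thesis; class and parameter names are illustrative) of where the three named sources of inductive bias live in an Elman-style recurrent network: tokens are consumed strictly in order, all history is compressed into one fixed-size hidden vector, and the same cell weights are reused at every step.

```python
import torch
import torch.nn as nn

class MinimalRNN(nn.Module):
    """Elman-style RNN highlighting three inductive biases:
    sequentiality, a fixed-size memory bottleneck, and
    parameter sharing over time."""

    def __init__(self, vocab_size: int, embed_dim: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One cell whose weights are reused at every time step
        # (parameter sharing over time).
        self.cell = nn.RNNCell(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)
        self.hidden_dim = hidden_dim

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids
        batch, seq_len = tokens.shape
        # All history must be squeezed into this fixed-size vector
        # (memory bottleneck).
        h = torch.zeros(batch, self.hidden_dim)
        logits = []
        # Tokens are consumed strictly left to right (sequentiality);
        # step t sees the past only through h, never directly.
        for t in range(seq_len):
            h = self.cell(self.embed(tokens[:, t]), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)  # (batch, seq_len, vocab)

model = MinimalRNN(vocab_size=100, embed_dim=32, hidden_dim=64)
print(model(torch.randint(0, 100, (2, 7))).shape)  # torch.Size([2, 7, 100])
```

A Transformer, by contrast, relaxes all three constraints: every position can attend to every other, so whatever structure an RNN is forced to build into its hidden state must instead be learned or imposed another way.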
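Finding (3) builds on knowledge distillation. As a sketch of the generic (Hinton-style) objective being referred to (again illustrative code, not the thesis's exact setup), the student is trained against both the gold labels and the teacher's temperature-softened output distribution; it is this soft-label term that can carry effects of the teacher's inductive biases over to the student:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Hard-label term: ordinary supervised cross-entropy.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-softened
    # teacher and student distributions. The full distribution,
    # not just the argmax, reflects the teacher's inductive biases.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    # The T^2 factor keeps soft-term gradients on a comparable
    # scale across temperature settings.
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kl

# Example: a batch of 4 examples with 10 classes.
s = torch.randn(4, 10, requires_grad=True)
t = torch.randn(4, 10)
y = torch.randint(0, 10, (4,))
print(distillation_loss(s, t, y))
```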


