On 9 April at 11:00, Hele-Andra Kuulmets will defend her doctoral thesis "Cross-Lingual Transfer Learning and Evaluation in Low-Resource Settings" to obtain the degree of Doctor of Philosophy (in Computer Science).
Supervisor:
Prof. Mark Fišel, University of Tartu (Estonia)
Opponents:
Prof. Barbara Plank, Ludwig-Maximilians-Universität München (Germany)
Dr. Jindřich Helcl, University of Oslo (Norway)
Summary:
Artificial intelligence is becoming ever more common in our daily lives, and one of its key components is the language model: the part that enables AI to understand the subtle nuances of human language. For a language model to understand a language well, it must be trained on massive collections of text. These collections need to be so large that, for most of the world's languages, such volumes of written text simply do not exist. As a result, AI systems remain limited in many languages, which in turn can lead to a decline in the use of those languages.

Fortunately, there is a solution for languages with smaller text resources. It has been observed that when language models are trained on many different languages at the same time, they also become better at understanding the languages that are less represented in the training data. This phenomenon is called cross-lingual knowledge transfer: the model learns to apply knowledge acquired from, for example, English texts when responding in Estonian. On a more technical level, multilingual training makes the mathematical representations of different languages inside the model more similar to one another, and this similarity is what enables knowledge to be transferred between languages.

This doctoral thesis investigates how to strengthen cross-lingual transfer in language models, with the goal of improving their ability to understand Estonian. The work is divided into two parts: the first examines methods for using smaller language models to solve specific tasks, and the second looks at how to teach Estonian to a large language model that was originally trained mainly on English texts. The main conclusion is that multilingual training, even when using only synthetic data, can significantly improve a model's ability to perform various tasks in Estonian. This finding shows that, skilfully applied, cross-lingual transfer can help ensure better representation of smaller languages in the world of AI.
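To make the idea of "similar representations" concrete, the short sketch below (illustrative only, not taken from the thesis) uses the sentence-transformers library and the public multilingual model paraphrase-multilingual-MiniLM-L12-v2 to show that an English sentence and its Estonian translation land close together in a shared vector space, while an unrelated sentence does not:

```python
# Illustrative sketch, not from the thesis: how close are the representations
# of an English sentence and its Estonian translation in a multilingual model?
# Assumes the sentence-transformers library and the public model
# "paraphrase-multilingual-MiniLM-L12-v2".
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

english = "The weather is beautiful today."
estonian = "Ilm on täna ilus."  # Estonian translation of the English sentence
unrelated = "The stock market fell sharply."

# Encode all three sentences into the model's shared embedding space.
emb = model.encode([english, estonian, unrelated])

# Translation pairs end up with high cosine similarity; unrelated pairs do not.
print("EN vs ET (translation):", util.cos_sim(emb[0], emb[1]).item())
print("EN vs EN (unrelated):  ", util.cos_sim(emb[0], emb[2]).item())
```

It is this kind of closeness between languages in the model's internal space that allows knowledge learned from high-resource languages to carry over to low-resource ones.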
The defence can also be followed via Zoom (Meeting ID: 993 6373 0402, passcode: ati).