On 7 November at 12:00 Maksym Del will defend his thesis "Multilingual and multi-domain representational patterns across transformer-based models" to obtain the degree of Doctor of Philosophy (in Computer Science).
Supervisor:
Professor Mark Fišel, University of Tartu
Opponents:
Professor Anders Søgaard (University of Copenhagen, Denmark)
Associate Professor Mathias Creutz (University of Helsinki, Finland)
Summary
Artificial Intelligence (AI) models often work like mysterious black boxes: they take data and generate predictions, but their internal workings are hidden. Interpreting these AI networks is akin to exploring the workings of a complex biological or alien brain. This lack of transparency makes it hard to trust these models, as we cannot be sure they are safe, fair, or reliable. For example, a model that works well in one language might fail in another.
Our research aims to make AI models more understandable, with a focus on multilingual and multi-domain models. We discovered two key phenomena in Transformer-based models: multilingual abstraction, where models learn to convert sentences into a "shared mental language" regardless of whether the input is in Estonian or English, and multi-domain specialization, where models learn to dedicate separate internal tools to each domain. These patterns were consistent across various models and datasets.
While our main aim is to provide insights into the inner workings of multilingual and multi-domain models, we also introduce a new methodology for interpreting multilingual models and present a practical application that improves multi-domain machine translation. We hope that these insights can help enhance the safety, fairness, and accessibility of AI technology, especially for underrepresented languages and domains.