Partner search for FP7 projects

Partner search

Inclusive Machine Translation for the EU

Inclusive-MT focuses on the role of language in the two-way 
interactions between technology and society, aiming to 
close a gap that excludes minority languages and 
morphologically complex languages. Inclusive-MT achieves 
this by implementing rule-based machine translation (MT) 
tools that are appropriate for populations in the EU currently 
underserved by prevailing statistical MT resources.

ICT 17-2014: Cracking the language barrier
UiT The Arctic University of Norway
Department of Language and Linguistics

Laura Janda
Proposal Outline: 
            Language is a crucial factor in human identity and 
culture, with fundamental implications for economic and 
civil security. People who speak languages overlooked by 
mainstream services face injustices in access to information 
and connectivity. Thus opportunities for social mobility, 
trade, and the building of mutual trust are lost. Inclusive-MT 
delivers tools that remove information-age inequities and 
thereby promotes fundamental democratic values in 
European society. We focus primarily on the Barents and 
Baltic regions where the challenges are acute, although 
other parts of Europe are also in our purview.
            Inclusive-MT is a multidisciplinary research project 
that delivers seamless machine translation services for 
underserved populations in the European digital market. 
Linguists, computer scientists, programmers, and SMEs 
collaborate to study the use of language resources and 
provide machine translation coverage extendable to EU 
languages, regardless of number of speakers, grammatical 
structure, and lexical complexity. Inclusive-MT builds on the 
ground-breaking successes of Giellatekno, the Saami Language 
Technology Center at the Arctic University of Norway, and its 
partners at the University of Tartu (Estonia), the University of 
Helsinki (Finland), and the University of Alacant (Spain). 
Partner SMEs can include Morphologic (Estonia), Prompsit 
(Spain), and Kaldera (Norway). This project places special 
focus on minority languages such as the Saami languages 
and morphologically complex languages that consequently 
have “weak or no” machine translation support (cf. META-NET 
language white papers), such as Estonian, Finnish, and 
            The “small” languages of the EU are particularly 
poorly served by current MT systems since the training data 
they require cannot be feasibly obtained and the 
grammatical structures of minority languages are often 
highly complex and radically different from English and 
other benchmark languages of such systems. Inclusive-MT 
thus has an ideal testing ground for developing translation 
systems that overcome the challenges of extreme linguistic 
differences and small, underserved populations.
            Inclusive-MT serves language pairs that are not 
represented in the Europarl matrix, breaking ground in two 
dimensions: a) by including Russian we provide MT for a 
major neighbor to the EU, and b) by providing MT for 
morphologically complex languages, we move away from 
the West European bias of using English as a hub language. 
Examples of the types of linguistic challenges that Inclusive-
MT solves include radical differences in the structure of 
gender, aspect, case, and verbal agreement. We represent 
minority and morphologically complex languages on their 
own terms and provide our tools free of charge, thus leveling 
the playing field for all users regardless of size, linguistic 
features, or economic resources.
            Inclusive-MT studies language resource behaviors, 
particularly on mobile devices, and strategically targets 
usage domains in implementation, such as:
            Use of social media in both private and corporate 
communication and adaptation of MT tools for these 
            Civil status of Russians living in Estonia and other 
EU countries;
            Visibility and status for the Saami languages in 
Norway, Sweden, Finland, and Russia.
            The solutions provided by Inclusive-MT are portable 
to any language, regardless of its morphological and 
syntactic complexity and divergence from lingua franca 
languages like English. We tackle typical translation 
obstacles such as compounding, word order, and differences 
in terms of analytic vs. synthetic packaging of meaning. 
Rule-based MT does not demand huge parallel corpora as a 
prerequisite, thus removing a limitation that has kept 
minority languages shut out of the machine translation 
market. Thanks to the plasticity of this project and a 
commitment to provide open-source and open-access 
products, Inclusive-MT can play a major role in protecting 
the linguistic heritage and rights of EU citizens and their 
global neighbors.
            Inclusive-MT is a value-added translation system 
that additionally supports language users, learners, and 
researchers. Because rule-based machine translation 
undertakes grammatical and lexical analysis, its results can 
serve as input to electronic dictionaries and learning 
modules. Lemmatized electronic dictionaries are essential 
for languages with complex morphology, where the 
inflected form of a word may differ radically from its 
dictionary headword; for example, Giellatekno’s online 
dictionary of North Saami can locate a noun from input of 
any of its 130 inflected forms. The linguistic analysis of a 
rule-based system can feed Intelligent Computer Assisted 
Language Learning (ICALL) resources to support real human 
communication across language borders. Multipurposing 
thus makes rule-based MT the most efficient choice for 
integration of digital and live communication. Furthermore, 
this analysis can be used by linguists to extract significant 
trends from language corpora, thus strengthening language 
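            The lemmatized lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Giellatekno's implementation: the names PARADIGMS, build_index, and lookup are hypothetical, the data is a hand-listed handful of Finnish noun forms, and a production system would derive the forms from a morphological analyser rather than a table.

```python
from collections import defaultdict

# Toy paradigm table: lemma -> a few of its inflected forms.
# Illustrative Finnish data ("talo" = house, "katu" = street); hypothetical,
# hand-listed, and far smaller than a real paradigm.
PARADIGMS = {
    "talo": ["talo", "talon", "taloa", "talossa", "talosta", "taloon"],
    "katu": ["katu", "kadun", "katua", "kadulla", "kadulta", "kadulle"],
}


def build_index(paradigms):
    """Map every inflected form to the set of lemmas it can belong to."""
    index = defaultdict(set)
    for lemma, forms in paradigms.items():
        for form in forms:
            index[form.lower()].add(lemma)
    return index


def lookup(index, word):
    """Return the sorted candidate lemmas for an input word form."""
    # .get() avoids creating empty entries for unknown words.
    return sorted(index.get(word.lower(), ()))


INDEX = build_index(PARADIGMS)

print(lookup(INDEX, "kadulla"))  # ['katu']
print(lookup(INDEX, "Talossa"))  # ['talo']
print(lookup(INDEX, "xyz"))      # []
```

            The form "kadulla" shares only its first two letters with the headword "katu", which is why a plain string match against headwords is not enough for such languages.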
            Inclusive-MT is not a proof-of-concept project. It 
continues the trajectory of success laid out by its partners, 
who have already developed functional bidirectional MT 
tools for Norwegian (both Bokmål and Nynorsk) and North 
Saami and have parallel tools under development for 
Estonian, Finnish, and Russian. The track record of 
Giellatekno and its partners guarantees that results will be 
achieved, yielding robust MT systems that serve the 
languages targeted by this project.

User empowerment
rule-based machine translation

Required skills and Expertise: 

We are open to partnerships with experts in both rule-based 
and statistical machine translation.
It is also possible to partake in more general projects 
involving equity of access and human identity in relation to 
connectivity and communication.

Description of work to be carried out by the partner(s) sought: 

Research comparing access and use of machine translation 
among minority vs. majority populations.
Research on use of language and its relation to identity and 
Research on linguistic differences between minority and 
majority languages in Europe.

Type of partner(s) sought: 
Both academic institutions and SMEs, provided there is a 
commitment to open-source and open-access products.

Looking for a Coordinator for your proposal: 
