[ EuroLangNet 21+3 ] European network 21+3 for HLT support of 
Multilingual Knowledge Based Processes 

ICT 17-2014: Cracking the language barrier
Slovak University of Technology
The pilot, robust network will create a shared platform for 
multilingual resources, tools and applications, and for benchmarking 
activities tailored to the MT&LR area within multilingual knowledge 
based processes in the fields of materials science and technology, 
industrial testing, technical standardization, occupational health and 
safety, environmental protection and general R&D topics. Within the 
MT&LR area, the focus will be on investigating how to transfer and 
integrate the content of global and EU multilingual datasets and 
national corpora into the personalized knowledge based processes that 
European individuals perform daily to remain sustainable and 
competitive in their jobs. This is planned to be achieved via 
monitoring the state of the art, best practice exchange, assistance 
with multidisciplinary MT research, testing and evaluation of existing 
personalized solutions, and their adaptation to individual users; all 
of this in close synergy with major multilingual European platforms, 
networks, and the national policies and programs of the Consortium 
members. 

An individual focus will be on developing categories of keyword sets 
that would enable "switching" between all 24 European languages via 
batch knowledge retrieval from the Internet and advanced multilingual 
search in the various categories of knowledge based processes that 
require automatic or half-automatic MT&LR support when running on 
(i) a personal cloud, (ii) desktop client computers and (iii) a 
university server.
 This is intended to address the existing lack of interoperability in 
technical and non-technical fields (as is natural for individuals), as 
well as harmonization in the framework of multidisciplinary research. 

A special focus will be on testing a new paradigm of multilingual 
support based on a prepared patent application, including the 
implementation of an in-house developed personal software tool. This 
will make it possible to integrate basic elements of human language 
technologies, such as natural language processing, speech technology, 
machine translation, and information and knowledge processing, into 
knowledge based processes. This bottom-up integration of human 
language technology into individuals' processes, especially in 
relation to implementing higher-quality automatic MT, should give 
synergic added value to human-computer interaction, or to a higher 
level of the knowledge economy in general. It also involves issues 
such as exploring whether the existing European datasets and corpora 
are even suitable for individuals managing multilingual knowledge in 
real technical practice under global market conditions (e.g. this 
requires researchers to explore their compatibility with technical 
standards terminology, patent classification systems, etc.).

Proposal Outline: 

The title 21+3 means that the network will cover 21 referenced 
languages + 3 languages with a high level of MT (for best practice and 
experience exchange), i.e. our intention is to have at least 21+3 
participants in the project Consortium. The concrete idea is to work 
in "multi-pairs", as described in the following text.
 From the Slovak language point of view, all common knowledge based 
processes are in principle multilingual, given global or European 
market conditions (see http://www.meta- Moreover, from the ICT point 
of view, these processes are uncertain and work with unstructured 
data. According to our findings in research on technology-enhanced 
learning (including activities for the FP7 Consortiums KEPLER /2007 
and L3Pulse /2013), automating such processes requires solving three 
autonomous categories in parallel: (i) modelling the processes so that 
they are computerizable, (ii) developing tools and applications, here 
automatic machine translation, knowledge processing in natural 
language, text to speech and speech recognition, and (iii) automating 
work on desktop client computers and networks (clouds, servers), for 
instance adaptation to the operating system, data transfer, conversion 
of text, image and multimedia formats, etc. In addition, one should 
consider that in real practice all multilingual knowledge based 
processes consist of sequences of sub-processes / steps, of which only 
a part requires automatic or half-automatic MT. Thus, a combination of 
the issues mentioned should be built into the workpackage structure 
when planning any project.

In view of the above, we consider the following potential workpackage 
structure as a benchmark background for, or a part of, a research or 
innovation project (e.g. as an autonomous workpackage), with a focus 
on automatic and half-automatic translation within multilingual 
knowledge based processes:

WP1 State-of-the-art in MT&LR / HLT (Human Language Technologies)
• Exploring existing multilingual processes in global market (categories, 
performers, application areas)
• Exploring disposable multilingual applications/services, datasets, 
national corpora and European databases and their suitability for 
• Evaluation - selection of main processes (which should be selected 
for solving in the workpackages) 

WP2 Modelling / modifying processes using multilingual knowledge to 
be suitable for automation
• Multilingual knowledge base design in relation to domain content
• Unification and design of processes to be computerisable
• Evaluation of multilingual resources, knowledge base and processes 
for subsequent computerisation (mainly via MT)

WP3 Case studies/Modelling/Testing Informatics Tools and 
Applications for MT&LR (HLT in general)
• Cloud computing applications practice
• Combined off-line and online applications (testing the batch 
knowledge processing paradigm)
• Recommended system for suitable large/big datasets
• Design of knowledge sets in multi-formats for self-regulated processes
 (note: combined text - image - audio - video formats)
• Exploring suitability of MT, text-to-speech, speech recognition and 
emerging technologies regarding personal processes
• Testing existing / developed MT solutions within automation of 
processes running on personal computers, clouds and networks 

WP4 Automation of Multilingual Knowledge Based Processes in 
Natural Language (MT&LR)
• Testing computerisation of unified processes and domain content
• Implementing cloud- and server-based multilingual applications
• HLT resources: performing multilingual WEB-monitoring, advanced 
search and retrieval, including testing Internet services
• Transfer of scientific heritage via multilingual approach into education 
and training
• Comparison of suitability and quality level of MT, text-to-speech, 
speech recognition and emerging technologies 

WP5 Multilingual Benchmarking Portal
• Design of cooperation portal
• Benchmarking, communication, best practices and information 
• Results dissemination 

WP6 Project management 

Our ICT 17 project idea is based on a 1 + 23 approach, i.e. on 
automatic MT from the Slovak language to all EU languages and back 
(similarly, e.g., automatic MT from Cz to 23 languages), including an 
intention to transfer positive experience from the three major MT 
languages (En, Fr, Es).
 We are at the beginning of developing a multilingual "Switcher" 
which will switch from one European language to the other 23 
languages. This enables us to address the automation of multilingual 
knowledge based processes; automatic machine translation will thus be 
a sub-part of the overall automation. The work will focus on the 
technical fields mentioned (materials science and technology, 
technical standards and industrial testing, environmental protection, 
occupational safety and health, R&D activities, ...). This supposes 
three categories of selected multilingual resources for automatic MT: 
(i) European and global technical databases, datasets and 
repositories (e.g. patent and standards databases, chemical 
databases, etc.), (ii) official multilingual datasets like CEF and 
national corpora, maybe also so-called "Big Data" in connection with 
ICT 15 call results, (iii) general EU non-technical sources such as 
CORDIS or (e.g. nowadays, environmental laws are already at disposal 
in all EU languages). Of course, there is a large scale of other 
mono- or bilingual sources at 

Note: Example of a concrete action. One example of the use of the 
"Switcher" for a knowledge based process: "Proposing an invention and 
patent application". 

This process requires sequences of sub-processes to be automated, 
e.g. an individual search for patents in the European database 
Espacenet, which offers more than 80 million documents --- external 
retrievals are made in parallel by the National Patent Office or a 
scientific-technical information center from some world patent 
databases --- commonly the first result is hundreds of patent 
abstracts in English, so a Slovak researcher must read and evaluate 
them in English and compare whether his invention is "new" (very high 
cognitive load) --- at the end he must also write a report in English, 
and if the patent application is intended for the EU it must be 
translated, etc. This is one of the simplest examples of how the 
"Switcher" could be used. A more complicated case arises when 
switching from Slovak to 23 EU languages and searching in, e.g., ten 
significant recommended EU datasets, corpora and internet databases 
(the result is 23 x 10 = 230 hits). This example illustrates a very 
significant point: automatic MT for an individual user must be 
strictly tailor-made. If not, the number of hits would be extremely 
high and not suitable for common use. Therefore a system of keywords 
must be developed for each MT application area. Thus, personal 
support is radically different from the case where a research group 
develops a solution such as a national corpus. It requires another 
ICT approach. This example also demonstrates that in real practice 
one must investigate whether the existing multilingual European 
datasets, corpora and databases are suitable for sharing within 
automatic MT, or rather explore how these sources and repositories 
could be used for personalized automatic MT.
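
The 23 x 10 fan-out described above, and the effect of tailoring it, 
can be sketched in a few lines of Python. This is only an illustration 
under stated assumptions, not the "Switcher" itself: the source names, 
the profile structure and the functions plan_queries and tailor are 
hypothetical, and the text does not name the ten sources or define the 
keyword system.

```python
from itertools import product

# The 23 official EU languages other than Slovak (ISO 639-1 codes).
TARGET_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sl", "sv",
]

# Hypothetical placeholders for the ten recommended datasets, corpora
# and internet databases; the real sources are not named in the text.
SOURCES = [f"source_{i:02d}" for i in range(1, 11)]


def plan_queries(languages, sources):
    """Return every (language, source) combination: the untailored fan-out."""
    return list(product(languages, sources))


def tailor(queries, profile):
    """Keep only the combinations allowed by a per-area profile.

    `profile` is an assumed structure: the languages and sources that a
    given application area (e.g. patent search) actually needs.
    """
    return [
        (lang, src)
        for lang, src in queries
        if lang in profile["languages"] and src in profile["sources"]
    ]


full = plan_queries(TARGET_LANGUAGES, SOURCES)
print(len(full))  # 230

# Hypothetical profile for one application area.
patent_profile = {
    "languages": {"en", "de", "fr"},
    "sources": {"source_01", "source_02"},
}
tailored = tailor(full, patent_profile)
print(len(tailored))  # 6
```

Even this toy profile reduces 230 language-source combinations to 6, 
which is the reason given above for developing a keyword system per MT 
application area.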

Contact person: Stefan Svetsky +421 949 541835

Human Language Technologies
automatic translation
technology-enhanced learning
Digital content
learning analytics
speech recognition
text processing
knowledge mining
knowledge processing
Human Computer Interaction

Required skills and Expertise: 

[Project leader]

Skills and expertise in project management, including scientific, 
financial, risk, IPR and gender management, and all relevant issues 
according to the Horizon 2020 manuals. 

Skills and expertise in the field of Human Language Technologies 
targeting MT&LR; someone able to lead and coordinate an ICT 17 => CSA, 
or a multidisciplinary research project or innovation project (where 
the robust EuroLangNet 1+23 network or our team could assist, e.g. 
via leading a workpackage in context with the subject and project 

[Partners for consortium members]

At least one partner from each EU country with skills and expertise 
for supporting network activities according to the subject and 
project description.

Description of work to be carried out by the partner(s) sought: 

Work to be carried out by the partners is described in the subject 
and project description and depends on the draft workpackage 
structure. Thus, each partner should find his place or perform similar 
activities. It also relates to the partners' previous activities, 
conditions and references; see e.g. our case:

We have experience mostly with CSA (FP7 to FP5) in relation to 
scientific-technical topics, especially with multilingual issues 
within knowledge based processes performed in the global industrial 
market, as well as in the educational area (within didactics-driven 
technology-enhanced learning we had to implement several IT 
disciplines). We also work in national standardization ISO/CEN 
committees and the ICT society, in which terminological language 
issues are solved (En / De / Fr to Sk), including experience with 
translating ISO/CEN standards into Slovak standards in the 
engineering field (e.g. corrosion protection). 

We are able to work on and lead individual workpackages or topics 
within a research / innovation project or CSA. However, we do not 
have the capacity and skills for writing complete proposals and for 
project management. On the other hand, we need HLT support, e.g. 
automatic MT within daily performed multilingual knowledge based 
processes. For this purpose, we have developed an infrastructure, 
tools and in-house software, and tested all categories of HLT in 
these processes. For instance, we cover knowledge processing in 
natural language, but we found that current speech technologies are 
not yet suitable for common personal use (e.g. as a Slovak I have 
very, very low success in speech recognition of spoken English, 
therefore I do not use Nuance software at all).

Concerning MT, I have developed a simple thesaurus-based support for 
writing scientific papers, or we use Google translator and Systran; 
however, it has no Slovak translation. Making a simple word-by-word 
translation is not a problem, it is only a question of bilingual 
datasets; but why should we develop it when many local or global 
solutions exist?

Type of partner(s) sought: 

R&D institutes, universities, high-tech companies, international 
consortiums, etc. that focus on human language technologies, 
especially on automatic machine translation, with an interest in 
implementing this in multilingual knowledge based processes 
(technical, educational, research, innovative, market activities and 
mental processes of experts and common users in general).

Although this EoI is written for a CSA, we are looking for a 
leader / coordinator who intends to write, for ICT 17:
 a) research project
 b) innovation project
 c) CSA.

Looking for a Coordinator for your proposal: 

