New language tech tools developed for Slovenian language
Ljubljana, 7 December - New language technology tools for the Slovenian language, such as machine translation for the Slovenian-English pair, speech-to-text, and various semantic technology applications, have been developed as part of a project designed to equip the Slovenian language for the digital age.
These are just some of the results of Development of Slovene in a Digital Environment, a EUR 3.3 million project involving 120 researchers that was presented in Ljubljana on Wednesday.
The tools will be available for test use in the coming months. The entire source code and databases will be publicly available under an open source license, and all apps will be available on the public portal of the project.
Ljubljana University Rector Gregor Majdič said that as the world was digitising at a rapid pace, linguistics needed well-functioning technological tools, which required extensive and good language resources.
For this purpose, the eleven partners in this demanding project - universities, institutes, state institutions and companies - have developed a multitude of tools, online services and language resources, he added.
Simon Krek from the Centre for Language Resources and Technologies noted that the project, which started in May and will officially conclude in February 2023, was also important for companies and the general public.
Krek also presented an overview of the situation in language resources and technologies in Europe as part of the European Language Equality project, which shows that Slovenia lags far behind many European countries.
In terms of various technological indicators, only the Latvian, Lithuanian, Croatian, Slovak, Irish and Maltese languages rank lower than Slovenian, he said, noting that a shortage of financing was one of the main reasons.
The other reasons are the relatively small number of language technology companies and a gap in education, since there was no study programme in this field until this year, when a digital linguistics programme was introduced.
Krek also noted that open data was a challenge during the project. "It turned out that an obstacle for having something that is open is that we simply don't know who the owner is," he said, calling for appropriate legislative changes.
He added that the solutions developed in the project needed to be maintained, that further development should be planned, and that the project should be transformed into a programme that would provide further funding.
One of the partners in the project is the Slovenian Press Agency (STA), which mostly acts as an evaluator of the products and has provided some of the input data.