ENRICH4ALL: A first Luxembourgish BERT Model for a Multilingual Chatbot

Authors

Anastasiou D., Ion R., Badea V., Pedretti O., Gratz P., Afkari H., Maquil V., Ruge A.

Reference

1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, SIGUL 2022 - held in conjunction with the International Conference on Language Resources and Evaluation, LREC 2022 - Proceedings, pp. 207-212, 2022

Description

Machine Translation (MT)-powered chatbots are not yet established, but we see a promising future for them: they can break language barriers and enable conversation in multiple languages without time-consuming language model building and training, particularly for under-resourced languages. In this paper, we focus on the under-resourced Luxembourgish language. We describe an experiment with a manually created dataset of administrative questions, designed to give a multilingual chatbot BERT question answering (QA) capabilities. The chatbot supports the visual creation of dialog flow diagrams (through an interface called BotStudio), in which each dialog node manages the user's question at a specific step. A dialog node can be matched to the user's question by a BERT classification model, which labels the question with a dialog node label.
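
The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the BotStudio interface: the node labels, the `route` function, the confidence threshold, and the fallback node are all hypothetical, and the BERT classifier is replaced by a toy keyword scorer so that the example runs without any model. Only the overall shape (a classifier assigns a dialog node label to the question, and the chatbot jumps to that node) comes from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# A classifier maps a question to (dialog node label, confidence in [0, 1]).
# In the paper this role is played by a BERT classification model.
Classifier = Callable[[str], Tuple[str, float]]


@dataclass
class DialogNode:
    label: str     # label the classifier predicts for this node
    response: str  # what the chatbot says at this step


def toy_classifier(question: str) -> Tuple[str, float]:
    """Stand-in for the BERT classifier: scores labels by keyword overlap."""
    keywords = {  # hypothetical labels and keywords
        "passport_renewal": {"passport", "renew"},
        "residence_certificate": {"residence", "certificate"},
    }
    tokens = set(question.lower().replace("?", " ").split())
    scores = {
        label: len(tokens & words) / len(words)
        for label, words in keywords.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


def route(
    question: str,
    nodes: Dict[str, DialogNode],
    classify: Classifier,
    threshold: float = 0.5,    # assumed cut-off, not from the paper
    fallback: str = "fallback",
) -> DialogNode:
    """Return the dialog node whose label the classifier assigns."""
    label, confidence = classify(question)
    if confidence < threshold or label not in nodes:
        label = fallback  # low confidence or unknown label
    return nodes[label]


NODES = {
    "passport_renewal": DialogNode("passport_renewal", "Passport renewal steps."),
    "residence_certificate": DialogNode(
        "residence_certificate", "Residence certificate steps."
    ),
    "fallback": DialogNode("fallback", "Could you rephrase your question?"),
}

if __name__ == "__main__":
    print(route("How do I renew my passport?", NODES, toy_classifier).response)
```

Because `route` only depends on the `Classifier` signature, the toy scorer could be swapped for a real model wrapper without changing the dialog logic.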
