Transformer: neural networks for automatic language processing
Archive REF: IN195 V1


Author: François YVON

Publication date: March 10, 2022 | Review date: November 20, 2024

Overview

ABSTRACT

This paper presents an overview of the state of the art in natural language processing, focusing on one specific computational architecture, the Transformer model, which plays a central role in a wide range of applications. This architecture condenses many advances in neural learning methods and can be exploited in many ways: to learn representations of linguistic entities; to generate coherent utterances and answer questions; and to perform transformations of utterances, one illustration being machine translation. These different facets of the architecture are presented in turn, which also allows us to discuss its limitations.


AUTHOR

  • François YVON: Research Director, Université Paris-Saclay, CNRS, LISN, France

INTRODUCTION

Linguistic technologies figure prominently among the applications of artificial intelligence (AI) and are now reaching the general public. They are essential for efficiently accessing textual information available on the Web or in large document databases; they enable new forms of interaction with the machine, by voice or through input and writing aids; they help us communicate with other humans, for example through machine translation systems; and, less visibly, these algorithms structure, organize, filter, select, and transform the masses of text and audio recordings that continually circulate on the web and on social networks, making them manageable.

This transition has accelerated as these technologies have become progressively more powerful across ever wider and more varied uses. This progress results from a combination of factors: on the one hand, the development of increasingly sophisticated machine learning algorithms, capable of taking advantage of improved computing hardware; on the other, access to very large masses of textual data, annotated or unannotated, with which to carry out this learning. Among these algorithms, neural approaches, and in particular the Transformer architecture, are at the forefront. This architecture has in fact become central to three types of processing that previously required dedicated architectures:

  • text mining and information retrieval algorithms, which benefit from the richness of the internal representations computed by this model;
  • linguistic analysis algorithms, which take advantage of the Transformer's ability to capture very long-distance dependencies;
  • text generation algorithms, which use these models primarily for their predictive capacity.

Add to this the fact that the same architecture also lends itself to the processing of speech and even multimodal data, and that it enables efficient computation at very large scale, and it is easy to see why this model has become the Swiss Army knife of the language engineer.
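The ability to capture long-distance dependencies mentioned above comes from the self-attention mechanism at the heart of the Transformer: every position in a sequence can attend directly to every other position, regardless of distance. The following is a minimal NumPy sketch of scaled dot-product self-attention; the dimensions, random initialization, and function name are illustrative, not taken from any particular implementation.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X.

    X:  (n_tokens, d) matrix, one row per token.
    Wq, Wk, Wv: (d, d) projection matrices for queries, keys, and values.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens into query/key/value spaces
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise similarity between ALL positions
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability before softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                          # each output mixes information from every position

# Toy example: 5 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
n_tokens, d = 5, 8
X = rng.standard_normal((n_tokens, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Y = self_attention(X, Wq, Wk, Wv)   # Y has shape (5, 8), same as X
```

Because the attention weights are computed between all pairs of positions, a dependency between the first and last token of a long sentence costs no more to model than one between adjacent tokens, which is precisely what the linguistic analysis applications above exploit.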

Key points

Domain: Transformers for automatic language and speech processing

Degree of technology diffusion: Growth

Technologies involved: Machine learning, neural networks

Applications: Machine translation, information retrieval, dialogue systems, voice transcription, etc.

Main French players:

Competence centers: INRIA Centre de Paris, Grenoble Computer Science Laboratory (Grenoble Alpes University and CNRS), Interdisciplinary Digital Science Laboratory (Paris Saclay University and CNRS), LIP6 (Sorbonne University and CNRS), Computer Science and Systems Laboratory...



KEYWORDS

natural language processing   |   machine learning   |   language models   |   neural machine translation
