ABSTRACT
This paper presents an overview of the state of the art in natural language processing, exploring one specific computational architecture, the Transformer model, which plays a central role in a wide range of applications. This architecture condenses many advances in neural learning methods and can be exploited in many ways: to learn representations for linguistic entities; to generate coherent utterances and answer questions; to perform utterance transformations, machine translation being one illustration; and to serve as the backbone of versatile chatbots that can answer queries and perform, on demand, a large variety of tasks. These facets of the architecture are presented in turn, which also allows us to discuss its limitations.
AUTHOR
François YVON: Research Director, Sorbonne University, CNRS, ISIR, France
INTRODUCTION
Language technologies feature prominently among the applications of artificial intelligence (AI) and are now reaching the general public. They are essential for effectively accessing textual information available on the web or in large document databases. They enable new forms of interaction with machines, whether through voice commands or via tools that assist with typing or writing. They also facilitate communication with other humans, for example through machine translation systems. Behind the scenes, these algorithms organize and filter the vast amount of text and audio recordings circulating on the web and social media. In this way, they are transforming how this data is managed.
This transition has accelerated as these technologies have become more powerful, enabling an ever-wider and more diverse range of applications. These advances stem from a combination of several factors: on the one hand, the development of increasingly sophisticated machine learning algorithms capable of leveraging improvements in computing hardware; on the other hand, access to vast amounts of textual data, whether annotated or unannotated, from which these algorithms can learn.
Among these algorithms, neural networks, and particularly the Transformer architecture, take center stage. This architecture has become essential for performing three types of processing that previously required dedicated components. First, text mining and information retrieval algorithms benefit from the rich internal representations computed by this model. Second, linguistic analysis algorithms leverage the Transformer's ability to account for very long-range dependencies. Finally, text generation algorithms use these models primarily for their predictive capabilities.
Furthermore, this architecture is well-suited for processing spoken and even multimodal data, and enables efficient large-scale computations. It’s easy to see why this model has become the language engineer’s go-to tool.
KEYWORDS
natural language processing | machine learning | language models | neural machine translation
Transform: Neural Networks for Natural Language Processing