Transformers: Neural Networks for Natural Language Processing

Overview

ABSTRACT

This paper presents an overview of the state of the art in natural language processing through one specific computational architecture, the Transformer model, which plays a central role in a wide range of applications. This architecture brings together many advances in neural learning methods and can be used in several ways: to learn representations of linguistic entities; to generate coherent utterances and answer questions; to transform utterances, machine translation being one example; and to serve as the backbone of versatile chatbots that can answer queries and perform a large variety of tasks on demand. These facets of the architecture are presented in turn, which also allows us to discuss its limitations.

AUTHOR

François YVON: Research Director, Sorbonne University, CNRS, ISIR, France

INTRODUCTION

Language technologies feature prominently among the applications of artificial intelligence (AI) and are now reaching the general public. They are essential for effectively accessing textual information available on the web or in large document databases. They enable new forms of interaction with machines, whether through voice commands or via tools that assist with typing or writing. They also facilitate communication with other humans, for example through machine translation systems. Behind the scenes, these algorithms organize and filter the vast amount of text and audio recordings circulating on the web and social media. In this way, they are transforming how this data is managed.

This transition has accelerated as these technologies have become more powerful, enabling an ever-wider and more diverse range of applications. These advances stem from the combination of two factors: on the one hand, increasingly sophisticated machine learning algorithms capable of exploiting improvements in computing hardware; on the other hand, access to vast amounts of textual data, whether annotated or unannotated, on which to train them.

Among these algorithms, neural networks, and the Transformer architecture in particular, take center stage. This architecture has become essential for three types of processing that previously required dedicated components. First, text mining and information retrieval algorithms benefit from the rich internal representations computed by this model. Second, linguistic analysis algorithms leverage the Transformer's ability to account for very long-range dependencies. Finally, text generation algorithms rely on these models primarily for their predictive capabilities.

Furthermore, this architecture is well-suited for processing spoken and even multimodal data, and enables efficient large-scale computations. It’s easy to see why this model has become the language engineer’s go-to tool.

KEYWORDS

natural language processing | machine learning | language models | neural machine translation
