3. Data cleansing
This operation rests on a postulate often summed up as "garbage in, garbage out": if the data is dirty on entry, the result can only be dirty on exit. Cleaning operations are therefore essential, and they can represent a significant share of the work. They can be carried out by "mills", that is, processing routines that automatically correct newly collected, processed and integrated data. This is where human skill and, more recently, artificial intelligence come into play, along with the author's own ability to carry out this work.
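As an illustration only, such a "mill" can be pictured as a small routine applied to every incoming batch. The sketch below is a minimal Python example under stated assumptions: the function names `clean_record` and `clean_batch`, the `name` field, and the rule that a record without a name is unusable are all hypothetical and do not come from the article.

```python
import re
import unicodedata


def clean_record(record):
    """Return a cleaned copy of one record, or None if it is unusable."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            # Normalize Unicode forms (e.g. non-breaking spaces become plain spaces)
            # and collapse runs of whitespace into a single space.
            value = unicodedata.normalize("NFKC", value)
            value = re.sub(r"\s+", " ", value).strip()
        cleaned[key] = value
    # Hypothetical rule: a record with no "name" is treated as garbage and dropped.
    if not cleaned.get("name"):
        return None
    return cleaned


def clean_batch(records):
    """Clean every record, drop unusable ones, and remove exact duplicates.

    Assumes all field values are hashable (strings, numbers, None).
    """
    seen = set()
    result = []
    for record in records:
        cleaned = clean_record(record)
        if cleaned is None:
            continue
        # Two records are duplicates when all their cleaned fields match.
        fingerprint = tuple(sorted(cleaned.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        result.append(cleaned)
    return result


raw = [
    {"name": "  Alice\u00a0 Martin ", "city": "Paris"},
    {"name": "Alice Martin", "city": "Paris"},
    {"name": "", "city": "Lyon"},
]
print(clean_batch(raw))
```

In this sketch the first two records collapse into one after normalization, and the third is discarded because its name is empty. Real cleaning rules depend on the data and its intended use.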
For the most part, cleaning methods are closely tied to the available technologies and to technical developments. Here is a brief overview.
The antediluvian, very "Web 1.0" method open to Internet users, relying mainly on existing websites as sources, consisted of ad hoc retrieval of data from the Web (copy-paste), from the page source code or from the text rendered by the browser,...