3. Preparing data
It is an illusion to think that the data found in the various sources will be ready to use as is.
The analyst will have to apply several successive processing steps:
control of aberrant and extreme values, with possible removal;
data homogenization where possible (e.g., converting numerical data or standardizing names); see figure "Example of OpenRefine's name homogenization processing". OpenRefine is a tool for reprocessing, standardizing and deduplicating data;
data structuring, for example using segmentation solutions (web scraping) or named-entity extraction;
data exploitation (statistical processing, spatialization,...
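The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own method: the function names (`flag_outliers`, `fingerprint`, `group_name_variants`), the interquartile-range rule with `k=1.5`, and the sample values are all assumptions chosen for the example. The name key is loosely modelled on the key-collision "fingerprint" idea used for clustering in OpenRefine, reimplemented here with the standard library only.

```python
import re
import statistics
import unicodedata
from collections import defaultdict


def flag_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule and k=1.5 are illustrative assumptions; the analyst
    still decides whether a flagged value should be removed.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def fingerprint(name):
    """Build a comparison key that ignores accents, case, punctuation and word order."""
    # Decompose accented characters, then drop the non-ASCII combining marks.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Lowercase, replace punctuation with spaces, split into tokens.
    tokens = re.sub(r"[^\w\s]", " ", ascii_name.lower()).split()
    # Deduplicate and sort tokens so that word order does not matter.
    return " ".join(sorted(set(tokens)))


def group_name_variants(names):
    """Group spellings that share the same fingerprint (hypothetical helper)."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return dict(clusters)
```

For instance, `flag_outliers` applied to a series of readings would list the values to review before any removal, and `group_name_variants` applied to a column of organization names would put spellings such as "Société Générale" and "SOCIETE GENERALE" in the same cluster, leaving the choice of the canonical form to the analyst.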
Article included in the offer "Management and innovation engineering" (434 articles)