Organizing your data
Scraping, methods and tools for business intelligence
Practical sheet REF: FIC1275 V1

Author: David COMMARMOND

Publication date: August 10, 2024

2. Organizing your data

To organize your data, you first need to understand the difference between structured, unstructured and semi-structured data. Homogeneous data is essential, and this is where structuring comes into play.
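As a minimal sketch of this distinction, the Python snippet below represents the same piece of information in the three forms and brings it into one homogeneous shape. The sample values, the field names (`city`, `orders`, `details`) and the sentence pattern are hypothetical and chosen only for illustration; they do not come from the article.

```python
import csv
import io
import json
import re

# All sample values below are made up for illustration.

# Structured: fixed columns, one value per cell.
structured = "city,orders\nParis,12\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but nesting and fields may vary.
semi_structured = '{"city": "Paris", "details": {"orders": 12}}'
record = json.loads(semi_structured)

# Unstructured: free text; values have to be extracted, here with a pattern.
unstructured = "Paris recorded 12 orders."
match = re.search(r"(\w+) recorded (\d+) orders", unstructured)

# Homogenize: every source ends up as the same (city, orders) tuple.
homogeneous = [
    (rows[0]["city"], int(rows[0]["orders"])),
    (record["city"], record["details"]["orders"]),
    (match.group(1), int(match.group(2))),
]
```

The structured source needs the least work, while the unstructured sentence depends on a pattern that breaks as soon as the wording changes, which is why homogeneity has to be enforced explicitly.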

Aggregating data from several sources raises the question of file formats and platforms (Mac, PC, Linux). There are thousands of formats, more or less compatible with one another depending on the publisher and on whether they are proprietary or royalty-free.

In addition, the way each user enters data must be taken into account. For example, Paris, PARIS, paris and the misspelling apris are all variants of the same value, just as CITY, Ville, VILLE_France, VILLE_FRANCE and PAYS_VILLE are variants of the same column heading. The same applies to numerical inputs such as 0.1; 1.0; 1.
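One possible way to absorb these variants is sketched below in Python, using only the standard library. The reference list `KNOWN_CITIES`, the alias table `HEADER_ALIASES`, the function names and the similarity cutoff are all assumptions made for this example, not the author's method. Treating `PAYS_VILLE` as a plain city column is also a simplification.

```python
import difflib
import unicodedata

# Hypothetical reference list of accepted city names.
KNOWN_CITIES = ["Paris", "Lyon", "Marseille"]

# Hypothetical alias table: header variants mapped to one canonical name.
HEADER_ALIASES = {
    "city": "city",
    "ville": "city",
    "ville_france": "city",
    "pays_ville": "city",  # simplification: may also carry a country
}


def normalize_header(raw):
    """Map a header variant to its canonical name, ignoring case and spaces."""
    key = raw.strip().lower()
    return HEADER_ALIASES.get(key, key)


def normalize_city(raw, cutoff=0.75):
    """Return the reference spelling of a city, or None if nothing is close."""
    cleaned = unicodedata.normalize("NFKC", raw).strip().casefold()
    lookup = {name.casefold(): name for name in KNOWN_CITIES}
    if cleaned in lookup:
        return lookup[cleaned]
    # Fuzzy match to catch typos such as transposed letters.
    close = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


def normalize_number(raw):
    """Parse a numeric string as a float, accepting a comma as decimal mark."""
    return float(raw.strip().replace(",", "."))
```

With these helpers, `normalize_city("apris")` resolves to the reference spelling `Paris`, and the inputs `"1.0"` and `"1"` both parse to the same float. The cutoff is a trade-off: a lower value catches more typos but increases the risk of merging two distinct cities.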

The difficulty comes with the desire to automate the process, when updating data means constantly...

Article included in the offer "Management and innovation engineering" (434 articles).
