Data warehouse definition - What is it?

Last updated: May 20, 2023

The term Data Warehouse is rendered in French as “entrepôt de données”. Like a physical warehouse, a Data Warehouse stores terabytes of business data.

This data is collected, processed and organized in the data warehouse. It can then be cross-referenced, sliced, analyzed and examined in detail: this is called data mining.

Its purpose is to support decision making. Naturally, such volumes of data can only be stored on powerful computer systems.

The quality of a Data Warehouse depends primarily on the quality of the data it contains: its reliability and its consistency. Both are essential for drawing the most relevant conclusions.

Differences between a database and a Data Warehouse

Confusion between the two terms is frequent as the two concepts are related.

That said, a database is generally used for one specific function of the company (customer service, accounting, purchasing, human resources, etc.), while the Data Warehouse lets you analyze all of this information together.

The Data Warehouse consolidates all of these databases into a single one.

Another difference is that databases are, in principle, optimized for fast reads: the information can be read and interpreted at a glance.

The Data Warehouse, on the other hand, stores this data in aggregated form, so an initial analysis is needed before the information it holds can be interpreted.

The databases of each business unit therefore serve the Data Warehouse and feed it.
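As a rough illustration of departmental databases feeding one central store, here is a minimal sketch using Python's built-in `sqlite3` module. The three sources, their columns and the sample rows are all hypothetical; a real warehouse would load far larger volumes through a dedicated pipeline.

```python
import sqlite3

# Hypothetical operational data, one source per department.
crm_customers = [(1, "Alice"), (2, "Bob")]                # (customer_id, name)
accounting_invoices = [(1, 120.0), (1, 80.0), (2, 50.0)]  # (customer_id, amount)
support_tickets = [(2,), (2,), (1,)]                      # (customer_id,)

def build_warehouse():
    """Load the three departmental sources into one in-memory database."""
    wh = sqlite3.connect(":memory:")
    wh.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    wh.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
    wh.execute("CREATE TABLE tickets (customer_id INTEGER)")
    wh.executemany("INSERT INTO customers VALUES (?, ?)", crm_customers)
    wh.executemany("INSERT INTO invoices VALUES (?, ?)", accounting_invoices)
    wh.executemany("INSERT INTO tickets VALUES (?)", support_tickets)
    return wh

def customer_overview(wh):
    """Cross-department view: total billed and ticket count per customer."""
    query = """
        SELECT c.name,
               (SELECT COALESCE(SUM(i.amount), 0)
                  FROM invoices i WHERE i.customer_id = c.customer_id),
               (SELECT COUNT(*)
                  FROM tickets t WHERE t.customer_id = c.customer_id)
        FROM customers c
        ORDER BY c.customer_id
    """
    return wh.execute(query).fetchall()

if __name__ == "__main__":
    for row in customer_overview(build_warehouse()):
        print(row)
```

The point of the sketch is the single query at the end: once the sources sit side by side, one question can span customer service, accounting and the customer file at the same time.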

The main characteristics of the Data Warehouse

Bill Inmon, who originated the concept, described the Data Warehouse as “not just a copy of production data. The Data Warehouse is organized and structured”.

  • Subject oriented
    Here, structure means that the data is “subject” oriented, that is, organized by business function (marketing, accounting, etc.) or by theme, such as sales.
  • Integrated
    Second characteristic: the data is integrated. Although it comes from several sources and is often not structured in the same way, the data is standardized before being stored in the Data Warehouse. Significant standardization work is required to enable relevant and faster analysis later on.
  • Nonvolatile
    Third characteristic: the information stored is “non-volatile”. All information is traceable and does not change during processing. This non-volatile data is also dated and kept as history. Each new record is inserted without deleting old data, hence the importance of associating every piece of information with a date. Everything is identifiable over time.
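The integrated and non-volatile characteristics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source formats (`shop`, `web`), their field names and the `SalesHistory` class are hypothetical, and the store is a plain in-memory list rather than a real warehouse.

```python
from datetime import date

def standardize(record, source):
    """Map a source-specific record onto one shared layout (integration)."""
    if source == "shop":   # hypothetical format: amount in cents, "cust" key
        return {"customer_id": record["cust"],
                "amount_eur": record["cents"] / 100}
    if source == "web":    # hypothetical format: amount as a string in euros
        return {"customer_id": record["customer"],
                "amount_eur": float(record["total"])}
    raise ValueError(f"unknown source: {source}")

class SalesHistory:
    """Append-only store: rows are added with a date, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, record, source, loaded_on):
        row = standardize(record, source)
        row["source"] = source
        row["loaded_on"] = loaded_on   # every row carries its date
        self._rows.append(row)

    def as_of(self, day):
        """Return every row that had been loaded on or before the given day."""
        return [r for r in self._rows if r["loaded_on"] <= day]

history = SalesHistory()
history.load({"cust": 7, "cents": 1999}, "shop", date(2023, 5, 1))
history.load({"customer": 7, "total": "25.50"}, "web", date(2023, 5, 20))
```

Because nothing is overwritten, `history.as_of(...)` can reconstruct what was known on any past date, which is what makes the data identifiable over time.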

Concretely, what is the Data Warehouse used for?

Unlike operational systems, the Data Warehouse makes it possible to analyze the company's activity across thousands of records, sometimes cross-referenced with other information.

This enables businesses to improve decision making by running queries that examine their processes, their performance and their customers' trends.

Concrete example:

With a Data Warehouse, we can analyze the impact of a 30% discount on loyal customers and on customers making their first purchase. To go even further, we can compare the shops that displayed the product in the window with those that did not.

Information belonging to several departments is combined here and will be used to answer several questions:

  • Does the promotion boost sales?
  • Should it be stepped up?
  • Should it be stepped up only for loyal customers?
  • Should the product be displayed in all stores?

These decisions will affect operational departments such as merchandising and procurement.
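To make the discount example more tangible, the sketch below runs the kind of query that could answer those questions, using Python's built-in `sqlite3` module. The `sales` table, its columns and every figure in it are invented for illustration only; they are not real results.

```python
import sqlite3

# Hypothetical rows:
# (customer_type, product_in_shop_window, promo_active, units_sold)
SALES = [
    ("loyal", 1, 0, 10), ("loyal", 1, 1, 14),
    ("loyal", 0, 0, 8),  ("loyal", 0, 1, 9),
    ("new",   1, 0, 4),  ("new",   1, 1, 10),
    ("new",   0, 0, 3),  ("new",   0, 1, 5),
]

def promo_uplift():
    """Units sold with the promotion minus units sold without it, per segment."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sales (customer_type TEXT, in_window INTEGER, "
        "promo INTEGER, units INTEGER)"
    )
    db.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", SALES)
    query = """
        SELECT customer_type, in_window,
               SUM(CASE WHEN promo = 1 THEN units ELSE 0 END)
             - SUM(CASE WHEN promo = 0 THEN units ELSE 0 END) AS uplift
        FROM sales
        GROUP BY customer_type, in_window
        ORDER BY customer_type, in_window
    """
    return db.execute(query).fetchall()

if __name__ == "__main__":
    for customer_type, in_window, uplift in promo_uplift():
        print(customer_type, "window" if in_window else "no window", uplift)
```

Each output row pairs a customer segment and a window-display flag with the difference in units sold, so the promotion's effect can be compared between loyal and first-time customers and between shops with and without the product in the window.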