What is a Data warehouse?
Paths can be many but Destination is One. With this I mean to say, Different people have different explanations for a Data warehouse. But all the explanations and definitions aims to single destination i.e. Data storage. If you are thinking of aligning your career in BI/DW (Business Intelligence/Data warehousing), than you definitely need to know two names. I call these two persons as “Gods of BIDW”.
First name is Ralph Kimball and second name is Bill Inmon. You must be wondering why we need to know about these two personalities. This is because they have given some important Data warehouse concepts which need to be followed.
- Ralph Kimball provided a “Bottom – Up approach”, whereas
- Bill Inmon provided “Top - Down approach”.
In the world of BIDW, the most popular definition came from Bill Inmon, who said “A Data warehouse is a Subject-oriented, Integrated, Time-variant and Non-volatile collection of Data in support of Management’s decision making process”.
- Subject-Oriented : It means a Data warehouse can be used to analyze a specific subject area. For example - In an Organization, Subject area can be HR, Finance, Sales, etc. So, A Data warehouse can be used to analyze particularly HR or Sales.
- Integrated : With Integrated we mean data from heterogeneous data sources is integrated in Data warehouse. For example - Source A stores date in different format and source B stores date in other format, but in a Data warehouse, dates will be stored in a single format only.
- Time-Variant : Data warehouse only deals with historical data. It acts as a pit where Transactional data is dumped every day. We can retrieve data which is 6 months, 12 months old or even more than this from a data warehouse. For example – You hold a SBI bank account and updated your contact details. So, Transaction system can only provide you the current contact details whereas Data warehouse holds all the contact details associated with your account (Current and Previous both).
- Non-volatile : As explained above, Data warehouse is freakily mad about historical data. Once data comes in the data warehouse, it will not alter. Hence, historical data in a data warehouse is Non-volatile.
Another concept came from Ralph Kimball who explained Data warehouse in his own way and said “A data warehouse is a copy of transactional data specifically structured for query and analysis”.
This is a more like a functional view of a Data warehouse. Bill Inmon precisely stated how the Data warehouse is built which was lacking in Ralph Kimball’s definition.
Data warehouse usages -
- Trend analyzing for an organization.
- Startegic planning.
- Product forecasting.
- Designing business models.
- Preparing dashboards and reports with reporting tools.
Summary - In this post, we have learnt
- What do we mean by Data warehouse.
- Definition provided by Bill Inmon.
- Definition provided by Ralph Kimball.
- How Transactional database is different from Data warehouse.
- Data warehouse usages.
I hope you liked my article on “What is a Data warehouse?”. Comments and queries are most welcome below.