In data warehousing, there are several approaches to transforming and storing data. One traditional approach is to use a dimensional store in the form of a star schema or snowflake schema. Ralph Kimball describes this approach at length in his book “The Data Warehouse Toolkit”.
Another approach is to build an operational data store (ODS) using a third normal form data model. This concept was introduced by Bill Inmon in his book “Building the Operational Data Store”. In this design, the ODS holds data before it is loaded into either the data warehouse or a data mart.
Some data warehouse implementations utilize the ODS as a middle layer. This ODS follows the third normal form, but unlike the original concept, it contains all the historical data. This approach is particularly useful when dealing with multiple business systems. The ODS acts as a unified source of data, consolidating information from different systems and feeding it into the data warehouse.
For example, consider three systems: system A, system B, and system C. All three have customer tables. In the ODS, these tables are merged into a single customer structure, so the data warehouse is fed from one consistent source instead of three. In practice, the ODS can even become a new integrated business system, starting out as read-only and later extended to handle updates.
The data warehouse contains data for the entire enterprise, covering subject areas such as finance, operations, logistics, human resources, marketing, strategic planning, sales, and purchasing. Data marts, by contrast, focus on a specific subject area, such as marketing. Note that a data mart is not limited to summaries: like the data warehouse, it holds both aggregated and detailed data.
If an enterprise has multiple data marts, the ODS becomes particularly useful. For instance, a finance mart and an operational mart can both draw data from the ODS, while the enterprise warehouse also relies on the ODS. Similarly, in the case of a group of companies with subsidiaries in different industries, each company can have its own warehouse built directly from a group-wide ODS. This approach ensures performance, consistency, and flexibility.
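As a rough illustration of the consistency point, the sketch below builds two marts and an enterprise-level total from the same ODS table and shows that the figures reconcile. It uses Python's built-in sqlite3 module rather than SQL Server, and the table ods_transaction, its columns, the sample rows, and the helper build_mart are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical ODS transactions; subject areas and amounts are assumptions.
ODS_ROWS = [
    (1, "finance", "2024-01-15", 120.0),
    (2, "operations", "2024-01-15", 80.0),
    (3, "finance", "2024-02-01", 50.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ods_transaction ("
    "txn_id INTEGER PRIMARY KEY, subject_area TEXT, txn_date TEXT, amount REAL)"
)
conn.executemany("INSERT INTO ods_transaction VALUES (?, ?, ?, ?)", ODS_ROWS)

def build_mart(conn, subject_area):
    """Rebuild one subject-area mart from the ODS and return its total amount."""
    if not subject_area.isidentifier():
        raise ValueError("subject_area must be a simple identifier")
    table = f"mart_{subject_area}"
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(
        f"CREATE TABLE {table} (txn_id INTEGER PRIMARY KEY, txn_date TEXT, amount REAL)"
    )
    # Each mart keeps detailed rows for its own subject area only.
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT txn_id, txn_date, amount FROM ods_transaction WHERE subject_area = ?",
        (subject_area,),
    )
    return conn.execute(f"SELECT COALESCE(SUM(amount), 0) FROM {table}").fetchone()[0]

finance_total = build_mart(conn, "finance")
operations_total = build_mart(conn, "operations")
# The enterprise-level figure is computed from the same ODS rows.
warehouse_total = conn.execute("SELECT SUM(amount) FROM ods_transaction").fetchone()[0]
conn.commit()
print(finance_total, operations_total, warehouse_total)
```

Because every target reads the same integrated rows, the mart totals add up to the enterprise total without any reconciliation between source systems.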
Having an ODS as a middle layer offers several advantages. First, it provides a layer of protection between the data warehouse and the source systems. This makes it easier to rebuild the warehouse, whether partially or entirely, because all the necessary information is maintained in the ODS. Second, the ODS allows flexibility in how much history the warehouse holds. By changing a single parameter in the control system, it’s possible to rebuild the warehouse with a longer data history when needed.
However, there are trade-offs to consider. An ODS lengthens the data warehouse batch window and adds more code to maintain. It also increases disk space requirements, as the ODS is typically similar in size to the enterprise data warehouse.
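The idea of rebuilding the warehouse with a different history depth can be sketched like this. It is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for SQL Server, the tables ods_sales and fact_sales and the sample rows are made up, and the history_years argument plays the role of the control parameter. The cutoff is taken as 1 January of the earliest year to keep.

```python
import sqlite3
from datetime import date

def load_ods(conn):
    """Create a hypothetical ODS sales table that holds the full history."""
    conn.execute(
        "CREATE TABLE ods_sales ("
        "sale_id INTEGER PRIMARY KEY, sale_date TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO ods_sales VALUES (?, ?, ?)", [
        (1, "2019-03-10", 100.0),
        (2, "2021-06-01", 250.0),
        (3, "2023-11-20", 75.0),
        (4, "2024-02-05", 40.0),
    ])

def rebuild_fact_sales(conn, history_years, today):
    """Rebuild the warehouse fact table from the ODS, keeping history_years of data."""
    # Assumed rule: keep everything from 1 January of (current year - history_years).
    cutoff = date(today.year - history_years, 1, 1).isoformat()
    conn.execute("DROP TABLE IF EXISTS fact_sales")
    conn.execute(
        "CREATE TABLE fact_sales ("
        "sale_id INTEGER PRIMARY KEY, sale_date TEXT NOT NULL, amount REAL NOT NULL)"
    )
    # ISO-formatted dates compare correctly as text.
    conn.execute(
        "INSERT INTO fact_sales "
        "SELECT sale_id, sale_date, amount FROM ods_sales WHERE sale_date >= ?",
        (cutoff,),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]

conn = sqlite3.connect(":memory:")
load_ods(conn)
today = date(2024, 6, 30)  # fixed date so the example is repeatable
short_count = rebuild_fact_sales(conn, history_years=2, today=today)
long_count = rebuild_fact_sales(conn, history_years=5, today=today)
print(short_count, long_count)
```

The source systems are never touched during the rebuild. Only the parameter changes, and the fact table is reloaded from the history already held in the ODS.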
Whether or not an ODS is worth implementing depends on various factors, such as the size of the warehouse, the number of business systems involved, the need for integration, and the number of data marts to produce and maintain. For smaller warehouses with only a few tables and dimensions, and no data marts or integration requirements, an ODS may not be necessary. However, for larger warehouses with multiple source systems, data marts, and complex dimensions, an ODS can provide significant benefits.
Adding an ODS layer later can be costly. The data firewall, key generation, and metadata all have to be repositioned around the new layer, and those are only a few of the tasks involved. The decision to include an ODS as a middle layer should therefore be made before building the warehouse system.
Understanding the concepts of data warehousing, including the role of an ODS, is crucial for designing and implementing efficient SQL Server solutions. By carefully considering the specific requirements and trade-offs, organizations can build robust and scalable data warehousing systems that meet their business needs.