Requirement analysis and conceptual modeling of real-time data warehouses

Bibliographic Details
Main Author: Aakriti Agarwal
Other Authors: Vivekanand Gopalkrishnan
Format: Final Year Project (FYP)
Language: English
Published: 2009
Online Access: http://hdl.handle.net/10356/17044
Description
Summary: Data warehouses have proven useful in making strategic business decisions by enabling the storage of large amounts of data, possibly from heterogeneous and distributed sources, in a unified system. The data kept in a data warehouse is historical, mainly because of the batch process used to update it. However, several business intelligence applications now require access to real-time data in addition to historical data. This leads to the need for a real-time data warehouse. Questions arise as to how real-time data warehousing can be achieved and whether current approaches address the requirements of a real-time data warehouse. We look at real-time systems and real-time databases, examine their nature and their issues, and synthesize the requirements that are applicable to real-time data warehousing. By comparing and combining these, we arrive at a list of requirements that a real-time data warehouse must satisfy. A real-time data warehouse requires quality of service (QoS), quality of data (QoD), and some application-based requirements. For each requirement, we discuss which component of the data warehouse can be used to address it. As we discuss each component, we also present an overview of the state of the art. With the requirements, current approaches, and limitations known, we highlight the areas that need further investigation to make real-time data warehousing a reality. Our study targets researchers who would like to contribute to the advancement of real-time data warehousing, and data warehouse administrators of business intelligence systems that need to be made real-time.
The second half of this work focuses on conceptual modeling for real-time data warehouses. Much work has been presented on conceptual modeling for designing data warehouses. However, despite the advent of real-time data warehouses, little work has addressed the modeling of these systems. In this context, we propose RealMER, a conceptual modeling approach based on the Entity-Relationship model, to address the peculiarities of this kind of system. We first propose a set of criteria for classifying good conceptual models for real-time data warehouses. We then define new graphical elements to cover the real-time paradigm and highlight the advantage that capturing such information has for the downstream processes of the warehouse. Through a case study and comparative analysis, we demonstrate the effectiveness of our technique over those of our peers.