Question
In data warehouse technology, a multiple dimensional view can be implemented by a relational database technique (ROLAP), by a multidimensional database technique (MOLAP), or by a hybrid database technique (HOLAP).
a. Briefly describe each implementation technique.
b. For each technique, explain how each of the following functions may be implemented:
1. The generation of a data warehouse (including aggregation)
2. Roll-up
3. Drill-down
4. Incremental updating
c. Which implementation techniques do you prefer, and why?
Explanation / Answer
OLAP:
Online Analytical Processing (OLAP) is a class of tools that can extract and present multidimensional data from different points of view. OLAP structures the data hierarchically. OLAP functions include trend analysis, drill-down, summarization of data, and data rotation (pivoting).
There are three types of OLAP:
1. ROLAP
2. MOLAP
3. HOLAP
1. ROLAP (Relational OLAP servers):
These servers sit between a relational back-end server and client front-end tools. They use an RDBMS or extended RDBMS to store and manage warehouse data, and OLAP middleware to supply the missing pieces. ROLAP servers include optimization for each back-end DBMS, an implementation of aggregation navigation logic, and additional tools and services.
i. The generation of a data warehouse (including aggregation)
Initial aggregation is done in SQL with group-bys. The cube operator computes a group-by for every subset of the specified dimensions, and together these group-bys form the data cube. ROLAP uses tuples and relational tables as its basic data structures. The fact table stores data at the abstraction level indicated by the join keys in the schema for the given data cube; aggregated data is also stored in fact tables. ROLAP cube computation is optimized with sorting, hashing, and grouping. Grouping can also be performed on sub-aggregates, so that a coarser aggregate is derived from a previously computed finer aggregate instead of from the base fact table.
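As a rough illustration of what the cube operator produces, the sketch below computes one group-by per subset of the dimensions over a tiny in-memory fact table. The table contents and the names `FACTS`, `group_by`, and `compute_cube` are made up for this example; a real ROLAP server would issue SQL against the RDBMS and would reuse sub-aggregates rather than rescanning the base rows each time.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact table rows: (city, item, year, sales)
FACTS = [
    ("Vancouver", "TV", 2023, 100),
    ("Vancouver", "Phone", 2023, 50),
    ("Toronto", "TV", 2023, 70),
    ("Toronto", "TV", 2024, 30),
]
DIMS = ("city", "item", "year")


def group_by(rows, dim_idx):
    """Sum the measure for each distinct combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in dim_idx)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


def compute_cube(rows, n_dims):
    """Return one group-by result (cuboid) per subset of the dimensions."""
    cube = {}
    for k in range(n_dims, -1, -1):
        for dim_idx in combinations(range(n_dims), k):
            cube[dim_idx] = group_by(rows, dim_idx)
    return cube


cube = compute_cube(FACTS, len(DIMS))
```

With three dimensions the cube holds 2^3 = 8 cuboids, from the base cuboid `cube[(0, 1, 2)]` down to the grand total in `cube[()]`.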
ii. Roll-up:
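The roll-up operation performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction, in which one or more dimensions are removed from the given cube. In ROLAP, a roll-up translates into a SQL group-by at the coarser level of the hierarchy or over fewer dimension columns.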
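Both forms of roll-up can be sketched with Python's built-in `sqlite3` module. The `sales` and `location` tables, their columns, and the figures are assumptions invented for this example, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (city TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE sales (city TEXT, year INTEGER, amount INTEGER);
    INSERT INTO location VALUES
        ('Vancouver', 'Canada'), ('Toronto', 'Canada'), ('Chicago', 'USA');
    INSERT INTO sales VALUES
        ('Vancouver', 2023, 100), ('Toronto', 2023, 70),
        ('Chicago', 2023, 40), ('Chicago', 2024, 60);
""")

# Roll-up by climbing the concept hierarchy city -> country.
by_country = conn.execute("""
    SELECT l.country, s.year, SUM(s.amount)
    FROM sales AS s JOIN location AS l ON l.city = s.city
    GROUP BY l.country, s.year
    ORDER BY l.country, s.year
""").fetchall()

# Roll-up by dimension reduction: the year dimension is dropped.
by_city = conn.execute("""
    SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city
""").fetchall()
```

The first query aggregates to a coarser level of the location dimension; the second removes a dimension altogether.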
iii. Drill-down:
Drill-down is the reverse of roll-up. It navigates from less detailed data to more detailed data. Drill-down can be realized by either stepping down a concept hierarchy for a dimension or introducing new dimensions.
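A minimal sketch of stepping down a time hierarchy from year to quarter follows. The rows and the helper names `aggregate` and `drill_down` are hypothetical; the point is only that drill-down needs the finer-grained data to still be available.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, quarter, sales)
DETAIL = [
    (2023, "Q1", 40),
    (2023, "Q2", 60),
    (2024, "Q1", 30),
    (2024, "Q2", 20),
]


def aggregate(rows, level):
    """Sum sales keyed by the first `level` time attributes (1 = year, 2 = year and quarter)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:level]] += row[-1]
    return dict(totals)


def drill_down(rows, year):
    """Step down from the year level to the quarter level for a single year."""
    return aggregate([r for r in rows if r[0] == year], 2)


by_year = aggregate(DETAIL, 1)          # the less detailed view
quarters_2023 = drill_down(DETAIL, 2023)  # the more detailed view for 2023
```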
iv. Incremental updating:
Data warehouse implementation can be broken down into segments, or increments. An increment is a defined data warehouse implementation project with a specified beginning and end. Incremental data capture is a time-dependent model for capturing changes to operational systems. It works best when the changes over a given period are much smaller than the data set itself. These techniques are more complex than static capture because they are closely tied to the DBMS or to the operational software that updates the DBMS.
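To illustrate why a small set of changes is cheap to apply, the sketch below folds only the newly captured rows into an existing summary instead of recomputing it from scratch. It assumes a distributive measure such as SUM, and the row layout and function names are invented for the example.

```python
from collections import defaultdict


def aggregate(rows):
    """Full computation: total sales per city from (city, amount) rows."""
    totals = defaultdict(int)
    for city, amount in rows:
        totals[city] += amount
    return dict(totals)


def apply_delta(totals, delta_rows):
    """Incremental update: fold only the newly captured rows into the summary."""
    updated = dict(totals)
    for city, amount in delta_rows:
        updated[city] = updated.get(city, 0) + amount
    return updated


base = [("Vancouver", 100), ("Toronto", 70)]
delta = [("Toronto", 30), ("Chicago", 40)]  # changes captured since the last load

summary = apply_delta(aggregate(base), delta)
```

For SUM, the incrementally maintained `summary` matches a full recomputation over `base + delta`; measures such as median cannot be maintained this simply.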
2. MOLAP (Multidimensional OLAP servers):
These servers allow for multidimensional views of data through array-based multidimensional engines. The first generation of server-based multidimensional OLAP (MOLAP) solutions use multidimensional databases (MDDBs). The main advantage of an MDDB over an RDBMS is that an MDDB can provide information quickly since it is calculated and stored at the appropriate hierarchy level in advance.
i. The generation of a data warehouse (including aggregation)
MOLAP uses array structures to store data for OLAP. Initial aggregation is done in SQL with group-bys. MOLAP follows a very different cube computation scheme from ROLAP: it uses direct array addressing, where dimension values are accessed via the position or index of the corresponding array location. The array-based cube is generated by partitioning the array into chunks. A chunk is a subcube small enough to fit into memory for cube computation. Aggregates are computed by visiting the cube cells chunk by chunk.
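Direct array addressing and chunked aggregation can be sketched as follows. The dimension values, the 2x2 chunk size, and the function names are assumptions for illustration; a real MOLAP engine would use compressed chunks on disk and compute several cuboids in one pass.

```python
# Each dimension value maps to a fixed array position (direct addressing).
CITY = {"Vancouver": 0, "Toronto": 1, "Chicago": 2, "New York": 3}
YEAR = {2022: 0, 2023: 1, 2024: 2, 2025: 3}

# Dense 4x4 array of sales cells, indexed as cells[city][year].
cells = [[0] * len(YEAR) for _ in CITY]


def load(city, year, amount):
    """Place a fact into its cell by index, with no key lookup at query time."""
    cells[CITY[city]][YEAR[year]] += amount


for fact in [("Vancouver", 2023, 100), ("Toronto", 2023, 70),
             ("Toronto", 2024, 30), ("New York", 2025, 5)]:
    load(*fact)


def chunk_origins(n_rows, n_cols, size):
    """Yield the top-left corner of each size x size chunk of the array."""
    for r0 in range(0, n_rows, size):
        for c0 in range(0, n_cols, size):
            yield r0, c0


def totals_by_city(array, size):
    """Aggregate over the year dimension, visiting cells one chunk at a time."""
    n_rows, n_cols = len(array), len(array[0])
    totals = [0] * n_rows
    for r0, c0 in chunk_origins(n_rows, n_cols, size):
        for r in range(r0, min(r0 + size, n_rows)):
            for c in range(c0, min(c0 + size, n_cols)):
                totals[r] += array[r][c]
    return totals


city_totals = totals_by_city(cells, 2)
```

Only one chunk needs to be resident at a time, which is the reason for partitioning the array in the first place.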
ii. Roll-up:
The roll-up operation performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction, in which one or more dimensions are removed from the given cube. In MOLAP, the roll-up is computed over the array: cells are aggregated chunk by chunk, the chunks make up sub-cubes, and the sub-cubes make up the array.
iii. Drill-down:
Drill-down is the reverse of roll-up. It navigates from less detailed data to more detailed data. Drill-down can be realized by either stepping down a concept hierarchy for a dimension or introducing new dimensions.
iv. Incremental updating:
Data warehouse implementation can be broken down into segments, or increments. An increment is a defined data warehouse implementation project with a specified beginning and end. Incremental data capture is a time-dependent model for capturing changes to operational systems. The update technique for MOLAP is more sophisticated because of the additional complexity of arrays and subcubes.
3. HOLAP (Hybrid OLAP servers):
HOLAP combines elements of MOLAP and ROLAP. It keeps the original detail data in relational tables but stores aggregations in a multidimensional format, so it draws on both precalculated cubes and relational data sources.
i. The generation of a data warehouse (including aggregation)
Generation combines the MOLAP and ROLAP approaches. Large volumes of detail data are stored in a relational database, while the aggregations are stored separately in a MOLAP store.
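The split between a relational detail store and a separate aggregate store might look like the sketch below. SQLite stands in for the relational database and a plain dictionary stands in for the multidimensional store; the table, figures, and function names are all hypothetical.

```python
import sqlite3

# Relational side: the detail data stays in tables.
detail = sqlite3.connect(":memory:")
detail.executescript("""
    CREATE TABLE sales (city TEXT, year INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('Vancouver', 2023, 100), ('Toronto', 2023, 70), ('Toronto', 2024, 30);
""")

# Multidimensional side (stand-in): precalculated totals per year, built once.
agg_by_year = dict(
    detail.execute("SELECT year, SUM(amount) FROM sales GROUP BY year")
)


def total_for_year(year):
    """Summary-level query: answered from the precalculated aggregates."""
    return agg_by_year.get(year, 0)


def detail_for_year(year):
    """Detail-level query: answered from the relational tables."""
    return detail.execute(
        "SELECT city, amount FROM sales WHERE year = ? ORDER BY city", (year,)
    ).fetchall()
```

Summary questions such as `total_for_year(2023)` never touch the detail tables, while `detail_for_year(2023)` goes to the relational store for the underlying rows.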
ii. Roll-up:
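It combines the roll-up operations of ROLAP and MOLAP: a roll-up is answered from the precalculated multidimensional aggregates where they exist, and is otherwise computed from the relational detail data.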
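iii. Drill-down:
It combines the drill-down operations of ROLAP and MOLAP: drilling down within the aggregated levels uses the multidimensional store, and drilling down to the detail level goes to the relational tables.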
iv. Incremental updating:
Data warehouse implementation can be broken down into segments, or increments. An increment is a defined data warehouse implementation project with a specified beginning and end. Incremental data capture is a time-dependent model for capturing changes to operational systems. In HOLAP, an update involves both the ROLAP-style updates to the relational DBMS and the MOLAP-style updates to the arrays and subcubes.
c. HOLAP is the preferred implementation technique because it combines the greater scalability of ROLAP with the faster computation of MOLAP. HOLAP can store data in either an RDBMS or an MDDB. It also improves performance and helps manage data storage.