Difference between Fact Tables and Dimension Tables in SAP HANA
Category: SAP HANA Posted:May 12, 2017 By: Ashley MorrisonWith the outburst of digital marketing and the arrival of the Internet, the potential for establishing and accessing data has grown exponentially. In the recent competitive business environment, companies of all sizes have functions in various places within their native and other regions of the world. Each branch office may create large volumes of the data on a regular basis, and business decision makers need access to these data resources. This situation demands the feature of the data warehouse. The data warehouse is one of the essential business intelligence devices that a business requires to have. It transforms the abundant amount of data produced from various resources into a pattern, which is simple to understand. There is no word to explore the importance of data warehouses in the IT infrastructure and data architecture of the organization.
Data warehouses are made using the dimensional data model that includes fact and dimension tables. In order to become confident in the field of a data warehouse, it is essential to have a better understanding of the fact tables and dimension tables. Here in this article, we are going to discuss the essential things about the fact tables and dimension tables and the differentiation of these both tables. The relationship between the fact and dimension table is considered as one to many. For example, if you want to be aware of a number of resources involved in a task, the fact table will keep the actual measure of the resources, while the dimension table will keep the resource and task details.
Dimension Tables
Dimension tables are generally used to define dimensions; they include the values, dimension keys as well as attributes. For instance, the time dimension would include every year, quarter, month, week, date, and hour that has happened since you began the operation of your business. The product dimension would include a name as well as the product dimension you trade. In addition, they also include its color, weight, unit price, and other attributes as related.
Typically, dimension tables are small in size, ranging from a few to numerous thousand rows. Frequently, dimension tables can enhance largely. However, just consider this example; a huge credit card agency could include a customer dimension along with several thousands of rows. The structure of the dimension table is generally lean. For instance, look at the following customer dimension:
- Customer Key
- Customer Full name
- Customer City
- Customer State
- Customer Country
However, there might be some other related attributes, which you keep in the relation database; there is no need to keep all those attributes in the data warehouse. For instance, customer contact numbers, customer mail addresses, and other details wouldn’t be essential for the data warehouse. The vital thing that should note here is that the data warehouse is meant to take strategic decisions by evaluating trends. It doesn’t happen to be a device for regular business operations. However, there are some reports, for which you do add data elements, which are not essential for data analysis.
Most of the data warehouses will include one or more time dimensions. Because the data warehouse will be utilized for determining and evaluating tendencies, data analysis will require one to aware of when each detail has happened. For instance, the calendar dimension is the most collective time dimension. On the other hand, your business also requires an economic time dimension if your economic year doesn’t begin on 1st January as the calendar time.
Most of the data warehouse will also include service or product dimensions; because each business typically functions by providing either services or products to others. Geographically isolated businesses are most possibly requiring having the location dimension.
Dimension table includes the fields, which are used to define the data in the tables. Moreover, the dimension table can offer descriptive and additional details or dimensions of the fact table field.
Dimensions are grouped by which précised data can be observed. For example, a profit summary in the fact table can be observed by a Region dimension (i.e. profit by city, state, country), Time dimension (i.e. profit by year, quarter, month), Product dimension (i.e. profit for product 1, product 2).
Learn SAP HANA from Industry Experts
Types of Dimensions
The dimension tables are grouped into nine types. They are
- Slowly Changing Dimensions
- Junk Dimensions
- Rapidly Changing Dimensions
- Inferred Dimensions
- Degenerate Dimensions
- Conformed Dimensions
- Shrunken Dimensions
- Role-Playing Dimensions
- Static Dimensions
1. Slowly Changing Dimensions
Generally, dimension attributes would undergo changes over a certain period of time. It is based on the business requirement, whether the change in the history of certain attributes should be conserved in the warehouse. This attribute is referred to as a slowly changing attribute. The dimension, including such attributes, is known as a slowly changing dimension.
2. Junk Dimensions
A single table that includes a compilation of unrelated and different attributes is referred to as the junk dimension table. This table is meant to avoid including a huge amount of foreign keys in the fact table. This table is often generated to handle foreign keys generated by the rapidly changing dimensions.
3. Rapidly Changing Dimensions
An attribute in the dimension, which is changing frequently, is referred to as the rapidly changing attribute. The dimension, including such attributes, is known as a rapidly changing dimension.
4. Inferred Dimensions
Sometimes, dimension records might not be ready while loading the fact records. A solution to handle this situation is creating a surrogate key with a null value for the entire other attributes. This process is technically referred to as the inferred member or inferred dimension.
5. Degenerate Dimensions
This dimension table is when the attribute of the dimension is placed as a portion of the fact table, instead of in a separate dimension table. In a warehouse, these dimension tables are utilized because of a drill via a query in order to evaluate the aggregated number source in a report.
6. Conformed Dimensions
A dimension, which is utilized in various locations, is referred as the conformed dimension. This dimension may be utilized with several fact tables in a seamless database or data warehouse or multiple data marts.
7. Shrunken Dimensions
The shrunken dimension table is referred to as the subset of alternative dimensions.
8. Role Playing Dimensions
A role-playing dimension comes where the similar dimension key with its relevant attributes can be merged to multiple foreign keys in a fact table.
9. Static Dimensions
The static dimensions aren’t mined from the original source of data; however, they are generated within the context of a warehouse. It is possible to load a static table manually. For instance, with status codes. It also can be created by a procedure like time or date dimension.
Fact Tables
Fact tables include keys to the dimension tables and measurable facts, which Data Analyst would need to evaluate. For instance, a store trading automotive parts should have a fact table tracking each item sale. Similarly, an educational entity includes a fact table to track the awarded credit hours to the students. The bakery entity could have the fact table to track the manufacturing details of the various baked goods.
The fact tables can extend very large, even billions of rows. It is essential to determine the lowest level of fact tables, which makes it logical to evaluate for your business and this purpose is often known as the fact table grain. For example, consider a healthcare billing company, it could be enough to record revenues by month; hourly and daily data might not available or might not be applicable. However, the assembly line data warehouse analysts could be very conscious of the number of defective goods, which were produced each hour. Likewise, a marketing warehouse could be alarmed by the consumer group activity with a certain income-level instead of purchase made by the individual.
The fact Table includes facts and measures. The data present in the fact table is typically numerical in nature.
The fact table includes précised historical and numerical data or facts and a compound index, which is a collection of foreign keys from the primary keys of associated dimension tables.
Register for live webinar on SAP HANA by Industry Experts
Types of Fact Tables
The entire fact tables are grouped by three main measurement events. They are
- Transactional
- Periodic Snapshots
- Accumulating Snapshots
1. Transactional
The transactional type of fact table is considered the most basic table. Each grain linked to this table is specified as, “one row for one line in the transaction.” An example of this table is every line item, which seems on an invoice. The transaction table includes the most detailed level of data; hence, it includes a huge amount of dimension associated with it.
2. Periodic Snapshots
The periodic snapshots type of fact table keeps the data, which is a snapshot present in a period. The data present in the periodic snapshot fact table is sourced from the transaction fact table.
3. Accumulating Snapshots
The accumulating snapshots type of fact table defines a business process activity, which includes a clear starting as well as the end. This kind of fact tables; hence, includes a multitude of data columns to specify process milestones. An example of this accumulating snapshot type is material processing. Once the steps towards managing the materials are completed, the matching record present in the accumulating snapshot table becomes updated.
The connection between the Dimension tables and Fact tables
If you want to have a better understanding of these tables, it is essential to know about the link between the two tables.
The facts and measures of the data warehouse, which is categorized and described by the dimension table, provide meaningful solutions to the business issues. They practice the essentials of dimensional modeling.
Fact tables with keys reference the dimension tables. When making a dimension table in the data warehouse, a row in the dimension table is identified by a system-generated key. This key is referred to as the surrogate key. This key serves as the primary key in the dimension table. This surrogate key is stored in the fact table. In addition, a foreign key is defined between the dimension table and the fact table. When it comes to data join, it happens like any other sort of join in the database.
Business measure or facts, as well as foreign keys, are preserved in the fact tables that are considered as candidate keys in the dimension tables. Usually, fact tables provide the additive values that serve as the independent variables. Moreover, the dimensional attributes are evaluated by those independent variables. Attributes, which are utilized for group and constrain data for queries of data warehousing, are preserved in the dimension tables. A dimension in the data warehouse is the gathering of reference details regarding the measurable event, these events are referred to as facts, and they are retained in a fact table.