Powered By Blogger

Monday 27 February 2012

Star & Snowflake Schema


Star schema
A star schema is the simplest form of a dimensional model, in which data is organized into facts and dimensions.  A fact is an event that is counted or measured, such as a sale.  A dimension contains reference information about the fact, such as date, product, or customer. 
For example, a simple Sales fact with millions of individual clothing sale records might contain a Product Key, Promotion Key, Customer Key, and Date Key, along with Units Sold and Revenue measures.  The Product dimension would hold reference information such as product name, description, size, and color.  The Promotion dimension would hold information such as promotion name and price.  The Customer dimension would hold information such as first and last name, birth date, gender, address, etc.  The Date dimension would include calendar date, week of year, month, quarter, year, etc. 

The star schema supports rapid aggregations (such as count, sum, and average) of many fact records, and these aggregations can be easily filtered and grouped (“sliced & diced”) by the dimensions. 
 A star schema may be partially normalized (snowflake), with related information stored in multiple related dimension tables, to support specific data warehousing needs.

Snowflake schema
In data warehousing, snow flaking is a form of dimensional modeling in which dimensions are stored in multiple related dimension tables.
Snow flaking is used to improve the performance of certain queries.  The schema is diagramed with each fact surrounded by its associated dimensions (as in a star schema), and those dimensions are further related to other dimensions, branching out into a snowflake pattern.
Cases for snow flaking includes:
Sparsely populated attributes, where most dimension member records have a NULL value for the attribute, are moved to a sub-dimension.
Low cardinality attributes that are queried independently.  For example, a product dimension may contain thousands of products, but only a handful of product types.  Moving the product type attribute to its own dimension table can improve performance when the product types are queried independently.
Attributes that are part of a hierarchy and are queried independently.  Examples include the year, quarter, and month attributes of
                   Star schema
                       Snowflake Schema
1. De-Normalized Data Structure

2. Category wise Single Dimension Table

3. More data dependency and redundancy

4. No need to use complicated join

5. Query Results Faster

6. No Parent Table

7. Simple DB Structure


1. Normalized Data Structure

2. Dimension table split into many pieces

3. less data dependency and No redundancy

4. Complicated Join

5. Some delay in Query Processing

6. It May contain Parent Table

7. Complicated DB Structure


Star schema: In this star schema fact table in normalized format and dimension table is in de normalized format.  It also known as basic star schema.
Snow flake schema: In this both dimension and fact table is in normalized format only. It is also known as extended star schema.
If u r taking the snow flake it requires more dimensions, more foreign keys, and it will reduce the query performance, but it normalizes the records.

No comments: