question

xnl28 avatar image
xnl28 asked

Fact table question

Just want some advice from people about what fields to include in a fact table in a data warehouse schema. Should the fact table contain the dimension values, or just the dimension foreign keys? For example, my fact table contains CurrencyId which is a foreign key into the Currency dimension. Should the fact table also contain a field for CurrencyCode and CurrencyName?
data-warehouse
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

· Write an Answer
Pavel Pawlowski avatar image
Pavel Pawlowski answered
No, the fact table should contains keys to the dimension tables and then measures. Some times fact table can contains values, which are not keys to dimension tables (like degenerated dimensions etc). You can read some book about the dimensional modelling. eg. [The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling][1] by Ralph Kimball [1]: http://www.amazon.com/gp/product/B001KZGWEC
3 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

In data warehouse design you have redundant data in the dimensions (I mean attributes in the dimensions). And dimensions are mostly relatively small. The biggest dimension have mostly maximum of a few millions of rows (there are exceptions) but fact table can have even billions of rows. Related to fact table.. When you will be reporting, you will not include in the report all rows from the multi-million records fact table, but you will most probably filter out data by some dimensions etc. Especially SQL Server 2008 Enterprise version has a [star join query optimization][1] for such purposes. [1]: http://msdn.microsoft.com/en-us/library/cc278097(v=sql.100).aspx
1 Like 1 ·
If the fact table contains just the dimension foreign keys and not the dimension values, then any code reporting from the fact table will need to join to all the dimensions. Won't this adversely affect performance? I thought that one of the concepts of a data warehouse schema to have lots of redundant data in the fact table so that these joins are not needed. Or have I got this mixed up with something else...?
0 Likes 0 ·
In a star-schema DWH, you have redundant data by design because the DB is not normalized. If you're querying the DWH and find the queries too slow, one of the options (besides good indexes) is to build an OLAP cube on top of your DWH and use that as source for reporting.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.