adozark avatar image
adozark asked

What reason is there to build a star schema for data cubes?

I am new to building SSAS cubes, for my tests I just used a named query to join the data (I seem to get errors when creating the relations with the UI). I thought the disadvantage of querying a de-normalized database vs building a star schema data warehouse would be that it would be slow to process the cube, but the cube processed 2M records in 3 seconds. It surprised me, because I don't think I could see the query results that quick, even running it on the server itself. But I queried the cube, and all the data was there. So what is the advantage of the star schema database in relation to cubes? I feel like I'm missing something. Thank you for your help.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

Pavel Pawlowski avatar image
Pavel Pawlowski answered
First 2M records in fact table is not too much for SSAS to process. It also depends how bigh your record is. If you have a fact table with 1 or 2 four bytes measures, than 2 millions as nothing. But if you have a fact table with for example 20 measures and links to another 20-30 dimensions, the situation will change. Alo to your 2 seconds.. Probably you have no aggregations defined. In case of snowflake design of dimenions, it again depends on the amount and structure of data inside the tables. In case there is a lot of records for attributes which will be used in higher lelves of hierarhies which are not referenced by attributes in lower level of hierarchies, you will end up with a lot of attributes in the higher level, which do not have relevat data. In case you make joins in the named queries and the the dimension will have a lot of attributes, than your joins will be executed a lot of time as by default SSAS is processing dimensions per attribute basis. So how many attributes you have, so many named queries will be sent to the source database engine. Proper star database schema not only helps analysis services, but it also helps in a lot of common DWH queries to perform much better as the database engines mostly contain special optimization for star schema queries and this is also the case of SQL Server Enterprise. And as mentioned above, 2M relativelly small records in DWH scenario is nothing. In case you have 200M, 500M or even more records, than you will see the benefits. Of course, there are cases, when it does not have sense to have everything in the star schema, but mostly large DWH scenarios will benefit from it. If you would like to learn a lot related to start schma and dimensional modelling, I would recommed you to read a great book [The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)][1] by Ralph Kimball [1]:
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.