
What is denormalization

What is denormalization, and what are the benefits of doing it?

When would you denormalize a data model, and when would denormalization be a bad move?


asked Oct 19 '09 at 05:14 PM by Kev Riley

Um, isn't that when a computer geek's attempt to appear normal starts to wear off? ;-)
Oct 23 '09 at 01:13 AM RBarryYoung

3 answers

The obvious answer is that denormalization is the process of undoing the normalization done in a typical database design. So what does that mean? It means allowing redundant data in order to reduce the number of multi-table (or view) joins needed.

Picture a normalized database as one with four small tables (customers, locations, products, customer_orders) joined together on appropriate keys. A denormalized version might have just one table with all of the necessary columns.
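
As a rough T-SQL sketch of those two shapes (all table and column definitions here are invented for illustration, not taken from the question):

```sql
-- Normalized: four narrow tables related by keys
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(100)
);
CREATE TABLE locations (
    location_id INT PRIMARY KEY,
    city        VARCHAR(100)
);
CREATE TABLE products (
    product_id   INT PRIMARY KEY,
    product_name VARCHAR(100),
    price        MONEY
);
CREATE TABLE customer_orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id),
    location_id INT REFERENCES locations(location_id),
    product_id  INT REFERENCES products(product_id),
    order_date  DATE
);

-- Denormalized: one wide table repeating customer/location/product values per order
CREATE TABLE customer_orders_flat (
    order_id      INT PRIMARY KEY,
    customer_name VARCHAR(100),  -- repeated for every order by the same customer
    city          VARCHAR(100),
    product_name  VARCHAR(100),
    price         MONEY,
    order_date    DATE
);
```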

Normalization Pros:

  • no (or at least reduced) redundancy
  • faster writes if just 1 smaller table needs updating/inserting/deleting
  • ideal for transaction (OLTP) systems

Denormalization Pros:

  • easier for nontechnical users to understand
  • faster reads if all or most of the data is needed anyway (see the query sketch after this list)
  • ideal for reporting (OLAP) systems
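
To make the read/write tradeoff concrete, here is a hedged comparison using the invented tables sketched above (not code from the answer itself):

```sql
-- Normalized read: the report needs a four-way join
SELECT c.customer_name, l.city, p.product_name, o.order_date
FROM customer_orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
JOIN locations AS l ON l.location_id = o.location_id
JOIN products  AS p ON p.product_id  = o.product_id;

-- Denormalized read: one table, no joins
SELECT customer_name, city, product_name, order_date
FROM customer_orders_flat;

-- Normalized write: renaming a product touches one row in one table
UPDATE products SET product_name = 'Widget v2' WHERE product_id = 42;

-- Denormalized write: the same rename must touch every matching order row
UPDATE customer_orders_flat
SET product_name = 'Widget v2'
WHERE product_name = 'Widget';
```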

This is just a high-level overview. If you would like more information, I suggest searching the internet using some of the keywords mentioned here.


answered Oct 19 '09 at 06:05 PM by Tom Staab


You may increase performance and reduce complexity in some cases with denormalization in a data warehouse, because you can reduce the number of joined tables. But since you are storing redundant data, you will increase the size of the table and its indexes, adding more IO, and that will in the end slow you down. Sometimes it is more clever to break the code into smaller pieces with temp tables and so on (a rough sketch follows).
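
As an illustrative sketch of that temp-table approach (the schema names here are invented, not from the answer), you might stage an aggregate once and reuse it rather than repeating a wide join inside one large query:

```sql
-- Stage a pre-aggregated intermediate result once
SELECT customer_id,
       SUM(order_total) AS total_spent
INTO   #customer_totals
FROM   sales_orders
GROUP BY customer_id;

-- Later steps join against the small temp table instead of the big fact table
SELECT c.customer_name, t.total_spent
FROM   #customer_totals AS t
JOIN   customers        AS c ON c.customer_id = t.customer_id;
```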

In an OLTP database it would normally be a bad idea to denormalize, due to the cost of data maintenance; you will lose more than you gain from denormalization.

In a normalized database you break redundant data out into separate tables. For example, you would probably have a country table with a country code, and every table that needs a country stores the country code instead of the full name. If you don't, what happens when a country name changes? You would have to update every row in every table that stores the name, and that is not an efficient solution. Yes, country names don't change that often, but they do (a minimal sketch of the lookup pattern follows).
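
A minimal sketch of that lookup pattern (invented names; the two-letter codes are just illustrative):

```sql
-- The country name lives in exactly one place
CREATE TABLE countries (
    country_code CHAR(2) PRIMARY KEY,   -- e.g. 'SE', 'CZ'
    country_name VARCHAR(100) NOT NULL
);

-- Referencing tables store only the code
CREATE TABLE suppliers (
    supplier_id   INT PRIMARY KEY,
    supplier_name VARCHAR(100),
    country_code  CHAR(2) REFERENCES countries(country_code)
);

-- A name change is a single-row update, not a sweep across every referencing table
UPDATE countries SET country_name = 'Czechia' WHERE country_code = 'CZ';
```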

Read more about denormalization at:

http://en.wikipedia.org/wiki/Denormalization


answered Oct 19 '09 at 06:07 PM by Håkan Winther

Not necessarily more IO. Storing duplicate data could include summarizing or pre-calculating some of that data, so it could actually reduce I/O and CPU resources.
Oct 22 '09 at 07:50 PM Steve Jones - Editor ♦♦
I agree, and it is common to include summaries in data marts. I kept it simple and tried to make people aware of possible side effects, but as you say, it depends on what you denormalize. Every DBA's favourite answer: "it depends" :)
Oct 23 '09 at 03:44 AM Håkan Winther

Denormalization means the deliberate creation of extra join dependencies that aren't implied by the superkeys of a relation. This is generally undesirable because it introduces redundancy into the database, which can cause incorrect results and ambiguity.

Normalization/denormalization has nothing to do with performance as such, because it specifies only the logical characteristics of the database, not its physical implementation. However, given the poor degree of physical data independence in many DBMSs, it is sometimes necessary to compromise your logical model in order to make corresponding changes to the underlying physical implementation. Another reason to make such compromises is if you need to implement some constraint or other logic that the DBMS cannot otherwise support.


answered Oct 19 '09 at 06:13 PM by David 1

Just to reiterate: in a (currently non-existent) RDBMS that perfectly separated the physical and logical implementations of the database, normalization/denormalization would have no performance implications. In all currently existing RDBMS systems, the performance implications can be enormous.

You can partially mimic those benefits, at least for read operations, through the use of indexed views, but that does come with some tradeoffs (a brief sketch follows). http://www.sqlservercentral.com/articles/Indexed+Views/63963/ is one place that discusses this to a degree.
Oct 22 '09 at 09:49 PM TimothyAWiseman
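
A minimal indexed-view sketch, reusing the invented order tables from the first answer (illustrative, not code from the linked article):

```sql
-- The view must be schema-bound and reference its tables with two-part names
CREATE VIEW dbo.v_order_summary
WITH SCHEMABINDING
AS
SELECT o.order_id, c.customer_name, o.order_date
FROM   dbo.customer_orders AS o
JOIN   dbo.customers       AS c ON c.customer_id = o.customer_id;
GO

-- The unique clustered index materializes the join result, so reads
-- get denormalized speed while the base tables stay normalized
CREATE UNIQUE CLUSTERED INDEX ix_v_order_summary
    ON dbo.v_order_summary (order_id);
```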