question

DavidD avatar image
DavidD asked

overly normalized data and optimization

hi all, i am currently developing in a database of people. currently the records throughout the db are keyed as nchar types. the question i have is would my queries perform better (with all the joins) if the key fields were all int fields? thanks
performanceindexesoptimization
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

DavidD avatar image DavidD commented ·
i should also mention i was using indexed views at one stage to improve performance, but i had to remove them due to the way data was pushed into the db in mass. the merge process was hanging and we believed this was a result of the server having to work to hard to rebuild the indexes on insert (or update not sure which).
0 Likes 0 ·
TimothyAWiseman avatar image
TimothyAWiseman answered
First, yes ints do compare faster than large datatypes, however the difference is normally small, especially when the larger datatype and the dataset in question is not excessively large. In other words, there is a performance benefit to switching to the smallest datatype possible, but it is normally outweighed by the fact you are carrying extra data to permit you to do that. Next, have you considered using indexed views? They can increase the performance of complex joins dramatically as well as hide the underlying complexity of the tables from developers. There is some information on this at [ http://www.sqlservercentral.com/articles/Indexed+Views/63963/][1] . Finally, this is a matter of some debate, but I think it is hard to normalize too much. It is however possible to do it incorrectly and that will cause problems, but I generally recommend going to 3NF and for certain situations 4NF or 5NF all make perfect sense. [1]: http://www.sqlservercentral.com/articles/Indexed+Views/63963/
6 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image WilliamD commented ·
@Aaron - Have you thought about droppong the view before loading and recreating it after. This would remove the performance hit during bulk load, but still give you the performance boost of the indexed view. +1 @TimothyAWiseman.
1 Like 1 ·
DavidD avatar image DavidD commented ·
thanks timothy, i did originally use indexed views however had to remove them. check my comments above. perhaps you can suggest to me why sql went through the roof when pumping new data in? i assumed it was due to sql having to write the physical index for the view while our import procedure was pulling in data... thoughts on how i can reimplement the indexed view without having it fall over when data is being imported?
0 Likes 0 ·
TimothyAWiseman avatar image TimothyAWiseman commented ·
@Aaron, for truly large inserts WilliamD makes a very good point. It often makes sense to drop indexes and even (in some specific cases) constraints and readd them afterwards. This generally only applies to very large inserts though. I have used indexed views extensively, and like all indexes they do cause a performance drop on write operations. Normally this is very small and acceptable though. If it is not you may want to look first and make sure that you are not indexing more than you need to (indexes with an extreme number of columns or extremely large columns will cause more of a performance problem then those on just small keys). Also, check your insert statements themselves to make sure they are written in an effecient fashion. Finally, while I always recommend hardware upgrades only as a last resort for performance problems, do make sure your hardware is up to the task of handling the demands you are making of it.
0 Likes 0 ·
TimothyAWiseman avatar image TimothyAWiseman commented ·
This article talks about improper data splitting. I found it after writing my original response: http://www.simple-talk.com/content/article.aspx?article=1117 Perhapse something like that is what you mean by overnormalization?
0 Likes 0 ·
DavidD avatar image DavidD commented ·
how large an insert is large? originally we the view had i think 12 joins, and wee were updating/inserting 800 records at a time into each using sql 2008's new merge function. at the time i thought it was such a small insert that it couldnt affect performance but removing it made such a difference that i conceded it must have been the cause. i will investigate dropping the index and recreating it and look at optimizing the merge proc. thanks again all, awesome feedback
0 Likes 0 ·
DavidD avatar image DavidD commented ·
and also on that.... do update commands that update one or two fields (in a table that is contained in a indexed view) get the performance hit from the indexed view? i would assume yes due to the updated data having to be physically written into the indexes page. how about an update that occurs that doesnt actually update the table i.e. the update executed (due to faulty logic in the where clause) but didnt actually change any values?
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
Unless you're looking at an instance where the address field has been normalized (one table for numbers, another for streets, that sort of thing), it's pretty unlikely that you're looking at an "overly" normalized system. Normalization is a process of defining the necessary information to meet the business requirements. For example, email addresses. It's very easy to say that each person gets one email address, but then, what happens when a person has two. So, you split email off & put it in a seperate table. Then you find that multiple people share an email address. Now things get hard. You can support duplicate values in the email table, or, you need to look at building an interim table to support associating a given email with more than one person and keep the email addresses unique (as they probably should be). We've just gone from one table to three. Now toss in phone numbers and addresses and you're suddenly looking at 10 or more tables to bring back basic information on an individual. But, I wouldn't call that overly normalized. I'd say it's normalized enough to support the data. As to the question on integers being better choices, yes, they are, for the reasons @Timothy said. Queries will perform better on smaller data types. The general rule is move only the data you need to move when you need to move it and store it pretty much the same way. Only make the values big enough to store what's needed. It basically comes down to the more data you can get onto a page, and that's either an index page or a data page, the better your performance will be because SQL Server will need to access fewer pages to return the data needed.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

DavidD avatar image DavidD commented ·
thanks grant. yes i agree, and im not suggeesting normalization is bad but in this case there are tables that for example has a person and a text item associated with the person (amongst several other association tables) and looking at the business logic in the splitting of tables i cant make any justification for the seperation. ie one person would only ever have one text item assigned to them, therefore i dont see the need to split them. nevertheless i take your point and will keep that in mind.
1 Like 1 ·
Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
Good point. It's not unheard of for people building databases to make mistakes.
1 Like 1 ·
Henrik Staun Poulsen avatar image
Henrik Staun Poulsen answered
Are your tables Clustered? (i.e. do they have a clustered index?) If no, you could try this. Are you maintaining your indexes regularly? If no, you could try this. Are you shrinking and growing your database more than once a month? If yes, don't.
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

DavidD avatar image DavidD commented ·
yes, yes and definately not. the db as a whole performs quite well, its just the some of the major queries that need some work. thanks
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.