x

What should I do to improve this bottle neck?

Hello all, first of all, sorry for the non descriptive title, I'm just too rushed so I couldn't come up with a better one.
Second:
I have a portion of my database the looks like the following diagram:
You can an image of the data model here: "http://img682.imageshack.us/img682/1089/testdatamodel.png" as I'm not allowed to user image tag.
I have contributors on the system, each write to many sources, and a source can have many working contributors. Users can subscribe to as many contributors as they like and as many sources as they like. Now, what I want to do is simply retrieve all the articles for certain user. These articles are either coming through a contributor or a source the user subscribes to. To make it easy, when a user subscribes to a source I simply copy all the sources contributors to the users_contributors table. One tricky piece, when I retrieve the user's articles I retrieve all the articles that he his contributors write, and all the articles that were published in the sources he follows where those articles doesn't have a valid contributor on the system. (I.E contributorID is null).
I created the following query:

 Select Articles.ArticleID, Articles.ContributorId, Contributors.Name, 
    Sources.Name, Articles.ArticleTitle
From Articles 
    		Inner Join Contributors On Articles.ContributorId = Sources.SourceID
    		Inner Join Sources On Articles.SourceId = Sources.SourceID			
Where Articles.ContributorId in (
    Select ContributorId from Users_Contributors
    Where UserID = 3
    )
    OR (

    	Articles.SourceId in (
    		Select SourceId from Users_Sources
    		Where UserID = 3
    		)

    		and 
    		Articles.ContributorId is null
    )

The problem with the above query is that, it doesn't return any article with contributorID null. I understand this is because of the join on the contributors table. What should I do in such a case?
Should I consider denormalization?
What are the prober fields to index on each table for this query to run fast (Rowset returned are approximately 10000)?
I need to support paging on this query, will With { } clause be appropriate to me, or should I consider another strategy?
Thanks in advance.
Ps: I'm using SQL Server 2008

more ▼

asked Nov 22 '09 at 04:25 AM in Default

7alwagy gravatar image

7alwagy
34 3 3 3

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

To get articles with ContributorID = null, make the join to the Sources table a LEFT JOIN.

You can look at the query plan in 2008, and it will suggest indexes but here are some starting ideas.

Users_Contributors (UserID, ContributorID) -- if USERID is always going to be 3 -- consider a filtered index that only contains rows where USERID = 3.

Articles (UserID, SourceID, ContributorID) -- same comments about filtered index

Sources (SourceID, Name)

By the way, how is join at the bottom supposed to work? What column in the contributors table itself are you joining to? Assuming you meant Contributors.ContributorID instead of Sources.SourceID, then

Contributors (ContributorID, Name)

Never cut and past bits and pieces. You should always show the code that is actually running as a whole. A join to a table that references no column in that table makes no sense.

Inner Join CONTRIBUTORS On Articles.ContributorId = Sources.SourceID

more ▼

answered Nov 23 '09 at 10:58 AM

Bob Hovious gravatar image

Bob Hovious
1.6k 5 6 9

(comments are locked)
10|1200 characters needed characters left

In terms of the paging i would point you here.
Are you able to post the actual query plan for the query ?

more ▼

answered Nov 22 '09 at 07:50 AM

dave ballantyne gravatar image

dave ballantyne
928 1 1 4

(comments are locked)
10|1200 characters needed characters left

If you need all columns of ARTICLES matching or not CONTRIBUTORS, then use a LEFT JOIN instead of INNER JOIN. Use the same rule for ARTICLES if it applies.

The first performance tip i can give is to replace your IN() by EXISTS or even LEFT JOINS:

-- replacing IN() by LEFT JOINS
Select Articles.ArticleID, Articles.ContributorId, Contributors.Name, 
    Sources.Name, Articles.ArticleTitle
From Articles 
                LEFT JOIN Contributors On Articles.ContributorId = Sources.SourceID
                LEFT JOIN Sources On Articles.SourceId = Sources.SourceID                      
    			LEFT JOIN Users_Contributors On Articles.ContributorId = Users_Contributors.ContributorId AND Users_Contributors.UserID = 3
    			LEFT JOIN Users_Sources On Articles.SourceId = Users_Sources.SourceId AND Users_Sources.UserID = 3
Where   Users_Contributors.ContributorId IS NOT NULL
    	OR 
    	( Users_Sources.SourceId IS NOT NULL AND Articles.ContributorId is null )

-- replacing IN() by EXISTS
Select  Articles.ArticleID, Articles.ContributorId, Contributors.Name, 
    	Sources.Name, Articles.ArticleTitle
From    Articles 
                LEFT Join Contributors On Articles.ContributorId = Sources.SourceID
                LEFT Join Sources On Articles.SourceId = Sources.SourceID                      
Where EXISTS
    (
    Select top 1 1 from Users_Contributors
    Where UserID = 3 AND Articles.ContributorId = Users_Contributors.ContributorId
    )
    OR EXISTS ( Select top 1 1  from Users_Sources
                Where UserID = 3 AND Articles.SourceId = Users_Sources.SourceId
                )
                AND
                Articles.ContributorId is null
    )

Basic Indexes should be all ids with clustered index, nonclustered index on Users_Contributors( ContributorId, UserID) and Users_Sources( SourceId, UserID ).

For paging, i would suggest row_numer() function, its something like:

with aux(
    Select	row_number() over( order by <some order criteria> ) as id,
    .
    . <remaining select>
    .
)
select * from aux where ID between @start_value and @final_value

Let me know if this helps...

more ▼

answered Nov 23 '09 at 11:15 AM

Gustavo gravatar image

Gustavo
592 4 4 7

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x977
x242

asked: Nov 22 '09 at 04:25 AM

Seen: 998 times

Last Updated: Nov 22 '09 at 04:40 AM