question

marinehero avatar image
marinehero asked

Surely there must be a better way of doing this query?

So my table definition is as follows: CREATE TABLE [dbo].[MessagesTrending]( [TrendingDate] [datetime] NOT NULL, [TrendingPosition] [int] NOT NULL, [MessageID] [bigint] NOT NULL, [MessageCreated] [datetime] NOT NULL, [TimeZone] [nvarchar](50) NULL CONSTRAINT [DF_MessagesTrending_TimeZone] DEFAULT (getdate()), [LocalDate] [datetime] NULL CONSTRAINT [DF_MessagesTrending_LocalDate] DEFAULT (getdate()), [TotalViews] [int] NOT NULL CONSTRAINT [DF_MessagesTrending_TotalViews] DEFAULT ((0)), [TotalLikes] [int] NOT NULL CONSTRAINT [DF_MessagesTrending_TotalLikes] DEFAULT ((0)), [TotalFavorites] [int] NOT NULL CONSTRAINT [DF_MessagesTrending_TotalFavorites] DEFAULT ((0)), [TotalScore] [int] NOT NULL CONSTRAINT [DF_MessagesTrending_TotalScore] DEFAULT ((0)), [IsCurrent] [bit] NOT NULL CONSTRAINT [DF_MessagesTrending_IsCurrent] DEFAULT ((1)), [CategoryID] [int] NOT NULL CONSTRAINT [DF_MessagesTrending_Category] DEFAULT ((0)), [Saves] [decimal](18, 0) NOT NULL DEFAULT ((0)), [Impressions] [decimal](18, 0) NOT NULL DEFAULT ((0)), [Comments] [decimal](18, 0) NOT NULL DEFAULT ((0)), [TotalDwellTime] [decimal](18, 0) NOT NULL DEFAULT ((0)), [Photos] [decimal](18, 0) NOT NULL DEFAULT ((0)), [Age] [decimal](18, 0) NOT NULL DEFAULT ((0)), [NewScore] AS (((((power([Saves],(3))*(1000))/power([Impressions]+(1),(2))+(power([Comments],(3))*(50))/power([Impressions]+(1),(2)))+([TotalDwellTime]/([Impressions]+(1)))/[Photos])+[Photos]/(10))*((2)/log((0.1)*[Age]+(1.13))+(0.2))), CONSTRAINT [PK_MessagesTrending] PRIMARY KEY CLUSTERED ( [TrendingDate] ASC, [TrendingPosition] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] Now I have a periodic stored proc called UpdateTrending which executes lets say every hour The Key column the users will be using to query these trending messages is a Computed column **NewScore** **MY QUESTION** (Besides this query being horrendous) is: Why can't I use inline variables that I can reuse within the query for the computed column? example being: > (select count(FavoriteID) from > MessageFavorites MV where > MV.MessageID=PM.MessageID) as **Saves** ALTER PROCEDURE [dbo].[UpdateTrending] AS BEGIN SET NOCOUNT ON; DECLARE @TrendCursor CURSOR DECLARE @LastUpdated datetime SET @LastUpdated = NULL SET @TrendCursor = CURSOR FOR SELECT MAX(TrendingDate) AS TDate FROM dbo.MessagesTrending OPEN @TrendCursor FETCH NEXT FROM @TrendCursor INTO @LastUpdated CLOSE @TrendCursor DEALLOCATE @TrendCursor PRINT('@LastUpdated=' + Convert(varchar(20), @LastUpdated)) IF (@LastUpdated IS NULL) BEGIN SET @LastUpdated = DATEADD(hour, -1, GetDate()) END PRINT('DATEDIFF=' + Convert(varchar(20), DATEDIFF(hour, @LastUpdated, GETDATE()))) IF (DATEDIFF(hour, @LastUpdated, GETDATE()) >= 0) BEGIN BEGIN TRAN UpdateTransaction TRUNCATE TABLE MessagesTrending --SET IsCurrent = 0 WHERE IsCurrent = 1; SET @LastUpdated = GETDATE() INSERT INTO MessagesTrending SELECT @LastUpdated AS TrendingDate, ROW_NUMBER() OVER(Order by VXX.TotalScore DESC) AS TrendingPosition , VXX.MessageID, VXX.Created, VXX.TimeZone, VXX.LocalDate, VXX.[Views], VXX.Likes, VXX.Favorites, VXX.TotalScore, VXX.IsCurrent, VXX.CategoryID, VXX.Saves, VXX.Impressions, VXX.Comments, VXX.TotalDwellTime, VXX.Photos, VXX.Age FROM ( SELECT TOP (2000) MessageID , Created, TimeZone, LocalDate , Views, Likes, Favorites , ( (select count(CommentID) from MessageComments MC where MC.MessageID=PM.MessageID) -- comments + (select count(FavoriteID) from MessageFavorites MV where MV.MessageID=PM.MessageID) -- saves + ((select count(LikeID) from MessageLikes ML where ML.MessageID=PM.MessageID)/2)) -- likes / cast( (dbo.InlineMax(DATEDIFF(day,PM.Created,GETDATE())/1,1) ) as decimal) * 1000 as TotalScore , 1 AS IsCurrent , CategoryID , (select count(FavoriteID) from MessageFavorites MV where MV.MessageID=PM.MessageID) as Saves , (select count(ImpressionId) from Impressions IM where IM.PostID=PM.MessageID) as Impressions , (select count(CommentID) from MessageComments MC where MC.MessageID=PM.MessageID) as Comments , (select coalesce((select sum(DurationInMillis) from Impressions IM where IM.PostID = PM.MessageID),0)) as TotalDwellTime , MediaCount as Photos , cast( (dbo.InlineMax(DATEDIFF(day,PM.Created,GETDATE()),1) ) as decimal) as Age FROM dbo.PostedMessages PM where Deleted is NULL ORDER BY TotalScore DESC ) AS VXX IF @@ERROR <> 0 BEGIN PRINT('Rollback' + Convert(varchar(20), @@ERROR)) ROLLBACK END ELSE BEGIN PRINT('Commited' + Convert(varchar(20), @LastUpdated)) COMMIT END END END
stored-procedurestsqlsub-query
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

KenJ avatar image KenJ commented ·
I would not recommend clustering the NewScore column. It seems likely to change nearly every time the row changes and is a recipe for heavy fragmentation. Perhaps add an identity column to the table and cluster that. You can still index the NewScore column for good user seek performance, but it won't fragment the whole table so quickly.
1 Like 1 ·
Tom Staab avatar image
Tom Staab answered
First, I'll answer your question about the inline variable. The only part of a SELECT statement that can reference aliases is the ORDER BY clause because it happens after those alias calculations have been completed. The only other way to do what you are suggesting would be to force one step before the other by using a CTE or something similar. On that note, I attempted to rewrite your query using CTEs because I think (based solely on your query and having no access to your actual database, of course) it will perform much better. Rather than calculate the aggregates individually for each row of the outer query, I changed it to pre-calculate the aggregates for the whole set and then apply those results to the main query. I made a few other minor changes and have some comments: 1. I prefer using the "alias = " syntax rather than "AS alias" because I find it easier to read. There is no benefit to either, so pick the one you prefer. 2. You had the result of InlineMax divided by 1 in your TotalScore calculation. I didn't see the point, so I removed it. 3. I changed GETDATE to SYSDATETIME because it's preferred for versions beyond 2005. 4. I always recommend prefixing every column with the table alias so you know where you got that value. For example, which table is referenced by "WHERE Deleted is null"? 5. I recommend specifying precision and scale when using the decimal data type. 6. I used CONVERT rather than CAST in my rewrite, but that's just a personal preference. 7. I don't know what the InlineMax function does, but it's best to avoid scalar functions when possible. Having said all of that, here's my version. Please let us know if you have further questions. I hope this helps. WITH AggComments AS ( SELECT MessageID, COUNT(*) AS TheCount FROM MessageComments GROUP BY MessageID ) , AggFavorites AS ( SELECT MessageID, COUNT(*) AS TheCount FROM MessageFavorites GROUP BY MessageID ) , AggLikes AS ( SELECT MessageID, COUNT(*) AS TheCount FROM MessageLikes GROUP BY MessageID ) , AggImpressions AS ( SELECT PostID , TheCount = COUNT(*) , Duration = SUM(DurationInMillis) FROM Impressions GROUP BY PostID ) SELECT @LastUpdated AS TrendingDate , ROW_NUMBER() OVER(Order by VXX.TotalScore DESC) AS TrendingPosition , VXX.MessageID, VXX.Created, VXX.TimeZone, VXX.LocalDate, VXX.[Views], VXX.Likes, VXX.Favorites, VXX.TotalScore, VXX.IsCurrent, VXX.CategoryID , VXX.Saves, VXX.Impressions, VXX.Comments, VXX.TotalDwellTime, VXX.Photos, VXX.Age FROM ( SELECT TOP (2000) MessageID , Created, TimeZone, LocalDate , Views, Likes, Favorites , TotalScore = (ISNULL(MC.TheCount, 0) + ISNULL(MV.TheCount, 0) + ISNULL((ML.TheCount / 2), 0) * 1000 / CONVERT(decimal, IIF(DATEDIFF(day, PM.Created, SYSDATETIME()) > 1, DATEDIFF(day, PM.Created, SYSDATETIME()), 1)) , IsCurrent = 1 , CategoryID , Saves = ISNULL(MV.TheCount, 0) , Impressions = ISNULL(IM.TheCount, 0) , Comments = ISNULL(MC.TheCount, 0) , TotalDwellTime = ISNULL(IM.Duration, 0) , Photos = MediaCount , Age = CONVERT(decimal, IIF(DATEDIFF(day, PM.Created, SYSDATETIME()) > 1, DATEDIFF(day, PM.Created, SYSDATETIME()), 1)) FROM dbo.PostedMessages PM LEFT JOIN AggComments MC ON MC.MessageID = PM.MessageID LEFT JOIN AggFavorites MV ON MV.MessageID = PM.MessageID LEFT JOIN AggLikes ML ON ML.MessageID = PM.MessageID LEFT JOIN AggImpressions IM ON IM.PostID = PM.MessageID WHERE Deleted is NULL ORDER BY TotalScore DESC ) AS VXX ;
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Tom Staab avatar image Tom Staab ♦ commented ·
I fixed the null issue by adding some ISNULL calls when referencing values from the tables accessed via LEFT JOIN. I also replaced your scalar function call with equivalent code in this situation.
0 Likes 0 ·
Tom Staab avatar image Tom Staab ♦ commented ·
I forgot to mention another thing. I don't see any reason for the cursor at the beginning of your procedure. The following code accomplishes the same thing: SET @LastUpdated = ( SELECT MAX(TrendingDate) FROM dbo.MessagesTrending );
0 Likes 0 ·
marinehero avatar image
marinehero answered
@Tom, Very good points and advice ! I like your approach... fwiw: tried your solution but am getting a TotalScore of NULL :( PS: The Column I really need to rank by is the Computed **NewScore** fyi: the scalar inline function was simply to get the max of 2 vars (basically avoiding a divide by zero) ALTER function [dbo].[InlineMax](@val1 int, @val2 int) returns int as begin if @val1 > @val2 return @val1 return isnull(@val2,@val1) end
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Tom Staab avatar image Tom Staab ♦ commented ·
What version of SQL Server are you using?
0 Likes 0 ·
marinehero avatar image marinehero commented ·
Microsoft SQL Server 2012 - 11.0.2100.60 (X64) Feb 10 2012 19:39:15 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor) Running on Amazon RDS
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.