question

Peter001 avatar image
Peter001 asked

Query Performance tremendous slow... 4 hours and still executing

Hello Techie, I have two tables with a highly denormalized nature, Master and child table. records In table: 40 million For a given specific line there is some valid footnote, and I need to check in child table whether a child table contain only those footnote for the line which is present in Master. If not than that record need to be in output **i tried to do like given below** but for small resultset this is giving output correctly, but for table have million of records even after 4 hours query is still executing. **may someone please share your expertise.or some other way to achieve this** Table and data DECLARE @MASTER TABLE ( Footnote VARCHAR (50), Line VARCHAR (50) ) INSERT @MASTER SELECT 'O031', 'EXHAUSTMUF' UNION ALL SELECT 'V049', 'CAMSHAFT' UNION ALL SELECT 'V049', 'DSHBRDCOVR' UNION ALL SELECT 'V049', 'EXHAUSTMUF' UNION ALL SELECT 'O040', 'EXHAUSTMUF' UNION ALL SELECT 'V133', 'DSHBRDCOVR' UNION ALL SELECT 'V133', 'EXHAUSTMUF' UNION ALL SELECT 'E014', 'CAMSHAFT' UNION ALL SELECT 'E014', 'EXHAUSTMUF' UNION ALL SELECT 'O062', 'EXHAUSTMUF' UNION ALL SELECT 'O194', 'EXHAUSTMUF' DECLARE @ChildTable TABLE ( line VARCHAR (50), footnote_list VARCHAR (50) ) INSERT @ChildTable select 'EXHAUSTMUF', 'O031, V049' UNION ALL select 'EXHAUSTMUF' ,'O040, V133' UNION ALL select 'EXHAUSTMUF', 'E014, O062, O194' UNION ALL select 'DSHBRDCOVR', 'USP2,LTS9' UNION ALL select 'DSHBRDCOVR', 'V049' Output **DSHBRDCOVR USP2,LTS9** I tried to do something like this. First added one function: create FUNCTION [dbo].[fnSplitString]( @string NVARCHAR(MAX), @delimiter CHAR(1), @keylist nvarchar(50)) RETURNS @output TABLE(splitdata NVARCHAR(MAX),keyName nvarchar(50)) BEGIN DECLARE @start INT, @end INT SELECT @start = 1, @end = CHARINDEX(@delimiter, @string) WHILE @start < LEN(@string) + 1 BEGIN IF @end = 0 SET @end = LEN(@string) + 1 INSERT INTO @output (splitdata,keyName) VALUES(ltrim(rtrim( SUBSTRING(@string, @start, @end - @start))),@keylist) SET @start = @end + 1 SET @end = CHARINDEX(@delimiter, @string, @start) END RETURN END --storing splitted values DECLARE @SplittedTable TABLE ( splitted VARCHAR (50), keys VARCHAR (50) ) --getting max rows declare @maxRows int =(SELECT count(*) FROM @childtable) --loop over each rows WHILE(@maxRows!=0) BEGIN DECLARE @footNotList nvarchar(max)=( SELECT abc.footnote_list FROM (SELECT ct.footnote_list, ROW_NUMBER() OVER (ORDER BY ct.footnote_list DESC) AS RowNumber FROM @ChildTable AS ct ) abc WHERE abc.RowNumber=@maxRows ) --inserting into splitted table INSERT INTO @SplittedTable SELECT * FROM dbo.fnSplitString(@footNotList,',',@footNotList) set @maxRows=@maxRows-1; END --selecting by keys and finding them into master SELECT * FROM @ChildTable ct WHERE ct.footnote_list IN ( SELECT DISTINCT st.keys FROM @SplittedTable AS st WHERE st.splitted NOT IN (SELECT Footnote FROM @MASTER m )) Thanks
sql-server-2008t-sqldelimited-string
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
EDIT Further shortened the script and got rid of storing intermediate results in the Splitted table. See bottom of this post END EDIT There are two things to improve performance in this: 1) Use CROSS APPLY to call the function. That way, you won't need to loop over the rows in your table, instead you let SQL Server internally do the looping for you, applying the function call to each row. That's a major performance improvement. 2) Rewrite the split function to use a Tally- or Numbers-table, and change it to an inline Table Valued Function (they perform sooo much better than multi statement table valued functions when applied to a dataset). This is what I have done with the function (there are much better implementations of a Tally-table split, search on SQL Server Central for Jeff Moden's implementation of DelimitedSplit implementations). CREATE FUNCTION [dbo].[fnSplitString]( @string NVARCHAR(MAX), @delimiter CHAR(1), @keylist nvarchar(50)) RETURNS TABLE AS RETURN( WITH CTE AS( SELECT 1 as n UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ),CTE2 AS( SELECT ROW_NUMBER() OVER(ORDER BY c.n) as n FROM CTE c CROSS JOIN CTE c2 CROSS JOIN CTE c3 CROSS JOIN CTE c4 CROSS JOIN CTE c5 CROSS JOIN CTE c6 ) SELECT TOP(LEN(@string)) splitdata = ltrim(rtrim( SUBSTRING(@string, n, CHARINDEX(@delimiter, @string + @delimiter, n)-n) )), keyname= @keylist FROM CTE2 WHERE n
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Peter001 avatar image Peter001 commented ·
Hi Magnus Thanks a lot for superb idea... but i am getting an error like "TOP clause contains an invalid value."
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
The function won't work for empty string and not for NULL values either. I think you can change the TOP clause to: TOP(LEN(COALESCE(@string),'')+1) Not sure how that would handle NULL values and I don't have a SQL Server around just now. But just try SELECT * FROM dbo.fmSplitString(NULL,',', NULL) and see if you get anything back. (efter changing the TOP-CLAUSE that is...
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
Now I've tested it. TOP(0) actually works. BUT TOP(LEN(NULL)) won't work - it will evaluate to TOP(NULL) which is invalid. So you just need to change the TOP-clause to: TOP(LEN(COALESCE(@string,''))) Forget about my previous comment.
0 Likes 0 ·
Peter001 avatar image Peter001 commented ·
Thanks a lot Magnus....... :)
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
Glad I could help :) Just out of curiousity - how much were you able to speed up query execution?
0 Likes 0 ·
David Wimbush avatar image
David Wimbush answered
I can't think of a much quicker way to do this with a table design that is so not designed for looking at this way. The only excuse for storing multiple values in one column like that is if that's the only way the applications need to see it and it avoids constantly stringing child rows together. It's forcing you to do a scalar function inside a cursor which is about as bad as it gets for performance. (OK, you could throw in some nested views to really kill it!) Is this a one-off query or will you have to do it regularly? If it's a one-off, they will just have to wait for it to run. If that badly affecta a production or reporting server, backup the database to a test server and run it there. If you have to do this regularly you should try to get this normalised. With that many rows it's never going to be a 2 second query but you should be able to get much better than 4+ hours.
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Peter001 avatar image Peter001 commented ·
Hi Magnus. Greetings!!.. i know its too late, as i faced this today only. the same logic i applied to other table have row count (19017962 row(s) affected). but unfortunately it took 6 hour and still executing. is there any other way to do it. kindly suggest. Thanks a lot
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.