question

Sharma avatar image
Sharma asked

SQL Query using max sub-query?

Hi, What the chances are of optimize given below SQL query? SELECT SUM(BPD.CID_TAX) AS TAX FROM COMP_INV_HD B , COMP_INV_DETAIL BPD , SKU_PURCHASE_TAX T , DISTINFO_MASTER DI WHERE BPD.CID_CIM_INV_REFR = B.CIH_INV_REFR AND BPD.CID_ITM_ID = T.SKU7_ID AND T.CUSTOMER_STATE_ID = DI.STATE_ID AND T.ACTIVE_FROM = ( SELECT MAX(G.ACTIVE_FROM) FROM SKU_PURCHASE_TAX G WHERE BPD.CID_ITM_ID = G.SKU7_ID AND G.ACTIVE_FROM sqlsql-server-2000
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
To optimize, you would need to see the query in context. You would need to baseline the performance and the execution plan, so that any amendments can be validated. Certainly we can offer different ways of writing the same query, but often the knowledge of how many rows are in tables, how do they relate, what does the data mean - all play a part.
3 Likes 3 ·
Sharma avatar image Sharma commented ·
Table rows are as given below- COMP_INV_DETAIL - 56410 SKU_PURCHASE_TAX - 116526 COMP_INV_HD-6548 DISTINFO_MASTER -1
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
Take a read through this article I wrote on getting [versioned data][1]. It should give you a number of methods for tuning this up. Without seeing your indexes and the execution plan, it's hard to suggest precise methods for making this particular query work better. The one thing I will say is, TOP 1 with an ORDER BY consistently worked better than a MAX statement and, the optimizer even turned MAX statements into TOP 1 statements, so you may as well use that syntax anyway. [1]: http://www.simple-talk.com/sql/database-administration/sql-strategies-for-versioned-data/
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
I'll refer you back to the article. You don't have to use TOP or MAX, you can use ROW_NUMBER and that could sometimes out perform TOP. All of them used sub-queries. If you can restructure your tables, there is another method, outlined here, http://www.simple-talk.com/sql/sql-training/yet-another-sql-strategy-for-versioned-data/, that might get you a little better performance. In my tests it was either faster or slower, at different data sets. As to your final question, the difference is between ANSI 89 and ANSI 92 joins. As defined, those two queries, no difference. But more complex queries can have big differences. JOIN criteria, in general, should be in the JOIN clause, WHERE criteria, in general, should be in the WHERE clause.
1 Like 1 ·
Sharma avatar image Sharma commented ·
(1) Table data:- COMP_INV_DETAIL – 56410 SKU_PURCHASE_TAX - 116526 COMP_INV_HD-6548 DISTINFO_MASTER -1 (2) No Index available on tables. (3) 35 Minutes taking in query exec. (4) 1 Sec. taking in query exec. if I use TOP 1 with an ORDER BY. (5) If I created index on tables then with in 1 Sec. with MAX also. After seen all above points this is true that TOP 1 is consistently work better than a MAX statement and IF we create index then also query exec will be fast but I want to know any other option to write this query without sub-query ? Q.1 How I can write this query without sub-query? Q.2 Is this the only single approach available on SQL for fill mine requirement? Q.3 What is difference and performance impact in compare of given below two queries? select * from b, a where a.t=b.t Or select * from b inner join a on a.t=b.t
0 Likes 0 ·
Sharma avatar image Sharma commented ·
I have to make this query to work on SQL 2000 also and ROW_NUMBER does not support in SQL 2000. Is there will be any performance impact if we use ANSI 89 joins in compare of ANSI 92 JOINS.
0 Likes 0 ·
Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
There can be, yes. But not because they're treated differently by the optimizer. You're putting work on the optimizer to figure out the join criteria when you put everything in the WHERE clause. That added work can affect the execution plans that the optimizer is able to generate. That can affect your performance. Also, in 2008 and above, you can't use the ANSI 89 outer join syntax, *= or =*. It's been removed. You must use OUTER JOIN. Better to just write your code to the latest and most used standard than to go the other way.
0 Likes 0 ·
Blackhawk-17 avatar image Blackhawk-17 commented ·
@amardeep - why are there no indexes on the tables?
0 Likes 0 ·
Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
My five cents: If you are not **absolutely** certain that you will always get a hit on the subquery, you could consider using COALESCE. Otherwise, if you get no results from the subquery, it would return NULL and the equals operator and NULL values don't work well together. For example: SELECT CASE WHEN null = null THEN 1 ELSE 0 END would give you a **0** as result when one might expect **1**. So I would consider: .. AND T.ACTIVE_FROM = COALESCE(( SELECT MAX(G.ACTIVE_FROM) FROM SKU_PURCHASE_TAX G WHERE BPD.CID_ITM_ID = G.SKU7_ID AND G.ACTIVE_FROM
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Sharma avatar image Sharma commented ·
It also work great but not like TOP command in WITH TOP command query execution time was 1 sec. and with COALESCE it took 3 Sec. Why COALESCE command is also reducing execution time of query? How COALESCE(A,-99) work with sub-query and where it can create issue which we can not get through max()?
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
COALESCE works regardless of if you use TOP or MAX. COALESCE is a function that takes a number of arguments, and returns the first argument which is not null. For example **COALESCE(null,1)** would return 1 while **COALESCE(0,1)** would return 0. When you put a sub-query as the first argument to COALESCE, it returns either the result of the subquery (if the subquery returns a non-null result), otherwise it returns the second argument. Using COALESCE causes SQL Server to make another function call, another comparison, and that might hurt performance a tiny bit, but it helps you produce predictable results, so that's a price I'm usually willing to pay.
0 Likes 0 ·
Blackhawk-17 avatar image Blackhawk-17 commented ·
As @Magnus Ahlkvist says, COALESCE is not necessarily an optimization for your query but is a safety measure to help get correct results and avoid issues with NULL. NULL is a tricky beast and unless you handle it, and understand it both in terms of its logic and your data, it can cause a lot of head scratching.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.