|
I want to select all departments along with the name of any employees that happen to be in that department and who earn > 10000. I have written to queries that select the information I want: SELECT Departments.Name, Employees.Name, Employees.Salary FROM Departments LEFT OUTER JOIN Employees ON Employees.DepartmentId = Departments.Id AND Employees.Salary > 10000; SELECT Departments.Name, Employees.Name, Employees.Salary FROM Departments LEFT OUTER JOIN ( SELECT * FROM Employees WHERE Salary > 10000 ) AS Employees ON Employees.DepartmentId = Departments.Id; The actual execution plan for both queries is identical. So in this example I can use either query. My question is whether this will be true for any query. Does SQL Server treat additional criteria in join clauses as if we had created a derived table? I ask as I am writing code to generate SQL from a graphical design tool and am wondering which style to use. I welcome any thoughts or suggestions you might have.
(comments are locked)
|
|
The answer is, it depends. The criteria you're showing here are simple enough and the joins are straight forward, so they're operating in the same fashion. However, outer joins like you're defining with filtering criteria in the ON clause basically function like inner joins. Try changing the query to put the filter into the WHERE clause and see if the data changes. In general, I would not use derived tables if I can define the joins without them. I would only use them when I can't define the criteria any other way. +1 - spot on answer, and includes 'it depends'. I'd +2 if I could :)
Jan 05 '10 at 11:03 AM
Matt Whitfield ♦♦
I'll add the second +1 since Matt can't ;-) In most testing I have done with fairly simple queries like this the query plans will be the same if the queries are equivalent. With truly complicated queries that could easily change. Unless there is a real performance difference, I try to focus on readability and maintainability in writing queries. Even for ad hoc queries this makes it less likely I made a mistake, and for long lived code it can be vital. Avoiding a derived table when possible normally makes it easier to read, and using a CTE can help readability when you must use them.
Jan 05 '10 at 11:26 AM
TimothyAWiseman
+1 for the answer, but if he would put the filter in the WHERE clause, as you suggest, he will only get departments that does have at least one employee with more than 10000 in salary (sounds like INNER JOIN to me). To get all departments, and only employees with more than 10000 in salary he has to filter in the ON clause.
Jan 05 '10 at 12:18 PM
Håkan Winther
The WHERE clause is evaluated after the JOIN clause, so Hakan is correct, you would lose the benefit of the LEFT JOIN filtering in the WHERE.
Jan 05 '10 at 04:41 PM
Blackhawk-17
WHERE is not always evaluated after JOIN. The optimizer is smart enough to put filtering where it thinks it will work best. That's why you can see identical execution plans. And filtering in the WHERE clause and the ON clause only changes the data returned when the criteria are on the opposite side of the JOIN as Jack outlined below. Otherwise you don't see any difference in the execution plans or the data returned. When execution plans get really complex, moving things from the WHERE to the ON clause can have serious impact on the plan created, but only because the optimizer times out.
Jan 05 '10 at 05:30 PM
Grant Fritchey ♦♦
(comments are locked)
|
|
Posting an answer instead of a comment because the comment wouldn't let me put in the code I wanted to include. Grant, Doesn't adding the Criteria on the RIGHT table in the WHERE clause convert it a LEFT OUTER JOIN to an INNER JOIN while adding it to the ON clause KEEPS THE OUTER JOIN? Try this example:
The Results and the Execution plan shows the first query with the criteria in the ON clause stays a LEFT OUTER JOIN while the second query converts to an INNER JOIN. Other than that I agree with Grant, especially the part about avoiding derived tables when you can. +1 - you're absolutely right. I need more coffee today!
Jan 05 '10 at 12:30 PM
Matt Whitfield ♦♦
+1 I need more coffee today too. I wasn't tracking properly.
Jan 05 '10 at 12:33 PM
Grant Fritchey ♦♦
+1 - that was exactly my point with my comment to Grant
Jan 05 '10 at 12:48 PM
Håkan Winther
You can use the second query if you add "OR t.Col1 IS NULL" but that is a much worse mistake from a performance perspective and from readability perspective.
Jan 05 '10 at 12:53 PM
Håkan Winther
Whoa, wait a second, the WHERE clause doesn't convert queries to inner joins, that's a little backward. Instead, the query is filtering based on values from the right table where the matching criteria is incorrect. A better example would be to use 'Z' as the filtering mechanism. Then you'll notice that you're not getting values filtered at all in the join. In other words, the same data comes back if 'Z' is in the ON clause or not.
Jan 05 '10 at 05:41 PM
Grant Fritchey ♦♦
(comments are locked)
|

