I find the best way of explaining this is in terms of the execution plan.
If you look at the operator that is used to combine the two sets for a UNION ALL, you'll see it's a Concatenation operator, combining two sets of data without any further consideration of whether rows have already been seen, etc.
For a UNION, you can see a number of operations in play. You may see a Distinct Sort as it makes sure it doesn't return the same row twice. You may see a Full Outer Join, performed as either a Merge Join or a Hash Match. Or if each set involved in the UNION can be seen by the system as having no duplicates and no matches with the other set, you may find that it's performed in exactly the same way as a UNION ALL.
Almost every time you ask yourself what's different between two ways of writing a query, look at the differences in the execution plan - and remember that the differences you see are actually down to other aspects of the data.
Which one is better? Generally UNION ALL, but you need to focus on what your query needs to do and make sure you're asking the system the right question.
answered May 02, 2010 at 12:48 AM
UNION is for joining 2 result sets, JOIN is for joining 2 tables, views, table-valued functions, table expressions and the like.
UNION ALL simply joins 2 result sets including all the rows. UNION joins 2 result sets and removes duplicate rows. UNION by itself incurs the overhead of a merge join so that it can remove these dupes. It also has a side affect of ordering the rows. If you do not care about dupes UNION ALL is more performant
Queue's answer covers the difference between UNION and UNION ALL well.
You should also be aware of EXCEPT and INTERSECT which are two further horizontal concatenation operators.
In terms of the difference between JOIN and UNION - think of one in terms of vertical concatenation, and think of the other in terms of horizontal concatenation. Each of them can return less rows than were present in the source (with the exception that join can repeat on the one side when there are multiples on the other), and each of them take the dimensions of your result set and extend them.
That's how I usually explain them, anyhow.
answered May 01, 2010 at 05:23 PM
Matt Whitfield ♦♦
The main difference is UNION ALL will include all rows from the multiple recordsets being UNION'd together... Including any duplicates. BOL has a full description at http://msdn.microsoft.com/en-us/library/ms180026.aspx
answered May 01, 2010 at 04:18 PM
We use union and union all to combine result sets which return the same number of columns with same data types. And no, it is not same as full outer join.
Lets take an example with one column.
Now, when I write
That would give me 1,2,3,4,5,6 - So it has removed the duplicates.
If I do
It would return me 1,2,3,4,3,4,5,6 - It has got the duplicates with it.
Hope this help.
answered May 01, 2010 at 04:21 PM