question

sari avatar image
sari asked

What is the difference between UNION and UNION ALL?

What is the difference between union and union all in sql server? Is this related to joins(I can specify full join). Thank you all.

sql-server-2005unionunion-all
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Rob Farley avatar image
Rob Farley answered

I find the best way of explaining this is in terms of the execution plan.

If you look at the operator that is used to combine the two sets for a UNION ALL, you'll see it's a Concatenation operator, combining two sets of data without any further consideration of whether rows have already been seen, etc.

For a UNION, you can see a number of operations in play. You may see a Distinct Sort as it makes sure it doesn't return the same row twice. You may see a Full Outer Join, performed as either a Merge Join or a Hash Match. Or if each set involved in the UNION can be seen by the system as having no duplicates and no matches with the other set, you may find that it's performed in exactly the same way as a UNION ALL.

Almost every time you ask yourself what's different between two ways of writing a query, look at the differences in the execution plan - and remember that the differences you see are actually down to other aspects of the data.

Which one is better? Generally UNION ALL, but you need to focus on what your query needs to do and make sure you're asking the system the right question.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

John Franklin avatar image
John Franklin answered

The main difference is UNION ALL will include all rows from the multiple recordsets being UNION'd together... Including any duplicates. BOL has a full description at http://msdn.microsoft.com/en-us/library/ms180026.aspx

1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Jeff Moden avatar image Jeff Moden commented ·
Nice and simple is good. +1
0 Likes 0 ·
Queue avatar image
Queue answered

We use union and union all to combine result sets which return the same number of columns with same data types. And no, it is not same as full outer join.

Lets take an example with one column.

Table1 
Col1
1
2
3
4

Table2
Col2
3
4
5
6

Now, when I write

SELECT Col1 FROM Table1
UNION 
SELECT Col2 FROM Table2

That would give me 1,2,3,4,5,6 - So it has removed the duplicates.

If I do

    SELECT Col1 FROM Table1
    UNION ALL
    SELECT Col2 FROM Table2

It would return me 1,2,3,4,3,4,5,6 - It has got the duplicates with it.

Hope this help.

3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

TimothyAWiseman avatar image TimothyAWiseman commented ·
Remove the duplicates incurs overhead, so union all is generally more efficient.
0 Likes 0 ·
SQL Kiwi avatar image SQL Kiwi commented ·
I think it is important to mention that UNION removes duplicates from each *input* as well as from the output - so if Table1 contained ((1),(1),(1)) and Table2 contained ((2),(4),(4)) the output would be ((1),(2),(4)).
0 Likes 0 ·
sari avatar image sari commented ·
hi Queue,can you give me simple example such that both columns of the table are varchar datatype(string)
0 Likes 0 ·
Scot Hauder avatar image
Scot Hauder answered

UNION is for joining 2 result sets, JOIN is for joining 2 tables, views, table-valued functions, table expressions and the like.

UNION ALL simply joins 2 result sets including all the rows. UNION joins 2 result sets and removes duplicate rows. UNION by itself incurs the overhead of a merge join so that it can remove these dupes. It also has a side affect of ordering the rows. If you do not care about dupes UNION ALL is more performant

4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

TimothyAWiseman avatar image TimothyAWiseman commented ·
I think you have a typo. UNION removes duplicates. UNION ALL does *not* remove duplicates.
0 Likes 0 ·
Scot Hauder avatar image Scot Hauder commented ·
exactly, thx
0 Likes 0 ·
CirqueDeSQLeil avatar image CirqueDeSQLeil commented ·
I think a key element is the final sentence - if you don't care about dups, then union all. Otherwise use the union.
0 Likes 0 ·
sari avatar image sari commented ·
Thank you all for nice explanation.can anybody give me simple example on this.consider both columns of the tables are of varchar datatype.in this case i want to how dupilcates removed in string concatenation.
0 Likes 0 ·
Matt Whitfield avatar image
Matt Whitfield answered

Queue's answer covers the difference between UNION and UNION ALL well.

You should also be aware of EXCEPT and INTERSECT which are two further horizontal concatenation operators.

In terms of the difference between JOIN and UNION - think of one in terms of vertical concatenation, and think of the other in terms of horizontal concatenation. Each of them can return less rows than were present in the source (with the exception that join can repeat on the one side when there are multiples on the other), and each of them take the dimensions of your result set and extend them.

That's how I usually explain them, anyhow.

1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

sari avatar image sari commented ·
Thank you sir.helpful to me a lot.
0 Likes 0 ·
SQL Kiwi avatar image
SQL Kiwi answered

To illustrate the many plan possibilities for UNION and UNION ALL:

-- UNION implemented using a Concatenation  
SELECT data FROM (VALUES (1), (2)) A (data)  
UNION  
SELECT data FROM (VALUES (3), (4)) B (data);

CREATE TABLE #A (data INT NOT NULL);  
CREATE TABLE #B (data INT NOT NULL);

INSERT  #A VALUES (1),(1), (3), (2), (2);  
INSERT  #B VALUES (1),(2), (2), (1), (4);

-- Concatenate then Distinct Sort on the result
SELECT data FROM #A  
UNION  
SELECT data FROM #B OPTION (CONCAT UNION, ORDER GROUP);

-- Concatenate then Hash Distinct on the result  
SELECT data FROM #A  
UNION  
SELECT data FROM #B OPTION (CONCAT UNION, HASH GROUP);

-- Sort Distinct on the inputs, then Merge Union
SELECT data FROM #A 
UNION 
SELECT data FROM #B OPTION (MERGE UNION, ORDER GROUP);

-- Hash Distinct and Sort on the inputs to a Merge Union
SELECT data FROM #A 
UNION 
SELECT data FROM #B OPTION (MERGE UNION, HASH GROUP);

-- Sort Distinct on the probe input to a Hash Union
SELECT data FROM #A 
UNION 
SELECT data FROM #B OPTION (HASH UNION, ORDER GROUP);

-- Hash Distinct on the probe input to a Hash Union
SELECT data FROM #A 
UNION 
SELECT data FROM #B OPTION (HASH UNION, HASH GROUP);

-- UNION ALL performed by concatenation
SELECT data FROM #A 
UNION ALL 
SELECT data FROM #B;

-- Add indexes to provide sorted input (optimiser now has more options)
CREATE INDEX nc1 ON #A (data);
CREATE INDEX nc1 ON #B (data);

-- UNION ALL performed by Merge Concatenation
SELECT data FROM #A 
UNION ALL 
SELECT data FROM #B ORDER BY data;

-- Stream Aggregate Distinct on the inputs to a Merge Union
SELECT data FROM #A 
UNION 
SELECT data FROM #B;

-- Stream Aggregate Distinct on the inputs to a Merge Concatenation, then another Stream Aggregate Distinct
SELECT data FROM #A 
UNION 
SELECT data FROM #B OPTION (CONCAT UNION, ORDER GROUP);

DROP TABLE #A, #B;
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

SQL Kiwi avatar image SQL Kiwi commented ·
Thanks for fixing the formatting Fatherjack - though I am not sure exactly what you did, or what I should be doing differently...it looks the same to me now as before. Help?
0 Likes 0 ·
David 1 avatar image
David 1 answered

SQL is not a truly relational language and one of its non-relational features is that SQL tables and queries can contain duplicate rows. The need to support duplicates complicates the language and its implementations very significantly. This is arguably SQL's most serious flaw (at least that's what Ted Codd thought).

SQL supports ALL and DISTINCT versions of some operations. The ALL keyword means duplicate rows are included while the DISTINCT keyword eliminates duplicates. To make matters worse, although both keywords are optional the behaviour when neither one is specified is not consistent. In the SELECT clause ALL is the default but with the UNION operator DISTINCT is the default.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Nadir.Ramani avatar image
Nadir.Ramani answered
Union return unique result set, whereas Union all simply merge data from different source without performing unique on the result set... you can get more info [here][1] [1]: http://knowitbasic.blogspot.com/2013/11/difference-between-union-and-union-all.html
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Hussainpatel avatar image
Hussainpatel answered
The difference between UNION and UNION ALL is UNION ALL ---- will take all the records between two queries without elimenating the duplicate records UNION------ will retrieve the records with eliminating the duplicates
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.