Hi, if I run below query, I get 901 records: select * from Table T1 where COLUMN C1 in (select COLUMN C1 FROM Table T2 where Column C2 = 0) but if I run another query as below, I get 900 records: SELECT T1.* FROM Table T1 INNER JOIN Table T2 ON T2.Column C1 = T1.Column C1 WHERE T2.Column C2 = 0; I also tried to find out if there are any duplicates in T1 but there is none. Can someone please help me to get why the result is different. Thanks,
Your first query filters based on every C1 value in T2 where C2=0. Your second query has the additional restriction that UN_Node must match between the 2 tables. Check for a row in T2 where C2=0, but there is no matching T1 record based on UN_Node. WITH MatchingC1 AS ( SELECT T1.C1 FROM T1 INNER JOIN T2 ON T2.UN_Node = T1.UN_Node WHERE T2.C2 = 0 ) SELECT C1 FROM T2 WHERE C2 = 0 AND NOT EXISTS (SELECT 1 FROM MatchingC1 T1 WHERE T1.C1 = T2.C1) ; --- edit after additional information provided via comments ---
I would have expected the opposite of what you said happened. Duplicates in T2 would have resulted in duplicates in the INNER JOIN result, but that would not have happened with the IN approach. Therefore, the JOIN result would be greater than the IN result, but you said the opposite occurred. As a side note, since T2 is much larger than T1, the IN approach could be much less efficient (depending on indexes and other filters) as it requires a list of every entry in T2 to provide in that IN clause. With the JOIN approach, the engine could decide it's more efficient to first get the T1 data and then just check T2 records where the column values match.