question

adam9367 avatar image
adam9367 asked

How can I replace a nested select with a join

I have a view that joins a table of user information with another table that hold photos for that user. Currently, I'm using a nested select to get that information but I would like to replace it with a join but I'm not sure how to do that. Any help would be greatly appreciated. Here is a snippet of what my current code looks like: SELECT A.UserID, A.UserName, (SELECT TOP 1 P.PhotoSort FROM UserPhotos P WHERE A.UserID = P.UserID AND PhotoSort >= 1 ORDER BY P.PhotoSort ASC) AS 'PhotoSort', (SELECT TOP 1 P.PhotoPath FROM UserPhotos P WHERE A.UserID = P.UserID AND PhotoSort >= 1 ORDER BY P.PhotoSort ASC) AS 'PhotoPath' FROM Users a
t-sqlselectjoins
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
Oleg avatar image
Oleg answered
<\!-- Begin Additional information I had to modify the queries somewhat once all the details were sorted out: - There are records in Users table which do not have any UserPhotos - It is possible for a single Users record to have multiple UserPhotos - The UserPhotos records of interest are those which have the SortOrder >= 1 - If the user has multiple records with sort >= 1, i.e. 1, 2, 3, 4 then the record with lowest positive sort should be picked This means that cross apply should be changed to outer apply and inner join to left join if the records from the Users which don't have any UserPhotos need to be included in the results. I tested both apply and join like this: - Created Users table clustered by UserID int and inserted 50,000 records there. - Created UserPhotos table clustered by UserID and made UserID a foreign key referencing respective UserID column of the Users table. This is a neat trick I learned from Grant Fritchey who said that such design is acceptable provided that the column values of the child table have a decent cardinality. - Inserted 5 UserPhotos records for each existing UserID totalling 250,000 records in the UserPhotos table - Deleted few UserPhotos records so I would have some Users records without any photos and some Users records with smallest SortOrder greater than 1. This allowed me to add some test figures and confirm that left join version beats the outer apply pretty handily as I suspected. The restated queries are below. End Additional Information \--> -- using cross apply select A.UserID, A.UserName, P.PhotoSort, P.PhotoPath from Users A outer apply ( select top 1 PhotoSort, PhotoPath from UserPhotos where UserID = A.UserID and PhotoSort >= 1 order by PhotoSort ) P; --using the join ;with records (UserID, UserName, num, PhotoSort, PhotoPath) as ( select A.UserID, A.UserName, row_number() over ( partition by A.UserID order by P.PhotoSort) num, P.PhotoSort, P.PhotoPath from Users A left join UserPhotos P on A.UserID = P.UserID and P.PhotoSort >= 1 ) select UserID, UserName, PhotoSort, PhotoPath from records where num = 1; You can use either query to get your results, but left join in my test happens to run considerably faster: Cross Apply results: execution time to select 50,000 records - 1.4 seconds. Stats: Table 'UserPhotos'. Scan count 50000, logical reads 151922, physical reads 203, read-ahead reads 0 Table 'Users'. Scan count 1, logical reads 441, physical reads 1, read-ahead reads 433 Left join results: execution time to select 50,000 records - 600 milliseconds. Stats: Table 'Users'. Scan count 3, logical reads 1318, physical reads 1, read-ahead reads 433 Table 'UserPhotos'. Scan count 3, logical reads 1772, physical reads 5, read-ahead reads 1582 Oleg
11 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Mark avatar image Mark commented ·
I had an attempt at an answer here, but deleted it since yours is much better Oleg.
1 Like 1 ·
Oleg avatar image Oleg commented ·
@adam9367 Do my queries work? They should if you have SQL Server 2005 or better with native compat 90 for the first and even smaller for the second query to run. If performance is a problem, please let me know a bit about your UserPhotos. How many records does it typically hold per single user? Does every user has a record in the UserPhotos? Do these records get physically deleted? How does the PhotoSort work? The reason I am asking is simply because if your UserPhotos table has multiple records for a single user, but the PhotoSort begins from number one and this is the record you prefer to select then the query in my answer can be restated with a join just like in Mark's answer except with one more join condition of PhotoSort = 1. Please let me know.
1 Like 1 ·
adam9367 avatar image adam9367 commented ·
@Oleg, your query works just fine. The users can upload many photos. I typically want the top photo regardless of the sort number. There are situations when the #1 photo could be missing while the #2 is intact. I'm new to sql and I was told that using nested selects in queries were not as efficient as joins so I was hunting for a more efficient solution. I could always add the photo fields to the user table but then if I want to add a spot for an additional photo I would have to change the schema, which I was hoping to avoid. I’m using a view so I don’t have to do multiple individual queries from my web app to get the top photo.
1 Like 1 ·
Oleg avatar image Oleg commented ·
@adam9367 What is top photo? If there are
situations when the #1 photo could be missing while the #2 is intact
then P.PhotoSort >= 1 predicate is not good, there has to be something else. You are absolutely correct that placing the PhotoPath into Users table is a bad idea. The real question is what does PhotoSort mean? Currently, if photo #1 could be missing while photo #2 is intact, the query will still return photo #1 unless I don't have a correct understanding of PhotoSort. Could you please let me know? As far as the queries are concerned, I am almost positive that the second query (with join) should be a better performer when compared to the first one (with cross apply), but it really depends on how your data is indexed and number of UserPhotos records per user. YOu can test both and pick the best.
1 Like 1 ·
Oleg avatar image Oleg commented ·
@Mark, I saw your answer and I think that you should not have deleted it because if my assumption is wrong and there is only one record in UserPhotos per user then your answer would be the best solution because in this case none of the nonsense in my queries would be necessary. If there are multiple UserPhotos records per user then there is no way out, but to have some logic to reduce the results to only one record per user, like I did.
0 Likes 0 ·
adam9367 avatar image adam9367 commented ·
Hi Mark and Oleg, thank you both for your post. Oleg, you are correct with your assumption. There are multiple photos per user. I was really hoping I wouldn't need to do a nested select because my user table holds over 30,000 users. Is there a more efficient way to structure my tables to get away from this problem?
0 Likes 0 ·
adam9367 avatar image adam9367 commented ·
@Oleg, we also use the photo table to hold non-picture images. For example, a photo sort of -1 would hold a image of a certificate, -2 would be other images of documentation and so on. I could always seperate that info into another table if using "top" is a better solution than ">= 1"
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@adam9367 You don't have to. Please let me know a couple of scenarios, so I can figure whether it is possible to restate the second query (the one using join). - Do all users have images of interest (beginning with sort = 1)? Y / N - Is it possible for a user to have no record with sort = 1 but have record(s) with sort > 1? Y / N - If the user has multiple records with sort > 1, i.e. 1, 2, 3, 4 then which one should be picked by the select statement?
0 Likes 0 ·
adam9367 avatar image adam9367 commented ·
@Oleg • Do all users have images of interest (beginning with sort = 1)? No • Is it possible for a user to have no record with sort = 1 but have record(s) with sort > 1? Yes • If the user has multiple records with sort > 1, i.e. 1, 2, 3, 4 then which one should be picked by the select statement? Always the lowest positive sort.
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@adam9367 Thank you. I had enough information to tweak the queries. I edited the answer and added some test results info. It appears that the join version works better.
0 Likes 0 ·
adam9367 avatar image adam9367 commented ·
@Oleg, thank you for taking the time to answer my question. I'm amazed by the thoroughness of your response. Do you do consulting?
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.