The question is, what is the difference between having a filter on the ON clause, or in the WHERE clause?
Consider the following queries:
USE TempDB GO CREATE TABLE A (i int, name varchar(20)) INSERT INTO A (i, name) VALUES (1, 'Jack'), (2, 'Ryan') , (3, 'Simon'), (4, 'Sandra'), (5, 'Daryl') GO CREATE TABLE B (i int, amount decimal(9,3)) INSERT INTO B (i, amount) VALUES (1, 100.25), (2, 10000) , (1, 0.45), (4, 234.23) -- WHERE SELECT * FROM A INNER JOIN B ON A.i = B.i WHERE A.Name NOT LIKE 'S%' -- ON SELECT * FROM A INNER JOIN B ON A.i = B.i AND A.Name NOT LIKE 'S%'
Consider these two queries are the bottom. The first uses a “A.Name NOT LIKE ‘S%’†filter in the WHERE clause, whereas the second uses the same filter in the ON clause. The results are identical:
Where this gets interesting is when we use an OUTER JOIN. For example, consider the following two queries, which are identical to the previous two, but using a LEFT OUTER JOIN:
-- Query A - WHERE filter SELECT * FROM A LEFT JOIN B ON A.i = B.i WHERE A.Name NOT LIKE 'S%' -- Query B - ON filter SELECT * FROM A LEFT JOIN B ON A.i = B.i AND A.Name NOT LIKE 'S%'
And the results:
What’s happened here? The only difference was the placement of the A.Name clause, but the second query (the ON) has added Simon and Sandra’s rows.
What’s happened is due to the ordering of the logical query processing steps. Logically, all JOINs are performed as INNER JOINs using the ON filters, and then, as a subsequent step, any OUTER JOIN rows are added back to the necessary side. After all JOINs are complete is the WHERE filter processed.
This means that in Query A (WHERE), the inner join between A and B was completed, then rows 3 (Simon), and 5 (Daryl) were added back in. Then the WHERE was applied, and Simon and Sandra were removed as their name begins with S.
In Query B (ON), A and B were joined together, but any rows in A with a name LIKE ‘S%’ were not joined, as the ON filter ignored them during the initial JOIN phase. The OUTER JOIN phase then added in all rows from the LEFT that weren’t joined, and so Simon and Sandra were added back in.
Is this a problem? I don’t believe it’s much of an issue, as I tend to think of the WHERE clauses as being distinct from the ON clauses. Just remember to keep your join filters (ON) and your row excluders (WHERE) separate, and understand the difference when necessary.
If you ran the sample code, don’t forget to clean up:
DROP TABLE A DROP TABLE B
[...] [...]
Thanks for the great and simple explanation.
Much appreciated!
Clear and simple explanation between the difference.
Great work. I read a couple of threads in stackoverflow but could not get clearly get the difference.
The key here was mentioning that outer joins are performed after inners.
Thank you so much for the clarification!
The clearest explanation on google!