The ANTI JOIN – all values from table1 where not in table2
One of the less intuitive concepts I come across regularly in SQL is that of the ANTI JOIN. That is, requesting data from a table where some value is not in another table.
The wrong way of doing it is this:
SELECT * FROM table1 t1 WHERE t1.id NOT IN (SELECT id FROM table2)
This is very slow and inefficient, because the RDBMS has to compare each row with an entire table of values. The following is much much faster, as it matches up each row once, then simply eliminates the results that are not in the second table:
SELECT * FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.id WHERE t2.id IS NULL
This is what is known as an ANTI JOIN.. And I really wish it were part of standard SQL.. Be careful, though,if table2.id has any NULL rows, you will experience unexpected results. Here’s my proposal, and the following should throw a fatal error if table2.id is not a NOT NULL column:
SELECT * FROM table1 t1 ANTI JOIN table2 t2 ON t1.id = t2.id
Too bad I don’t make the rules.