UtterAccess HomeUtterAccess Wiki

Welcome Guest ( Log In | Register )

Custom Search
Edit Discussion
> Frustrated Join    
Frustrated Join

A common scenario that frequently crops up is a variant of the question, "Show me only records for where there is no match in other table."

The most intuitive solution is to make use of NOT IN such as this:

CODE
SELECT ...
FROM aListOfRecords
WHERE ID NOT IN (
 SELECT ID
 FROM theOtherTable
);

For small tables this may be adequate but it turns out that it get slower and slower as the numbers of records for either table get bigger and bigger. The reason for this is because Access cannot optimize on NOT IN() as well.

Arguably most common approach that is also likely to be most optimal is to make use of "Frustrated Join". Conceptually, we join two tables together using outer join. By default, an outer join will include any matching entries *and* non matches. We frustrate the join by excluding the matches. Here is a basic template:

CODE
SELECT ...
FROM aListOfRecords
LEFT JOIN theOtherTable
ON aListOfRecords.ID = theOtherTable.ID
WHERE theOtherTable.ID IS NULL;

The criteria "ID IS NULL" is the magic that we need to ensure we obtain only records that do not exist in the other table.

In some corner cases, we may find that we can get better optimizations by using this variant:

CODE
SELECT ...
FROM aListOfRecords
WHERE NOT EXISTS (
  SELECT NULL
  FROM theOtherTable
  WHERE theOtherTable.ID = aListOfRecords.ID
);

NOT EXISTS tends to perform better than NOT IN because the evaluation is done once compared to NOT IN() requiring two evaluations; running a subquery to gather all list of IDs, then going through the list of IDs to determine if a given ID is in fact present in other table or not. Like NOT IN(), Access cannot fully optimize the criteria but in few corner cases where nonmatches is few or nonexistent, this may actually perform faster than a frustrated join.

Experimentation is necessary to determine which variant will perform better; note that the dataset itself influence the performance meaning that if the number of non matches as a relative percentage to total numbers of potential matches changes, it may shift the favor to other variant.

Edit Discussion
Custom Search
Thank you for your support!
This page has been accessed 5,034 times.  This page was last modified 01:11, 10 February 2012 by Jack Leach. Contributions by BananaRepublic  Disclaimers