Full Version: Compare Phone Number to a large list of Phone Numbers
UtterAccess Forums > Microsoft® Access > Access Forms
michaela4
I have a "Do Not Call" list of phone numbers that has about 2.5 million records. My user will be entering phone numbers into a form, and I would like the user to be alerted if the number they've entered is in the "Do Not Call" table. I am thinking the BeforeUpdate event of the phone number control is the best place, but what code should I use to run the comparison? By the way, the phone number is being entered on a subform.
Thanks
Michael
freakazeud
Hi,
use a DCount or DLookup expression in the Before Update event. You can find something very similar to this in this tutorial!
HTH
Good luck
michaela4
Hey Oliver, thanks for the help. Here's the code I used in the BeforeUpdate of the phone textbox. It searches for a very long time, and even if I enter a number that I know is on the Do Not Call table, it still allows the entry. Can you improve my code any?
Thanks
If Me.Phone = DLookup("PN", "tbl_DNC_Import", "PN=" & Me.Phone) Then
Me.Undo
MsgBox "The Phone number you entered is on the Do Not Call list.", vbCritical, "Royal Alert"
Else
End If
freakazeud
Hi,
try this instead:
Dim stLinkCriteria As String
stLinkCriteria = "[PN]=" & "'" & Me.Phone & "'"
If DCount("PN", "tbl_DNC_Import", stLinkCriteria) > 0 Then
    Cancel = True
    MsgBox "The Phone number you entered is on the Do Not Call list."
End If
You might need to change the stLinkCriteria quotations depending on if your PN field is a number or text type.
HTH
Good luck
michaela4
Thanks Oliver!
SerranoG
Freakazeud's code suggests that he (she? Can't tell from the user name!) suspects your phone numbers are stored as text and not a number. If that's true then

CODE
If Me.Phone = DLookup("[PN]", "tbl_DNC_Import", "[PN] = '" & Me.Phone & "'") Then
    Me.Phone = Null
    MsgBox "The Phone number you entered is on the Do Not Call list.", vbCritical, "Royal Alert"
End If


If it really is a number, try

CODE
If Me.Phone = DLookup("[PN]", "tbl_DNC_Import", "[PN] = " & Me.Phone) Then
    Me.Phone = Null
    MsgBox "The Phone number you entered is on the Do Not Call list.", vbCritical, "Royal Alert"
End If


The Me.Undo will undo the whole record, not just the phone number.
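Putting the pieces of this thread together, a control-level check in the BeforeUpdate event could be sketched like this (the event name assumes the control is called Phone, and the check assumes PN is stored as text, per the quoting discussion above):

```vba
Private Sub Phone_BeforeUpdate(Cancel As Integer)
    If DCount("PN", "tbl_DNC_Import", "[PN] = '" & Me.Phone & "'") > 0 Then
        MsgBox "The Phone number you entered is on the Do Not Call list.", _
               vbCritical, "Royal Alert"
        Cancel = True     ' reject the entry and keep focus in the control
        Me.Phone.Undo     ' undoes only this control, not the whole record
    End If
End Sub
```

Using the control's own Undo method avoids the problem noted here, where Me.Undo discards every edit in the record.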

FYI, I'm not a fan of this:

CODE
stLinkCriteria = "[PN] = " & "'" & Me.Phone & "'"

Why not...?

CODE
stLinkCriteria = "[PN] = '" & Me.Phone & "'"

Less cluttered.



Edited by: SerranoG on Thu Jan 26 11:41:23 EST 2006.
freakazeud
You're welcome.
Glad I could assist.
Good luck on future projects!
dallr
DLookup is supposed to be very slow, especially with 2.5 million records. Shouldn't a recordset be used in this case?
CODE
Dim cnn As ADODB.Connection
Dim rst As ADODB.Recordset
Dim SQL As String

Set cnn = CurrentProject.Connection
Set rst = New ADODB.Recordset
SQL = "SELECT * FROM Table1"    ' your table name, query name, or SQL statement
rst.Open SQL, cnn
With rst
    Do Until .EOF
        If .Fields("YourTableFieldName") = Me.YourControlName.Value Then
            MsgBox "Phone Number " & Me.YourControlName & " already exists"
            Exit Do
        End If
        .MoveNext
    Loop
    .Close
End With
Set rst = Nothing
ScottGem
I agree a DLookup might be slow in this case. The way I would do it is to use a query that is filtered for the entered phone number, then use a DCount:
If DCount("*", "qryDoNotCall") > 0 Then
    MsgBox "Number in Do Not Call list"
    Me.txtNumber = Null
    Me.txtNumber.SetFocus
End If
And I would move it to the After Update event.
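The saved, filtered query described above might be defined along these lines (the form and control names here are hypothetical, not from the thread):

```vba
' Access SQL for a hypothetical saved query named "qryDoNotCall".
' It references the phone number control directly, so the query
' returns rows only for the number just entered, and
' DCount("*", "qryDoNotCall") merely counts that filtered result:
'
'   SELECT PN
'   FROM tbl_DNC_Import
'   WHERE PN = [Forms]![frmCalls]![sfrmEntry].[Form]![txtNumber];
```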
dallr
For my future knowledge I decided to test to see which one was the fastest. I created 1,085,000 records and used the above three methods: DCount, filtered query using DCount, and recordset.
Here are the times in seconds.
Slowest: the recordset, with times of 17, 21 and 15 seconds.
Faster: the filtered query with the DCount had times of 3, 2 and 2 seconds.
Fastest: the DCount had times of 2, 1 and 1 seconds.
And the gold medal goes to .....................
ScottGem
That surprises me. I'm not surprised that using SQL outperforms using a recordset; that is usually my experience, which is why I offered it as an alternative. Generally I will always use SQL instead of a recordset. The only time I will use a recordset is when I need to loop thru each record. But I am surprised that a DCount against the full database outperformed a filtered query, even by such a small margin. I'm curious, did you also try a DLookup, and how did that compare?
dallr
It definitely surprised me too. I only ran the DLookup against all records once and it timed 1 second. Once I saw that I did not use the DLookup with the filter. Everything else was run 3 times using DCount, DCount with filter, and recordset.
Interesting findings!!!!!!!!!!!!
Note to self: it is always good to double check everything you hear. As highlighted by the OPRAH show yesterday.
ScottGem
I'm going to have to do some of my own tests if I can get a large enough table. I've always been told that aggregate functions are slow, but your findings seem to be the opposite.
NoahP
Need more information here. What was the table structure? What are the indexes on the table in question? How, specifically, did you conduct the tests? From VBA, from a query, from an unbound control?
Different situations are going to provide different results. I don't believe your test was comprehensive enough to say Domain Aggregates are faster. I need more proof before I'm going to say 'you're right'. I have not conducted any direct tests before, but in my experience I have always thought SQL SELECT statements to be the fastest option.
Noah
dallr
Well Well Well.
This is proving to be an interesting discussion. Maybe someone should post a thread specific to this topic and see what it unearths (discussions/test results, etc.).
My test:
I just populated one field in a table with the million-plus records, used an unbound control on a form, then went from there. E.g.
CODE
  
Dim sngStart As Single
Dim sngStop As Single
Dim sngResult As Single

sngStart = Timer
If DCount("FieldName", "FilteredQueryName") > 0 Then
    MsgBox "number exists"
End If
sngStop = Timer
sngResult = sngStop - sngStart
MsgBox Format(sngResult, "0.000")

That being said, even if other factors could and will come into play, I just want to know which one is fastest. The times might change under different conditions, but I personally don't think they would.
Nevertheless, I always say what we think or personally feel matters little; it is what the facts state. So can someone run some more tests and see what they show?
schroep
I did some benchmarking, and thought I would add my results:

First, a few notes:

1) I did not test an Access front end against a SQL Server back end. A pass-through query in that situation should be far faster than anything else.
2) This test was done using Access 2000.
3) Obviously, the results are dependent on many factors (depending on the test), including the specs of my machine, my network (if applicable), other traffic/applications running/etc., table design, indexes, and so on.

I created a simple Access MDB with one table. This table had two fields: an AutoNumber PK and a numeric (Long) data field. I added 1 million rows of unique data (basically, 1 million unique numbers) to this table, and indexed the data field as No Duplicates. The piece of data I was looking for was far down in the list, the 891,323rd record.

I then created very simple routines to pull the data using several different methods, inside a loop of 1,000 iterations:

1) DLookup/DCount WITH criteria clause (direct to table)
2) ADO using SQL statement WITH Where clause (direct to table)
3) DAO using SQL statement WITH Where clause (direct to table)
4) DLookup/DCount using an interim query (no criteria clause)
5) ADO using SQL statement through interim query (no Where clause)
6) DAO using SQL statement through interim query (no Where clause)

The structure of these routines was all very similar:

Start my timer object
Begin loop of 1,000 iterations
Get the value (if using DAO/ADO, this included Opening/Closing the recordset)
End loop
Stop timer
Display result
(if using DAO/ADO, set the object to Nothing)

I didn't do any recordset testing using FIND or FINDFIRST, nor any looping through recordsets, as that would seem to be FAR less efficient.

I then ran my tests through several scenarios:
1) Local code and table (in same MDB)
2) Split code and table (on same PC)
3) Split code and table (over network)
4) Local code and table (in same MDB) with INDEX on data field removed

So here are the results, all in milliseconds (again, this is for 1,000 iterations of each):

Local tables, direct to table using criteria in code
DLOOKUP...3000
ADO...6000
DAO...4000

Local tables, using interim query
DLOOKUP...4000
ADO...8000
DAO...4000

Linked tables on same PC, direct to table using criteria in code
DLOOKUP...24000
ADO...27000
DAO...23000

Linked tables on same PC, using interim query
DLOOKUP...26000
ADO...30000
DAO...24000

Linked tables over network, direct to table using criteria in code
DLOOKUP...49000
ADO...53000
DAO...73000

Linked tables over network, using interim query
DLOOKUP...53000
ADO...54000
DAO...73000

When I switched to the fourth scenario, local tables with no indexes, the performance, of course, was so horrible I stopped testing and reset it to do only 10 iterations. Interestingly, all 3 methods (DLOOKUP/ADO/DAO) took exactly the same time -- 25000ms direct to the table and 26000ms using the interim query -- indicating they were all just doing flat table scans.

Note that I also ran the same tests using a DCOUNT/COUNT setup, with almost identical results.

Conclusions? Well, DLOOKUP/DCOUNT certainly doesn't seem that slow or inefficient; in fact, it performed better in almost every scenario in MY PARTICULAR test. Over a network, ADO seems to be more efficient than DAO; otherwise DAO was faster. The interim query did NOT help performance.

Honestly, I am more than a little surprised at the results. Again, of course, there are so many factors, from version of Access to actual table/index structure, size of table/number of records, etc., that could affect this.

Obviously, one of the downsides to using domain functions in a query or on a form is that they run out of process, since they aren't part of the underlying recordset, and thus there is a lag while they execute (this can be hugely problematic in a query, where that lag would occur for every record). However, in the right situation, they certainly seem usable.

Others' mileage may differ -- post if so!
schroep
Dallr-
Just a note that the recordset code lookup you posted is terribly inefficient, as it basically results in a full table scan.
This would be better:
CODE
Dim rst As ADODB.Recordset
Dim SQL As String
Set rst = New ADODB.Recordset
SQL = "select count(*) from table1 where yourtablefieldname=" & Me.YourControlName.Value
rst.Open SQL, CurrentProject.Connection
If rst(0)>0 then MsgBox "Phone Number " & Me.YourControlName & " already exists!"
rst.Close
Set rst = Nothing
schroep
...but, as my tests above would seem to indicate, still not as fast as a DCount!
dallr
Peter let me first thank you for conducting some extensive testing, hats off to you. At the end of the day I was keen to know how the dlookup/Dcount would perform under various conditions and against other methods.
Although my test was not as comprehensive as yours the conclusions remained the same.
The D's can in fact hold their own in the search arena.
PS: Thanks for pointing out the recordset issue, I missed that one.
Dallr
datAdrenaline
This is a copy from a thread in the VIP forum that was regarding the same subject ... It was recommended that I copy it to this thread ... May have some good info ... I have not read the rest of the thread due to time so I apologize in advance if this is duplicate info ...
>>> Begin Copy <<<<
I have an Access FE and SQL Server BE .. the domain aggregate functions are SLOW!! ... compared to using a well designed passthru that returns a single record that values can be extracted from... OR ... creating a connection object to the recordsource and executing an SQL statement directly. Also, the Domain Aggregate functions are not real slick on LINKED tables, even with JET as the BE.
Also .. in that 1996 article, the WAY the recordset is created is inefficient ...if your intent is to find One value/sum, then you should use an appropriately formed SQL statement instead of opening the WHOLE table without a WHERE clause...
For Example:
(VERY Efficient!!! --- Faster than DLookup!)
CODE
Public Function GetAreaName(lngAreaID As Long) As String
    Dim rst As ADODB.Recordset
    Dim strSQL As String
    Set rst = New ADODB.Recordset
    strSQL = "SELECT TOP 1 * FROM tblAreas WHERE AreaID = " & lngAreaID
    rst.Open strSQL, CurrentProject.Connection, adOpenForwardOnly, adLockReadOnly, adCmdText
    If Not rst.EOF Then
        GetAreaName = rst!AreaName
    Else
        GetAreaName = ""
    End If
    rst.Close
    Set rst = Nothing
End Function

(NOT VERY Efficient!!! --- SLOW!! ... This is BASICALLY what DLookup does)
CODE
Public Function GetAreaName(lngAreaID As Long) As String
    Dim rst As ADODB.Recordset
    Dim strSQL As String
    Set rst = New ADODB.Recordset
    strSQL = "SELECT * FROM tblAreas"
    rst.Open strSQL, CurrentProject.Connection, adOpenStatic, adLockReadOnly, adCmdText
    If Not rst.EOF Then
        rst.Find "AreaID = " & lngAreaID
        If Not rst.EOF Then
            GetAreaName = rst!AreaName
        Else
            GetAreaName = ""
        End If
    Else
        GetAreaName = ""
    End If
    rst.Close
    Set rst = Nothing
End Function

The same illustration could be done with a DAO recordset and FindFirst ... the MAIN point I am trying to emphasize is the fact that what you pull into your recordset and how you open it can have a HUGE impact on your efficiency ... I personally put as much as I can in the WHERE clause so the recordset I create is only one record. Also, I have a module of "replacements" for the Domain Aggregates ... They are very similar to the ones I have seen on UA from Trevor Best.
I did some benchmarking a while back ... with VERY different results from the article ... however ... the article basically compares the usage of FindFirst with DLookup, which seem to be basically the same. Where the article fails is in the LACK of use of a properly defined SQL statement with a WHERE clause ... or even an ORDER BY or the use of the SQL aggregate functions (SUM(), AVG(), etc.) ... My synopsis is this: a well-defined SQL statement is VERY difficult to beat for speed!
<<<< End Copy
datAdrenaline
... MS must have done some "work" to the D's because "back in the day" they were WAY slower than custom SQL's and such.... ESPECIALLY on LINKED tables!! ... If I have time this weekend, I may try to do some benchmarking using good 'ole A97 .... I will post my results, if I have any to share.
I personally am glad to see that the D's are "holding their own"!! ... It makes programming easier!
... One more note ... If you have a BE the resides on a SQL server (MySQL, SQL Server, etc.) ... I contend that a passthru Query or a rst generated from a connection object directly to the server will blow the doors off of the D's.
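A pass-through of that sort can be sketched with a temporary DAO QueryDef. Everything below (the ODBC connect string, server, database, and table names, and the hard-coded number) is a hypothetical illustration, not something from this thread:

```vba
Dim qdf As DAO.QueryDef
Dim rst As DAO.Recordset

Set qdf = CurrentDb.CreateQueryDef("")   ' empty name = temporary QueryDef
qdf.Connect = "ODBC;DRIVER={SQL Server};SERVER=MyServer;DATABASE=MyDb;Trusted_Connection=Yes"
qdf.ReturnsRecords = True
qdf.SQL = "SELECT COUNT(*) FROM dbo.tbl_DNC WHERE PN = '5551234567'"

Set rst = qdf.OpenRecordset()
If rst(0) > 0 Then MsgBox "On the Do Not Call list"
rst.Close
Set rst = Nothing
Set qdf = Nothing
```

Because the SQL is sent verbatim to the server, the filtering and counting happen there rather than in JET, which is why a pass-through tends to outrun the domain aggregates against a server back end.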
cheekybuddha
Interesting in Peter's results that ADO performs better than DAO over a network, but horribly on the direct table.
Peter, what was the DAO test? using DbEngine(0)(0) or currentdb?
d
schroep
CurrentDB. And for ADO, CurrentProject.Connection. I'll post some of the sample test code Monday.
Per the A2002 help file, DBEngine(0)(0) is not recommended to be used, so I didn't use it; I was attempting to simulate more of a real-world scenario. I'll have to test it on Monday to see how much of an effect it has.
schroep
OK. I went back and tested using DBEngine(0)(0) instead of CurrentDB; over 1,000 iterations, no matter what the test was or where the files were located, it would shave 2000ms off the result time.
However, I was able to attain the same speed increase by encasing the loop in a WITH CurrentDB ... END WITH block (since by doing that, it only needed to create the "copy" pointer once, rather than 1,000 times). So, properly structured, any advantage of DBEngine(0)(0) can be nullified. Similarly encasing the ADO block in a WITH CurrentProject ... End With saved about 1000ms on average over 1,000 iterations. Should have done that originally, eliminating the pointer creation as a factor. So, reduce my DAO findings across the board by 2,000ms and my ADO by 1,000ms.
Here's some sample code:
DLOOKUP
CODE
  For lngLoop = 1 To 1000
    lngResult = DLookup("PhoneID", "tblPhoneNums", "PhoneNum=891323")
  Next lngLoop

ADO
CODE
  With CurrentProject
    For lngLoop = 1 To 1000
      rst.Open "SELECT PhoneID FROM tblPhoneNums WHERE PhoneNum=891323", .Connection, adOpenForwardOnly, adLockReadOnly
      lngResult = rst(0)
      rst.Close
    Next lngLoop
  End With

DAO
CODE
  With CurrentDb
    For lngLoop = 1 To 1000
      Set rst = .OpenRecordset("SELECT PhoneID FROM tblPhoneNums WHERE PhoneNum=891323", dbOpenForwardOnly, dbReadOnly, dbReadOnly)
      lngResult = rst(0)
      rst.Close
    Next lngLoop
  End With
schroep
Note that after a little more testing, I am not finding that the changes to the ADO code (using the WITH CurrentProject block) had any actual across-the-board effect, so I would stick with the original ADO times. The DAO changes were consistent.
datAdrenaline
Peter ... any chance of posting your "testing" application? ... I would LOVE to play with it a little ... Also, I would like to test it with an A97 database... using the same code would be beneficial in comparing the numbers... Possibly even throw some Passthru's and direct connections to SQL Server.
schroep
Sure, make me clean up my sloppy test app! laugh.gif
Not a problem, see attached. Not much here, just a module to create a 1,000,000 row local table, a class module for timing, and a module with 6 test routines. Obviously, once the table is built, you can easily create a few copies and link appropriately for testing various scenarios.
datAdrenaline
... I like that one signature that says ... "Comment my code? ... nope ... Why do you think they call it 'Code'" ... or something like that.
markmcrobie
Any further test results on this?