UtterAccess.com
X   Site Message
(Message will auto close in 2 seconds)

Welcome to UtterAccess! Please ( Login   or   Register )

Custom Search
 
   Reply to this topicStart new topic
> Capacity Planning: Estimating Accdb Size, Access 2010    
 
   
FrankRuperto
post Apr 8 2020, 12:45 AM
Post#1



Posts: 969
Joined: 21-September 14
From: Tampa, Florida USA


I have a voter registration file for the entire State of Florida USA.
Its a public domain ~1.48gig csv file that contains 8,684,949 records.
When I imported the csv into Access, the accdb size grew to ~1.72gig.
The text fields in the csv file did not have any padded trailing spaces.
So I'm wondering why there's 204 extra megs in the accdb?
Is it the index on the primary key, and extra internal overhead?
I did a Compress and the size remained the same.
Is there an equation for estimating accdb size based on the size of an imported csv file?
I am planning on creating lookup tables for the City, State, ZipCode, Gender, Party and Status fields so I can just store the numeric lookup table ID values in number fields instead of text fields to save space.
I will have to do some data scrubbing, using a ZipCode lookup table, to correct mispelled City names so they match the city lookup table values.
This post has been edited by FrankRuperto: Apr 8 2020, 01:02 AM
Attached File(s)
Attached File  tblVoter.PNG ( 33.53K )Number of downloads: 8
 

--------------------
Currently supporting pawnbrokers that use my store management system developed with Access 2010 on Windows7. Experienced with Informix, Oracle & PostgreSQL db's.
Go to the top of the page
 
vtd
post Apr 8 2020, 01:28 AM
Post#2


Retired Moderator
Posts: 19,777
Joined: 14-July 05
From: Sydney NSW Australia


>>Is it the index on the primary key, and extra internal overhead?<<

That sounds about right... Indices do take up a fair bit of space, espcially for Tables with a large number of records.

See UtterAccess Thread: Determine Table Size.

Have you tried using SQL Server or SQL Server Express to store your data as an SQL Server database? This give you a bit more room than the 2 GB limit of Access.

Go to the top of the page
 
FrankRuperto
post Apr 8 2020, 06:57 AM
Post#3



Posts: 969
Joined: 21-September 14
From: Tampa, Florida USA


Figured it was the index, hidden system tables, etc.
I currently have this data stored in PostgreSQL which has no table/db size limitations and can do fancy things like partitioning a table into multiple logical tables based on expression, e.g. partition all inactive voters.
I loaded this data into Access just to see how it performs compared to PostgreSQL and other db servers like Oracle and Informix.

--------------------
Currently supporting pawnbrokers that use my store management system developed with Access 2010 on Windows7. Experienced with Informix, Oracle & PostgreSQL db's.
Go to the top of the page
 
isladogs
post Apr 8 2020, 08:33 AM
Post#4


UtterAccess VIP
Posts: 2,309
Joined: 4-June 18
From: Somerset, UK


I had a similar situation when importing a 1.5GB CSV file of almost 3 million UK Postcodes to ACCDB. The file size increased to around 1.8GB almost all of which was in one table.
Although its a reference datafile and so never edited by end users, I was concerned enough to try and reduce its size.
I managed to reduce it by over 210MB.
I did remove some indices but only as a last resort as I wanted to ensure searches were fast.
Most of the reduction was by ensuring the datatypes used were as small as possible for each of the 50+ fields
E.g. For numbers I used byte instead of integer, single instead of double etc where feasible.
For text files, I made the field size as small as I could so had field sizes of e.g. 43 rather than 50.
Every little helped!

BTW the system tables take up little space . Usually not much above 500 kB (0.5 MB) which is the size of a brand new blank database containing only system tables


--------------------
Colin (Mendip Data Systems)
Website, email
Go to the top of the page
 
FrankRuperto
post Apr 8 2020, 09:23 AM
Post#5



Posts: 969
Joined: 21-September 14
From: Tampa, Florida USA


Hi Colin,

QUOTE (isladogs)
For text files, I made the field size as small as I could so had field sizes of e.g. 43 rather than 50.

Access uses variable length text fields, so does it make a difference if I reduce the text field size?
The average length of the names and addresses is about 15 chars each.

Florida has 412 municipalities so a City lookup table should significantly save space by only storing the lookup table CityID value in the voter table. The Party and Status values are 3 chars and the State values are all 2 chars, so I can also use a lookup table for those to save a little more space. I do need to break out the house number from the StreetAddress field into a new HouseNumber field so I can properly sort by StreetName and HouseNumber fields.

I will also have to include a State field in the City lookup table, as this db will receive voter registration feeds from other U.S. States. The purpose of this db is to identify if a same person is registered to vote in more than one State. If I put all records of registered voters from different States into one voter table, then I will have to use a PostgreSQL backend and use self join queries matching on as many fields possible to pinpoint duplicate voters.
This post has been edited by FrankRuperto: Apr 8 2020, 09:40 AM

--------------------
Currently supporting pawnbrokers that use my store management system developed with Access 2010 on Windows7. Experienced with Informix, Oracle & PostgreSQL db's.
Go to the top of the page
 
FrankRuperto
post Apr 10 2020, 12:46 AM
Post#6



Posts: 969
Joined: 21-September 14
From: Tampa, Florida USA


Is this an accurate list of 32-bit accdb supported datatypes and the space each occupies?

BOOL = 0x01 /Boolean ( 1 bit )
BYTE = 0x02 /Byte ( 8 bits), n= -255..255
INT = 0x03 /Integer (16 bits), n= -32767..32767
LONGINT = 0x04 /Long Integer (32 bits), n= -4,294,967,295..4,294,967,295
MONEY = 0x05 /Currency (64 bits)
FLOAT = 0x06 /Single (32 bits)
DOUBLE = 0x07 /Double (64 bits)
DATETIME = 0x08 /Date/Time (64 bits)
BINARY = 0x09 /Binary (0..255 bytes)
TEXT = 0x0A /Text (0..255 bytes)
OLE = 0x0B /OLE = Long binary
MEMO = 0x0C /Memo = Long text (0..64K bytes)
REPID = 0x0F /GUID
NUMERIC = 0x10 / Scaled decimal (17 bytes)
This post has been edited by FrankRuperto: Apr 10 2020, 12:57 AM

--------------------
Currently supporting pawnbrokers that use my store management system developed with Access 2010 on Windows7. Experienced with Informix, Oracle & PostgreSQL db's.
Go to the top of the page
 
vtd
post Apr 10 2020, 02:17 AM
Post#7


Retired Moderator
Posts: 19,777
Joined: 14-July 05
From: Sydney NSW Australia


Some of the limits posted are incorrect, e.g. BYTE goes from 0 to 255, INT from -32768 to 32767, etc...)

For more details, see Microsoft article: Introduction to data types and field properties.

The article doesn't mention about the limit of Memo Field but IIRC, Memo Field can be up to 2 GB in length (but it is an impossible limit, of course) and stored separately from the Table where the Memo Field is used. In the "host" Table, Memo Field will use 16 bytes as the pointer to the the Memo value.


Go to the top of the page
 
isladogs
post Apr 10 2020, 02:40 AM
Post#8


UtterAccess VIP
Posts: 2,309
Joined: 4-June 18
From: Somerset, UK


Also look at Access specifications

For info, the number of characters in memo/long text fields can be up to
QUOTE
65,535 when entering data through the user interface; 1 gigabyte of character storage when entering data programmatically

--------------------
Colin (Mendip Data Systems)
Website, email
Go to the top of the page
 
FrankRuperto
post Apr 13 2020, 09:01 AM
Post#9



Posts: 969
Joined: 21-September 14
From: Tampa, Florida USA


Thanks Colin, didn't know you could store up to 1GB chars in long text fields using vba. Can users scroll through a long text field in a form to view more than 64K of text?

I am also missing BIGINT/LongLongInt datatype from the above list and assuming it needs 8 bytes of storage, but I don't think the ACE engine natively supports these and only recognizes them when linking to db servers like SQL-Server.

--------------------
Currently supporting pawnbrokers that use my store management system developed with Access 2010 on Windows7. Experienced with Informix, Oracle & PostgreSQL db's.
Go to the top of the page
 
isladogs
post Apr 13 2020, 11:28 AM
Post#10


UtterAccess VIP
Posts: 2,309
Joined: 4-June 18
From: Somerset, UK


Hi Frank
Yes you can scroll a textbox containing a long text field. For example,

Attached File  Capture.PNG ( 52.23K )Number of downloads: 3


My commercial JSON app uses this feature to view the contents of JSON files some of which are very large and can contain several hundred million characters

--------------------
Colin (Mendip Data Systems)
Website, email
Go to the top of the page
 


Custom Search


RSSSearch   Top   Lo-Fi    25th May 2020 - 10:37 PM