My Assistant
![]() ![]() |
|
|
Apr 27 2012, 11:05 AM
Post
#1
|
|
|
UtterAccess Enthusiast Posts: 62 |
I'm in the process of construcing a somewhat large db and have run into issues while running code. Basically, as the code runs, it's bloating my db so much that it exceeds the 2GB limit and crashes. So, I've split the db into a FE and BE. I'm still experiencing the issue of the BE file bloating to the point of corruption. At it's core, what I'm trying to do is; query against a population of roughly 600K records and identify "exceptions". "Happy Path" would be to be able to have exceptions and non-exceptions reside in the same table so that I can do summary and comparison reporting, but I don't think that's going to work because of the size. So, I was thinking that I would take the exceptions and create a table just for them. Once I append the exceptions from tblA to tblB, I could delete the records from tblA. I believe I'll still have the bloating issue. So I was also wondering if I should (if it's possible?) build a couple of compact and repair points w/in the process.
I'm hoping that all of this makes sense. |
|
|
|
Apr 27 2012, 11:25 AM
Post
#2
|
|
|
UtterAccess VIP Posts: 4,599 From: From Hawaii - Now in Wisconsin...Am I Nuts? |
Could you segregate the fields into separate tables, so it's normalized? This would remove a great deal of your data and reduce the size of the database. You would still have the same results, it would just be normalized.
|
|
|
|
Apr 27 2012, 11:30 AM
Post
#3
|
|
|
UtterAccess VIP Posts: 4,236 From: Dayton, OH |
Adding another table with the same structure is not the way to go. I have to ask, what in your code is causing the bloating? Can you provide the code you are using? Is your database properly structured according to normalization rules? Can you provide the structure?
|
|
|
|
Apr 27 2012, 11:32 AM
Post
#4
|
|
|
UtterAccess VIP Posts: 4,209 From: Columbus, OH USA |
what type of data are in the 600k records? That would be my first question. I have several databases with close to a million records that don't approach .5GB let alone the 2GB limit. If you have any OLE object fields maybe consider if the new attachment data type would work better as that is suppose to help with bloat.
Normalizing data may also help if possible. If this idea, you could have the raw data in 1 BE and move it to normalized tables in a 2nd BE (if bloat is still an issue). This is a wild shot in the dark but something I have done myself when dealing with data imports and needing to break it down. Hope this helps, |
|
|
|
Apr 27 2012, 11:34 AM
Post
#5
|
|
|
UtterAccess VIP Posts: 7,589 From: South coast, England |
Hi reberryjr
What data are you storing? 600K records isn't overly huge. As Dan suggests your db size will improve with normalising your db, but are you storing attachements, or do you have any images in forms or reports? If so you could reduce your file size considerably by keeping those images and attachments as files in a separate folders and loading them when required. hth |
|
|
|
Apr 27 2012, 11:49 AM
Post
#6
|
|
|
UA Admin Posts: 19,241 From: Newcastle, WA |
Bloating happens when you add and delete records. If your workflow is based on adding large numbers of records to one table and deleting them from the other, your database will bloat fast.
As has already been suggested, proper normalization is a big element in getting this under control. |
|
|
|
Apr 27 2012, 12:10 PM
Post
#7
|
|
|
UtterAccess Enthusiast Posts: 62 |
Thanks for all of the replies. First let me state that I'm not a guru, so I'm not certain that I TRULY understand the concept of normalization. The data in the db is strictly text, number and date fields. No images, links, etc. Unfortunately, I can't really include the structure of the db (at least the way I define the term "structure"), so I'll do my best at describing it.
The primary key is a loan number. Each loan number as a myriad of attributes in the Main table. All in all, there are 144 columns (Why do I get the funny feeling that this is where you all roll your eyes and say, "Duh! That's your issue!"?). Unfortunately, I need every single one of the columns throught the "exception" finding process. However, I don't need them all at the same time. Sooooooo, I'm thinking that the next suggestion is going to be for me to identify "groups of columns" that are similar and isolate them in their own tables, tied to the loan number. So, if I have 8 different fields that only pertain to "Tasks", I should create a table that is called tbl_Tasks and query against it only when I need it. Not sure if that made any sense or if I'm on the right track. |
|
|
|
Apr 27 2012, 12:18 PM
Post
#8
|
|
|
UtterAccess VIP Posts: 4,236 From: Dayton, OH |
You are on the right track. If a loan has related tasks, then you would have a table that holds the basic loan information, then in a related table you would have the tasks. Similar to this:
tblLoan -pkLoanID primary key, autonumber -txtLoanNumber tblLoanTasks -pkLoanTaskID primary key, autonumber -fkLoanID foreign key to tblLoan (relates the task to the loan) -taskDate -taskdescription With 144 fields (columns), I am guessing that you have several opportunities for normalization. You may want to check the wiki on this site for more on normalization. Normalization is the key to any successfull database application. I would also look for repeating groups of fields (usually identified with sequential numbers), they are also a sign that the database is not normalized. For example: recdate1, recdate2, recdate3 etc. |
|
|
|
Apr 27 2012, 12:24 PM
Post
#9
|
|
|
UtterAccess VIP Posts: 7,589 From: South coast, England |
Hi reberryjr
Your 144 columns could very well be the problem, Access will allocate space for these columns even if the majority of the time they hold no data, hence causeing bloat. 144 columns does sound as if your db is not normalised, often all these columns can be replaced by (e.g.) a single additional table with three fields, The source table ID for the original record, The data column and a data type column that describes what the data is in the data column. hth |
|
|
|
Apr 27 2012, 12:26 PM
Post
#10
|
|
|
UtterAccess Enthusiast Posts: 62 |
Thanks. I'll check that site out. My data comes in as a result of an Oracle query that I run, so I wouldn't want to set the pk to an autonumber. I started thinking about how I'm going to do this and I went back to George's post about bloating ocurring when you add/delete records. So, if I bring in my 6 .txt files in a transfer macro and make 1 table out of them, I'm going to need to then run make table queries to create these "normalization" queries. Won't that increase the blotation factor (I know it's not a word, but it made me laugh)?
|
|
|
|
Apr 27 2012, 12:29 PM
Post
#11
|
|
|
UtterAccess Guru Posts: 594 |
Hi,
Can you give us more detail on what the code does please? In addition to the 144 columns issue there may be other stuff going on aswell. JonSmith |
|
|
|
Apr 27 2012, 12:38 PM
Post
#12
|
|
|
UtterAccess Enthusiast Posts: 62 |
It's mainly If statements. There is one module that does 3 DateDiff calcs. The If statements look like this...
If IsNull(rs![str_TaskComments]) = True And rs![str_LitCoded] = "Yes" Then rs![str_TaskValid] = "No" rs![str_TaskComments] = "Lit Coding Present" End If There are probably going to be 50+ If statements. |
|
|
|
Apr 27 2012, 12:49 PM
Post
#13
|
|
|
UtterAccess Guru Posts: 594 |
If fields like str_TaskValid are just yes/no but are set to text then they will bloat, default for string is 50 characters and will take up much more space then a yes/no field
|
|
|
|
Apr 27 2012, 12:56 PM
Post
#14
|
|
|
UtterAccess Enthusiast Posts: 62 |
Hmmmm. I've received conflicting information about the size of the text fields. Some people have told me that it doesn't make a difference if the field is set to 255 or 5. Do you disagree?
|
|
|
|
Apr 27 2012, 01:02 PM
Post
#15
|
|
|
UtterAccess Guru Posts: 594 |
I'm not red hot on things like that so the other people are probably correct on the 50 char to 250 char size difference. What I will say with for certain however is that a string field is alot bigger than a numeric field (which is what a true false boils down to, -1 or 0)
|
|
|
|
Apr 27 2012, 01:55 PM
Post
#16
|
|
|
UA Admin Posts: 19,241 From: Newcastle, WA |
"I'm thinking that the next suggestion is going to be for me to identify "groups of columns" that are similar and isolate them in their own tables, tied to the loan number."
EXACTLY. That's a good high level description of the purpose and goal of normalization. The Defined size of a text field is, in a certain sense, irrelevant. If you have 12 chars in a field, that's how much space it occupies. If the field is defined to be 12 characters it will hold that string of data exactly. If the field is defined to be 255 characters,it will still only hold those 12 characters. There is no "wastage". Yes/No fields occupy 1 byte. Here is a list of good reading on normalization. It'll help you to get a solid footing under your tables.
|
|
|
|
Apr 27 2012, 06:38 PM
Post
#17
|
|
|
UtterAccess VIP Posts: 2,540 From: Parma, Idaho, US |
Hmmmm. I've received conflicting information about the size of the text fields. Some people have told me that it doesn't make a difference if the field is set to 255 or 5. Do you disagree? Well... yes and no. The size that the table will occupy after compaction is in fact just the size of the actual text. But I'm pretty sure that a MakeTable or Append query will reserve the maximum size, just in case the data is that big. This is certainly true for importing Text files, since Access has no way to predict the content. I've seen databases get much bigger than expected - if I recall aright, I once imported a 10K text file and the database expanded by over 100K. It shrank right back down after compaction. |
|
|
|
Apr 28 2012, 12:52 PM
Post
#18
|
|
|
UA Admin Posts: 19,241 From: Newcastle, WA |
I'm not sure I can agree John. In a recent series of tests discussed on one of the MVP private forums, I discussed my results from duplicating large tables (300K+) records within a single database using make table queries. In one of the steps, I created a second table with text field sizes increased to the maximum 255 chars from fields with original field sizes of 14, 16 and 1 characters respectively. I then appended 300,024 records from the first table with the original field sizes into the second table. The file size approximately doubled, as I would predict. It is true that a C&R after that step removed a small amount of space from the accdb after the duplication from one table to another. An amount in the neighborhood of 1/2 of 1% of the total file size was reduced. I then deleted all of the records from the second table. As we all know, simply deleting records doesn't affect file size. It remained the same. After a C&R on that file, however, it returned to the original size. These accdbs are @21,380 KB to @40,272 KB in size depending on whether they hold 300K or 600K records.
In short, I'm not convinced that, even with a new table def being created with much larger text field sizes, we can attribute significant bloat to the presence of the increased text field sizes per se. I haven't done similar tests on file imports, so I don't have any relevant information there. |
|
|
|
May 1 2012, 08:33 AM
Post
#19
|
|
|
UtterAccess VIP Posts: 1,886 From: UK |
i think as far as the original question goes - it probably is a normalisation issue.
even if the data is coming from oracle in a un-normalised form, (quite likely by a query /extract that deliberately does os) it's part of your job to analyse the data into normalised tables, by appropriate queries. I think arbitrarily determining that you do not want to use autonumbers is a case in point. OK your data analysis will identify sub-sets of data within the download which can be separated off into tables of their own. Now it is certainly a moot point here whether to use an autonumber key, or use the real-world keys that formed the analysis. Personally I do like to use single-field numeric keys, even at the marginal overhead of an additional key. But it sounds like you are not used to doing this analysis exercise - and your table design is most likely the cause of the bloat. for instance, one way is to import (or link) the next data download into your access database. now you should be able to identify a) new items - use append queries to get these into your tables b) same items - nothing needed c) modified items - use update queries you do not need to be creating/deleting the main storage tables. hope these ideas help. |
|
|
|
![]() ![]() |
|
Go to Top · Lo-Fi Version | Time is now: 22nd May 2013 - 12:41 PM |