UtterAccess HomeUtterAccess Wiki

Welcome Guest ( Log In | Register )

Custom Search
Discussion
> Normalizing A Database    
Normalizing A Database

Image:NotifCleanup.gif This page requires general cleanup in formatting or text to better fit the UA Wiki Guidelines
This page is under consideration for merging with: Normalization, Normalization Terms and Concepts

Some 'givens' to remember with all db's:

1) All Primary Key's (PK's) should be Autonumbers. PK's, as the inimitable Jerry Dennison (the true expert around UtterAccess on normalization,) puts it: "PK's are NOT for human consumption." they're meant to uniquely identify the records in a table only. The 'address' of the record in the database, if you'd like. Autonumbers
2) Do NOT use spaces or non-alphameric characters (#,/,?, etc.) in field or object names. Many, many, many headaches lie down that path, so steer clear.
3) Do NOT use 'Reserved' words as the name of fields or objects. Reserved_Words_In_Access , Reserved_Words

The first thing to always ask yourself when designing the structure is "What end product/s do I need?"
Then step back and ask "What data do I need to get to the end product/s?" Looking at this one another way, you could say, "What fields do I need?"
You then group the related fields together into tables, evaluating each field to see if you are at 3NF. If not, break them down, adding fields/tables until you get there.
Until you get comfortable with the process, do it on paper first, before working in an actual database.

Let’s look at a means to track customer orders as your end product. Data needed would include Customer info (name, address, phone number), Inventory (Parts) info (Part #, name, price) and Order info (order date, order #, customer, parts ordered, total price) so you can reach the desired end product.

Looking at the Customer table: Naming_Conventions (tbl in front of each table name, frm for form, rpt for report, subfrm for subform, etc.) would indicate we name it tblCustomers. Remembering Given #1 above, we'll start the table with an autonumber field to identify (id) each record, and name it CustomerID and make it the PK.
The next field to be needed is the name of the customer, a text field.
This gives us:
CustomerID PK Autonumber

CustomerName Text

Four Questions to Ask Yourself

Use this series of questions to ask yourself with each field as you add it, to see if you are normalizing the table as you go:

1) Can the field be broken down further? Are there components to the field, or fields that actually make up the field I just added?
2) Will any other record ever contain this exact same information?
3) Does the field describe ONLY this record?
4) Can a entity of this type (in this case, Customer) have more than one of this field?

If you answer NO to questions 1, 2 and 4, and YES to question 3,you can say, “I'm in pretty good shape to continue to the next field, in normalization terms.” Now we're at the customer address. You figure it to be a text field.
This gives us:

CustomerID PK Autonumber
CustomerName Text
CustomerAddress Text
CustomerPhoneNumber Text

Asking those 4 questions, you decide that:

1) It CAN be broken down further.
Addresses consist of other components: 1 or 2 lines of street address and/or suite # or similar data, city, state and zip code
2) You don't believe any other customer would have the exact same street address, but could and probably will have the same city, state and/or zip code.
3) Debatable, but for this business purpose, you believe that we won't have more than one customer with the same address.
4) Yes, a customer CAN and probably will have more than one address.
A lot of businesses and people have a physical address, a billing address and a mailing address with some or all of these being different.
From these answers you realize that the field needs to be changed. With the fact that question 4 is yes, you can tell fairly easily (in this situation) that you need an additional table, tblAddresses.
So back to the step of deciding what fields are needed. you decide that you need:
CustomerAddressID PK Autonumber
CustomerID FK (Foreign Key) to tblCustomers this links tblCustomerAddresses to tblCustomers in a many to one relationship (one customer - many addresses)
Address1 Text
Address2 Text
City Text
State Text
Zipcode Text (you won't do calculations on the zip, so it should be text)
AddressType Text (physical, mailing, etc)

Going through the same 4 question process, you feel comfortable that the first 3 fields are good to go.
AddressID is the Primary Key and it IS an Autonumber. Address1 and Address2 meet all requirements as well.
Now we get to city, state and zip. More than one address can have the same city, state, zip combination. Looks like we can say NO to the #2 question, a good indicator that another additional table is needed, tblCities.
You remove the three bottom fields and add them to tblCities, with an Autonumber PK, CityID:

CityID PK Autonumber
City Text
State Text
Zipcode Text

Same 4 questions again: you feel good about City and Zipcode, but more than one record could have the same state, indicating a need for yet another table:

tblStates
StateID PK Autonumber
StateAbbr Text (state abbreviation)
StateName Text

No to all 4 questions means this table should be normalized. Now you have to relate it back to tblCities. You add a Foreign Key (FK) to tblCities linking tblCities to tblStates in a many-to-one relationship (Many cities can have the same state).
This gives you:

tblCities
CityID PK Autonumber
City Text
StateID FK to tblStates
Zipcode Text

Now you need to relate tblCities to tblCustomerAddresses. You add a FK to tblCustomerAddresses linking it to tblCities in a many-to-one relationship (Many addresses can have the same city/state/zip).
This gives you:

tblCustomerAddresses
AddressID PK Autonumber
CustomerID FK to tblCustomers
Address1 Text
Address2 Text
CityID FK to tblCities
AddressType Text

Now to the AddressType field. Again with the 4 questions,you decide that more than one record will have the same address type. This again means an additional table.
You decide on:

tblAddressTypes
AddressTypeID PK Autonumber
AddressType Text

After questioning the fields here, you decide the table is good and go back to tblCustomerAddresses to relate the two tables.
You remove the AddressType text field and add a FK, AddressTypeID::tblCustomerAddresses

AddressID PK Autonumber
CustomerID FK to tblCustomers
Address1 Text
Address2 Text
CityID FK to tblCities
AddressTypeID FK to tblAddressTypes (many to one, many addresses can have the same type)

Looking back to tblCustomers, you no longer need customer address as a field in that table, since the relationship with tblCustomerAddresses takes care of this.
This leaves:

tblCustomers
CustomerID
CustomerName
CustomerPhoneNumber

The usual 4 questions gives you the answers that you are going to need another table. One customer can have many phone numbers.
You decide on:

tblPhoneNumbers
PhoneNumberID PK Autonumber
CustomerID FK to tblCustomers (one customer, many phone numbers)
PhoneNumberType Text (mobile, office, home, etc.)
PhoneNumber Text

After analyzing this table with the 4 questions, you see that, again, you must add another table as many phone numbers can be the same type:

tblPhoneNumberTypes
PhoneNumberTypeID PK Autonumber
PhoneNumberType Text

After adjusting tblPhoneNumbers by removing the PhoneNumberType text field and adding the FK to relate to tblPhoneNumberTypes, you have:

tblPhoneNumbers
PhoneNumberID PK Autonumber
CustomerID FK to tblCustomers (one customer, many phone numbers)
PhoneNumberTypeID FK to tblPhoneNumberTypes
PhoneNumber Text

You now have 9 tables versus the original 3, just from analyzing one table!
Tables now are:

tblCustomers
tblCustomerAddresses
tblAddressTypes
tblCities
tblStates
tblParts
tblOrders
tblPhoneNumbers
tblPhoneNumberTypes

Moving on to tblParts, you decide you need these fields:

PartID PK Autonumber
PartNumber Text
PartDescription Text
PartUnitPrice Currency

Analyzing with the 4 questions, this looks pretty good as is. No additional tables needed, unless later you decide to add a PartCategory, such as motors, valves, electrical, miscellaneous, etc. If you did that, you’d need a tblPartsCategory, with a FK in tblParts to relate the two tables.
Moving on to tblOrders, you decide you need:

tblOrders
OrderID PK Autonumber
CustomerID FK to tblCustomers
OrderNumber Text
OrderDate Date/Time
PartID FK to tblParts
QtyOrdered Number
PaymentMethod Text

Analyzing here, you see that you are repeating data (fields) in a lot of records this way, since orders will, a lot of times, have more than one part ordered. This means yet another table. You decide since it line items for an order to name it tblOrderLineItems.
This new table only needs to link to tblOrders, so the customer and order specific data do not go in the new table, only the parts ordered info goes here. You decide on:

tblOrderLineItems
OrderLineID PK Autonumber
OrderID FK to tblOrders (many to one, many line items to each order)
PartID FK to tblParts
QtyOrdered Number

Initial analyzing says you’re ok here. On further thought though, you realize that parts prices can and WILL change over time.
You need to store that amount, to keep an accurate history. Adding that field gives you:

tblOrderLineItems
OrderLineID PK Autonumber
OrderID FK to tblOrders (many to one, many line items to each order)
PartID FK to tblParts
QtyOrdered Number
PartPrice Currency

This leaves tblOrders to look like:

tblOrders
OrderID PK Autonumber
CustomerID FK to tblCustomers
OrderNumber Text
OrderDate Date/Time
PaymentMethod Text

PaymentMethod winds up in its own table after the 4 question analyzation (more than one order can have the same payment method, you would be duplicating data):

tblPaymentMethod
PaymentMethodID PK Autonumber
PaymentMethod Text

Removing the PaymentMethod text field and adding the FK for the relationship changes tblOrders to:

tblOrders
OrderID PK Autonumber
CustomerID FK to tblCustomers
OrderNumber Text
OrderDate Date/Time
PaymentMethodID FK to tblPaymentMethods (many to one, many orders with the same method)

The tables now are:

tblCustomers
CustomerID PK Autonumber
CustomerName
tblCustomerAddresses
AddressID PK Autonumber
CustomerID FK to tblCustomers
Address1 Text
Address2 Text
CityID FK to tblCities
AddressTypeID FK to tblAddressTypes
tblAddressTypes
AddressTypeID PK Autonumber
AddressType Text
tblCities
CityID PK Autonumber
City Text
StateID FK to tblStates
Zipcode Text
tblStates
StateID PK Autonumber
StateAbbr Text (state abbreviation)
StateName Text
tblPhoneNumbers
PhoneNumberID PK Autonumber
CustomerID FK to tblCustomers
PhoneNumberTypeID FK to tblPhoneNumberTypes
PhoneNumber Text
tblPhoneNumberTypes
PhoneNumberTypeID PK Autonumber
PhoneNumberType Text
tblParts
PartID PK Autonumber
PartNumber Text
PartDescription Text
tblOrders
OrderID PK Autonumber
CustomerID FK to tblCustomers
OrderNumber Text
OrderDate Date/Time
PaymentMethodID FK to tblPaymentMethods
tblOrderLineItems
OrderLineID PK Autonumber
OrderID FK to tblOrders (many to one, many line items to each order)
PartID FK to tblParts
QtyOrdered Number
PartPrice Currency
tblPaymentMethods
PaymentMethodID PK Autonumber
PaymentMethod Text

A final total of 11 tables from an original 3!! While this may seem, at first, to be over-complicating things, it will make things so much easier as you work into forms and reports that you will end up actually SAVING time in the long run!
At this point, you realize that you can and should keep a history of the price of parts.
You create a table named tblPartsPrice and use PartPriceID as a FK from tblOrderLineItems instead of a currency field to just hold the data.

tblPartsPrice
PartPriceID PK Autonumber
PartID FK to tblParts
DateofPrice Date/Time
PartPrice Currency

Finally, after all this work on paper (now 12 tables from the initial 3), you are at Third Normal Form (3NF) and ready to enter the actual tables in a new blank database.
Once that is done and you go to the Relationships window and define your relationships there with the appropriate Cascade Update/Delete options, you’re done --for now, until something new is added/the scope of the project changes--with the table structure and ready to look at the user interface to get the data into the db....



This page was originally ported from the UtterAccess Forums. It is based heavily or in part on a post by NoahP.
Normalizing a Database

Discussion
Custom Search
Thank you for your support!
This page has been accessed 8,455 times.  This page was last modified 01:51, 10 February 2012 by Jack Leach. Contributions by NoahP and Cpetermann  Disclaimers