X   Site Message
(Message will auto close in 2 seconds)

Welcome to UtterAccess! Please ( Login   or   Register )

Custom Search
   Reply to this topicStart new topic
> Normalization: Requoted Quotes    
post Jan 23 2004, 11:17 PM

UtterAccess VIP
Posts: 973
Joined: 14-February 03
From: Swamps of Connecticut U.S.A.

If you look at my avitar ,you may wonder what it represents. We'll it's me, trying to find all the information I needed to understand normalization and design concepts. It seems that although there is a wealth of information here and around, it's hard to put it into perspective. If I wanted to study the subject thoroughly, it required going to a half dozen different places. (see avitar).

So I compiled all the related subject matter from unrelated sources, sliced them and diced them , and reassembled them into a condensed composite. There are less than 1 dozen words that are my own contained within, and the credit goes to the various contributors. Predominately it's Jerry Dennison (re-quotes, interspersed with quotes from Graham Thorpe of microsoft-accessolutions and datamodel.org . Many Thanks to them.

CAn index of the complete sources of information can be found
Suggested Readings & Things in FAQA

However my mix and match does not represent a verbatim representation of the context from which they were copied. And an apology if I have unknowingly offended anyone by making such a compilation.

A Table is in first normal form (1NF) if there are no repeating groups

Eliminate Repeating Groups- Make a separate table for each set of related attributes, and give each table a primary key.

All Attributes must be atomic. This means that there should be no repeating groups within a field(i.e. multiple values within a field separated by a comma or other delimiter).

Table Requires a PrimaryKey

Guidelines for Primary Keys
Value must be unique
It can never be null
Cannot be a multi-part field
Comprised of the minimum number of fields to ensure uniqueness
Not optional in whole or in part
Directly identifies each value of the remaining fields in a given record of a table.

In order for a field to become a Primary Key for it's parent table, it must pass each of these guidelines. Failure to pass even one will disqualify it as a possible Primary Key. If you make certain you have established a valid Primary Key, you will greatly reduce the possibility of encountering problems later when you begin to work with the table relationships.

A Glossary of Terms & Definitions , Courtesy of Jerry Dennison

It appears that many new users (and maybe an old one or two) may not know all of the terms used in many of the answers given. I thought I might try to post my definition of a few of these. They are in no particular order (only as I think of them).

ENTITY: n. A physical or virtual object that exists as a unique and discreet unit. It cannot be broken down or have anything removed from it without losing the integrity of the entity. (for example, an automobile that comes off the assembly line is an entity (it can be uniquely identified and is not whole if any part of it is removed), it is itself made up of other entities (all of it's myriad parts that are sometimes made up of other parts) each of which is an entity unto itself (it would lose integrity of what defines it as a part if anything were removed). A table should represent a single entity, so a table is in fact the entity.

ATTRIBUTE: n. A quality or characteristic inherent in or ascribed to someone or something. Another common name for an attribute is DESCRIPTOR. An attribute describes some aspect of an entity. Usually, it is one or more attributes that fully define and characterise the uniqueness of an entity. When this happens, this attribute (or combination of attributes) is said to be a "natural key".

KEY: n. The key represents the entity. In fact, it can be said that the key is the entity (or at least the database's representation of the entity)

CANDIDATE KEY: n. One of several possible attributes or combinations of attributes which can be used to uniquely identify a body of information (entity). Generally speaking, candidate keys should rarely be used as the PRIMARY KEY.

PRIMARY KEY: n The CANDIDATE KEY selected that uniquely identifies a body of information (entity). In Access (or any other RDBMS) for that matter it is usually desirable to use a single field as the primary key. This makes it much easier to link related tables to the entity. It makes it much easier to ensure uniqueness and (with Access) facilitates the updating and deleting of records in related tables.

RELATIONSHIP: n. The condition or fact of being related; connection or association. In database terms this is usually relegated to dependency. Access only directly supports one-to-one and one-to-many relationships. But usually it can always be said that one table (entity) is the "parent" of another table (entity).

FOREIGN KEY: n. Attributes contained within a database record (or other collection of related data items) which uniquely identify another record, but not the one within which they are contained. A foreign key is a reference from one record to another. In Access you create a foreign key by defining a field in the "child" table that will hold the value of the primary key of another table. This is how a related record in one table "knows" which parent record it is related to. All foreign keys must be the same datatype as the parent primary key.

Second Normal Form:
Must be in First Normal Form <---- the forms are heiarchical, each is dependent on the one before it
A RELATION is in second normal form if each ATTRIBUTE is fully functionally dependent on the ENTIRE primary key <---- this means that no subset of the key can determine an attribute's valu

Eliminate Redundant Data- If an attribute depends on only part of a multi-valued key, remove it to a separate table.

Eliminate Columns not dependent on the Key- If attributes do not contribute to a description of the key, remove them to a separate table.

Look for values that occur multiple times in a non-key field. This tells you that you have too many fields in a single table.

Functional Dependency & others:
Functional dependency (a tough one, the formal definition of Functional Dependence is: For any relation R, attribute A is fully functionally dependent on attribute B if, for every valid instance, the value of B determines the value of A.), then of course there's multivalued dependency, trivial dependency, and last but not least candidate key.

Third Normal Form:
Must be in Second Normal Form
A RELATION is in third normal form when no non-key attribute is dependent on any other non-key attribute <---- this and 2NF are the primary Forms that prohibits the storage of calculated values or transitive dependencies.

Transitive dependency is a type of functional dependency in which the value of a non-key field is determined by the value in another non-key field and that field is not a candidate key.

Now, to boil all of this down to something usable. You must ask yourself the following:

Am I repeating groups? These can be multiple values in a single field, repeated TYPES of fields that share the same datatype and a common root name or root structure (these are the multiple date fields in Drew's db), or repeating tables of the same entity type (these can be noticed because they generally have the same fields but in different tables).

Am I trying to store derived or calculated values? (we should all know by now not to store calculated values)

Do I have multiple tables with more than one index based on multiple fields? (this usually indicates combining of entities, remember: an table is an entity and an entity a table)

Do I have a large number of values being repeated in a column (field)? This does not include FK's linked to another table's PK. It does include just about anything else. If you find that you're repeating a lot of values then you MAY need to move this to it's own table. This particular question is lowest in priority.

Must be in Third Normal Form and all determinates are candidate keys
All candidate keys must satisfy the test for third normal form <---- a candidate key is of itself a potential unique identifier of the entity, generally speaking candidate keys are mutli-field constructs. This does not mean you should use a candidate key as the PK, it means that it could satisfy the requirements of uniqueness. For most entities, there are many candidate keys.

Boyce-Codd normal form(BCNF) can be thought of as a "new" third normal form. It was introduced to cover situations that the "old" third normal form did not address . Keep in mind the mean of a determinant(determines the value of another field)and candidate keys(qualify for designation as Primary Key). This normal form applies to situations where you have overlapping candidate keys. If a table has no key fields, it is automatically BCNF.

Fourth Normal Form
Must be in 3NF/BCNF
There can be no nontrivial multivalued dependencies in a relation <---- This is a fairly common reduction that most people achieve without even knowing it. This form prohibits independent multivalued components of the key. For example, if an employee can have many skills and many dependents you would move the skills and dependents to separate tables as they are not related in any way.

Fully Qualified Naming : Building Attributes, By Jerry Dennison

Attribute = a feature or description that is specific to an entity. I've already given examples of attributes above, so let's examine a relatively easy way to "build" valid attributes. Fully Qualified Naming. This is the easiest and best method for helping you define your entities and their relevant attributes. A fully qualified name for an attribute includes everything the attribute is describing. I'll use your Work Orders as an example:

WorkOrder ID
WorkOrder Number
WorkOrder Date Entered
WorkOrder Date Completed

WorkOrder LineItem Item ID
WorkOrder LineItem Item Quantity
WorkOrder LineItem Item Price
WorkOrder LineItem Item Discount
WorkOrder LineItem Item Discription

I've separated the two entities into their "natural" groups but name. You could call LineItem OrderDetail. You'll notice that a third group appears, called Item. This is in fact another entity that has emerged and should be in it's own table and the OrderDetails (LineItems) is really an intermediate table that helps define a many-to-many relationship between WorkOrder and Items. The final structure would then look something like this:

WorkOrder Table
Primary Key
Date Entered
Date Completed

Item Table
Primary Key

WorkOrderDetail Table
Primary Key
WorkOrderID (foreign key link to WorkOrder)
ItemID (foreign key link to Item)
Price (note: this would be ok to store duplicate values as prices are subject to change and should be specific to the order detail)
(note: do not include item description or other attributes from other tables that are not specific to the work order detail entity)

Go to the top of the page

Custom Search
RSSSearch   Top   Lo-Fi    19th October 2018 - 05:05 PM