Data redundancy is repetition of data or excess data which results in retrieval and modification problems. The following is the various impacts of data redundancy in a database:
· Duplication of effort
· Inconsistency of data in the database
· Occupies large space
· Data storage is tough
· Data retrieval often results in incorrect behavior
· Filtration of data produces wrong results
· Modification of data is a problem
Data integrity is of two fold:
· Firstly specifying the data type of the items, which means that the various data items have been given a representation and a scope for their illustration and existence
· Secondly the uniqueness of the data items which means that the database must have defined rules for maintaining data which is unique in nature to differentiate it from the other data in the database. It promotes the correct retrieval of data, insertion and modification of data and its deletion.
The database approach to control and reduce redundancy would be the normalization of data in the database. The result includes the following:
· Reduces redundancy of data
· Controls the anomalies of insertion, modification and deletion of data in the database
The process is done are many phases, which mean that the normalization process entails the following:
· First normal form (1NF) – The conversion of data items into atomic values and to prevent multivalued dependency. The final outcome is the relations with non-atomic attributes or nested relations
· Second normal form (2NF) – The concept is based on functional dependency. It entails that relations are held in a complete functional form and any deletion of the attribute or set of attributes would mean that the functional dependency does not hold any more. The final outcome would be relations where primary key contains multiple attributes and no non-key attribute should be functionally dependent on a part of the primary key.
· Third normal form (3NF) – The concept is to avoid transitive dependency among the various attributes in the table. The outcome is a relation without non-key attribute functionally determined by another non-key attribute (or by a set of non-key attributes).
Entity integrity is a concept which ensures that there is no duplication of records and primary key value is not null. The primary key defines the very tuple or the row and make sure that all the rows are unique.
Referential integrity is to make sure that coupled relations are consistent and is usually done with primary key in the base table and foreign key in the secondary table. The primary key values in the secondary table would make sure that are referred in the base table.
Primary key is used to uniquely define the tuple in a given relation. It is required to join two or more tables to enforce coupling among two relations. The purposes of each one of them can be laid as below:
· Uniquely defines a row in a particular relation
· Makes sure that correct retrieval of rows are done
· Used to align itself with other relations
· Is never null and not repetitive in nature
Foreign key relates to the primary key in the base table and makes sure that all the values are related to the base table to promote the joining of the two or more tables (Navathe, 2004). The purposes are as follows:
· Relates to the primary key of the base table
· Make contain null values and repeated values
· Correlates with the primary values
Table name: JOB
Primary key: JOB_CODE
Foreign key: none
Table name: EMPLOYEE
Primary key: EMP_CODE
Foreign key: JOB_CODE
Table name: BENEFIT
Primary key: EMP_CODE, PLAN_CODE (single code)
Foreign key: none
Table name: PLAN
Primary key: PLAN_CODE
Foreign key: none
Figure 1 : Benefit Table
Figure 2 : Employee Table
Figure 3 : Job Table
Figure 4 : Plan table
Figure 5: Relationship
Figure 6: Query
Navathe, Elmasri (2004). Fundamentals of Database systems, Singapore: Pearson Education.