CIS4365: Database Applications
Fall, 2017

What is Database Integrity? 

Integrity

 

1.      Steadfast adherence to a strict moral or ethical code.

2.      The state of being unimpaired; soundness.

3.             The quality or condition of being whole or undivided; completeness.

 

The data within our databases must constantly follow the rules and constraints placed within the data models. Without constraints, the data would hold no meaning. The reliability and integrity of that data would always be in question and the users of such data would always question the validity of it. If a database were to neglect the rules and overstep the boundaries that were placed upon it, there would be mayhem within the database and that database would cease to exist for any value.  Data Integrity is an umbrella term that refers to the consistency, accuracy, and correctness of data stored in a database. Think of data integrity in terms of the old adage: 'garbage in, garbage out;' data integrity is about keeping the garbage out.  

Types of Data Integrity

There are four primary types of data integrity: entity, domain, referential, and user-defined.  In general terms; entity integrity applies at the row level; domain integrity applies at the column level; and referential integrity applies at the table level.

Entity Integrity

Key Points

Constraints

Entity integrity ensures that each row in the table is uniquely identified

Entity integrity is most often enforced by placing a primary key (PK) constraint on a specific column (although it can also be enforced with a UNIQUE constraint, a unique index, or the IDENTITY property)

Entity integrity ensures a table does not have any duplicate rows

A PK will not allow any NULLs to be inserted into the column(s) (A NULL entry would be disallowed even if it would be the only NULL in the column and therefore unique.)

Entity integrity ensures that each row in a table has a unique identifier that allows one row to be distinguished from another

A composite PK is a PK that consists of more than one column; it is used when none of the columns in the composite key is unique by itself

Physical independence is achieved by being able to reference each row by a unique value, sometimes referred to as a 'key'

There can be only one PK per table 

 

 

Example: If the ' student ID' column is the PK in the 'students' table, two students cannot have the same student numbers, this would violate entity integrity.

Table Student

Student ID

Name

Address

Major

123456789

Saenz, Lupe

123 Mesa

Accounting

234567890

Chung, Mel

456 Spring

Marketing

123456789

Adams, John

548 Hague

Economics

456789012

Elam, Mary

156 Law

Education

 

Baker, Chris

555 Austin

Nursing

 

Domain Integrity

 Key Points

Constraints

Domain integrity requires that a set of data values fall within a specific range (domain) in order to be valid

Data types limit fields to broad categories (e.g., integers)

Domain integrity defines the permissible entries for a given column by restricting the data type, format, or range of possible values

A default is a definition of a value that can be inserted into a column; a rule is a definition of acceptable values that can be inserted into a column

A domain in database terminology refers to a set of permissible values for a column

 

Domain integrity is sometimes referred to as 'attribute' integrity

 

Example:  Domain integrity might be used to ensure an entry in the 'age' field is an integer and must be between the values of 17 and 75.  In this example, integers between 17 and 75 are the only permissible entries for the ‘age’ field.

Table Student

Customer ID

Name

Address

Age

123456789

Saenz, Lupe

123 Mesa

23

234567890

Chung, Mel

456 Spring

16

477568995

Adams, John

548 Hague

18

456789012

Elam, Mary

156 Law

83

 

Baker, Chris

555 Austin

29

 

 

Referential Integrity

Key Points

Constraints

Referential integrity is concerned with keeping the relationships between tables synchronized

An Foreign Key (FK) is a column or combination of columns in one table (referred to as the 'child table') that takes its values from the PK in another table (referred to as the 'parent table')

Referential integrity is typically enforced with a Primary Key (PK) and Foreign Key (FK) combination

In order for referential integrity to be maintained, the FK in the 'child' table can only accept values that exists in the PK of 'parent' table 

Once this relationship is established, it is possible to 'relate' or 'tie' each order to a particular customer

Enforcing referential integrity means the relationship between the tables must be preserved when records are added (INSERT), changed (UPDATE), or deleted (DELETE)

Note that while PK-FK combinations represent logical relationships among data, they do not necessarily limit the possible access paths through the data

 

The primary objective of referential integrity is to prevent 'orphans;' i.e., records in the child table that cannot be related to a record in the parent table

 

Example: To maintain referential integrity, the PhysID in the ‘patient’ can only accept values that exist in the PhysID of the ‘physician’ table. You cannot change a PhysID in the 'physician' table if that change would produce an 'orphan’; in other words, that change would leave records in the 'patient' table that did not reference to a valid PhysID. 

PatID

Name

 

 
Physician Table                                   Patient Table 

PhysID
Name

Address

PhysID

 
Speciality 

Address

User-Defined Integrity

bulletUser-defined integrity refers to specific business rules not covered by the other integrity categories.
bulletIt is typically implemented through triggers and stored procedures.

 

To get more information on Data Integrity, visit:

http://www.frick-cpa.com/ss7/Theory_DataIntegrity.asp

http://www.databasejournal.com/features/oracle/article.php/3083291

http://www.utexas.edu/its/windows/database/datamodeling/rm/rm5.html

 

You should be able to answer the following questions.

 

1.  What is Data Integrity?

Data Integrity is an umbrella term that refers to the consistency, accuracy, and correctness of data stored in a database.

2.  What are the different types of Data Integrity?

There are four primary types of data integrity: entity, domain, referential, and user-defined. 

3.  When dealing with the relation of tables, what type of data integrity is being enforced?

Referential Integrity

4.  If two (2) separate students had the same primary key, would there be a violation of data integrity, if so, what type of integrity is being violated?

a)     Entity Integrity

b)     Domain

c)      Referential

d)     User-Defined

 

(Answer: a)