Databases Design: step 1

Document Analysis

The first step in building a database is to identify all the representative documents of the data you want to model.
The term "documents" must be taken here in a broad sense. It can be either paper documents or magnetic documents (recording of interviews, disc or computer diskette)... or any other medium that can be used to store information.

Working documents

To illustrate the various stages of database design, we will use, as an example, the management of students taking the various courses of a degree.

The working documents used are as follows:

List of students
Student number Student name Date of birth Sex
1 Dupont, Charles 18-03-1981 H (Male)
2 Dubois, Jules 02-11-1982 H (Male)
3 Favier, Isabelle 02-02-1979 F (Female)
... ... ... ...
Marks sheet
Student Number : 1     Dupont, Charles
Subject number Name Coefficient Mark / 20
1 Mathématiques 3 10
2 Informatique 2 9
3 Sociologie 2 12,5
4 Histoire 1 13
5 Géographie 1 7
Average mark 10,3    
Teacher's List
Teacher id Name Grade Seniority in rank Subject taught
10 Bertrand, Pierre ASS 2 Sociologie
11 Dupont, Auguste MCF 3 Mathématiques
... ... ... ... ...
15 Simon, Etienne ASS 5 Histoire
Géographie

Management rules:

- A subject is taught by only one teacher.
- A student has only one mark per subject.

Data dictionary

From the documents, the analysis phase consists in extracting the elementary (non-decomposable) information that will constitute the future database.

The collection of all the elementary data, called attributes or fields, constitutes the data dictionary. Each attribute (field) in the data dictionary can be characterized by the following properties:

Property Meaning
Mnemonic An abbreviation for the attribute name.
Wording A label containing the precise meaning and role of the attribute.
Data type Attribute type: integer, real, string, date...
Integrity constraints List of constraints on possible attribute values
Calculation rule The rule for calculating (obtaining) the corresponding attribute.

The data dictionary for the previous documents is as follows:

Mnemonic Wording Type Constraints Calculation rule
Ancien Seniority in rank Integer >=0  
Cdsexe Gender code  String(1) H or F  
Coeff Subject coefficient Integer >0 and <6  
Dtnaiss Date of birth Date    
Grade Teacher Grade String(3) ASS or MCF or PR  
Lbsexe Gender Label String(7) Male or Female  
Moyenne Average mark for the student Real >=0 and <=20 SUM(Note*Coeff)/SUM(Coeff)
Nomat Subject name String(15)    
Nomens Name of teacher String(20)    
Nometu Name of student String(20)    
Note Mark obtained by the student in the subject Real >=0 and <=20  
Numat Subject number Integer >0  
Numens Teacher number Integer >0  
Numetu Student number Integer >0  

Remarks
The data dictionary is independent of the database management system that will be employed to implement the database. To define the data dictionary for a particular DBMS, it will therefore be necessary to translate the "Type", "Constraints" and "Calculation Rule" columns into the specific formalism for the DBMS.
For example, for the "Grade" field, and if you use Firebird, the "Type" column will become Char(3), "Constraints" will become Value in ('ASS','MCF','PR)…

Next stage | DB Design