Databases design
This presentation is not a comprehensive course on database design, but rather a preliminary approach to clarifying the processes and concepts necessary to create coherent databases.
The approach and the vocabulary used here are intended for non-specialist readers, without removing the concepts essential to the design process: "Ce qui se conçoit bien s'énonce clairement, et les mots pour le dire arrivent aisément"... (Boileau)
The approach and tool (see below) are not 'toy' applications and are capable of building and managing real relational databases.
------------------------------------
Comments, in green, are specially dedicated to "database purists".
The design method, presented hereafter, is easy to use (by everybody) and operational. Easy doesn't mean it's simplistic. This method, like all relational database design methods, relies on the work of Codd (1970) and Date (1975) and follows the recommendations of the ANSI-SPARK workshop group (1975). Only the unnecessarily verbose aspect of the "traditional" methods has been intentionally withdrawn.
------------------------------------
IBEasy+, software developed by the author allows you to:
- Apply, practically, the database design stages presented in these pages
- Design and use "easily" relational databases
The reader, more interested in the formal aspects can see, in these pages, the item "Normalization of relations". See also the book of C. J. Date.
Fundamentals of Database
A database is a structured set of non-redundant information whose organization is governed by a data model.
The two key words that are used in defining a database are:
Data structuring (using the data model) and non-repetition (non-redundancy or minimal redundancy).
Building a database thus consists of grouping the data into "homogeneous" sets, the (entities, tables), each entity (table) being composed of a finite number of elementary data, the Attributes or Fields. The repetition (redundancy) of attributes must be minimal. The data model used in this presentation is the "entity-relationship" model.
Database Management System (DBMS)
A DBMS is software that serves as an interface between users and the Database.
A DBMS allows to describe, manipulate and query the data of a Database. It supports all issues related to concurrent access, data backup and recovery. It must also ensure the control, integrity and security of data.
Databases and DataBanks
A common mistake, not only done by beginners, is to call Databases any set of files or collection of disparate information. The definition of a Database corresponds to very specific criteria (cf. Above) and any set of unstructured information is not a Database. We then speak of a set of files or of databanks.
Database and spreadsheet
Another mistake, often made by novice users, is to confuse a database with a spreadsheet. The tables in a database are indeed presented in rows and columns, as in a spreadsheet, but the comparison stops there and the differences between the two approaches are important and numerous. The following table summarizes the main differences between a database and a spreadsheet.
Differences on... | Spreadsheet | Database |
Main use | Calculations | Data management and processing |
Structuring of data | None | Structuring and high coherency. |
Integrity Controls of data | None | Strict control of the potential values of each data. |
Access to data | Single user | Multi users |
Data confidentiality | No control | Verification of access privileges for each user. |
Data size | -One table -Some dozens of rows |
-Several tables -Several thousands of rows by table |
Processing of data | Quantitative | Qualitative and quantitative |
Interrogations of data | Realized by specific procedures | "Universal" language : SQL |
Databases design
The design of a Database, implies 3 fundamental steps:
- Analyzing documents: construction of the data dictionary.
- Structuring of the data dictionary: determination of entities and associations.
- Linking entities together: the diagram of data.
Problems with incorrect modeling
When modeling has not been done correctly, different problems may appear.