This presentation is not a comprehensive course in database design, but an initial approach to clarifying the process and concepts required to build consistent databases.
The approach and the vocabulary used here are intended for non-specialist readers, without removing the concepts essential to the design process: "Ce qui se conçoit bien s'énonce clairement, et les mots pour le dire arrivent aisément"... (Boileau)
The approach and the tool (see below) are not "toy" apps and allow you to create and operate real relational databases.
Comments, in green, are specially dedicated to "database purists".
The design method, presented hereafter, is easy to use (by everybody) and operationally. Easy doesn't mean it's simplistic. This method, like all relational database design methods, relies on the work of Codd (1970) and Date (1975) and follows the recommendations of the ANSI-SPARK workshop group (1975). Only the unnecessarily verbose aspect of the "traditional" methods has been intentionally withdrawn.
- Apply, practically, the database design stages presented in these pages
- Design and use "easily" relational databases
The reader, more interested in the formal aspects can see, in these pages, the item "Normalization of relations".
Fundamentals of Database
A database is a set of structured and non-redundant information, whose organization is based on a data model.
The two key words involved in defining a database are:
Structuring (by means of the data model) and no repetition (redundancy or minimal redundancy) of data.
To build a database consists of gathering data in "homogeneous" sets (entities, tables), each entity (table) being composed of a finished number of elementary data, Attributes or Fields, repetition (redundancy) of attributes being minimal. The data model used in this presentation is the entity-relationship model.
Database management system (DBMS)
A DBMS is a software which acts as an interface between users and the database.
A DBMS enables the description, management and retrieval of database data. A DBMS must address all simultaneous access, data protection and restore, control integrity and data security issues.
Databases versus Databanks
One common mistake, and not just for beginners, is to call databases any set of files or compile disparate information. The definition of a database corresponds to very accurate criteria (cf. above) and any set of unstructured information must not be called a database. We speak then rather of a set of files or a databank.
Database versus Spreadsheets
Another mistake, often made by novice users, is to confuse a database with a spreadsheet. The tables of a database are presented indeed in rows and columns, as in a spreadsheet, but the comparison stops there and dissimilarities between the two approaches are significant and numerous. The following table summarizes the major differences between a database and a spreadsheet.
|Main use||Calculations||Data management and processing|
|Structuring of data||None||Structuring and high coherency.|
|Integrity Controls of data||None||Strict control of the potential values of each data.|
|Access to data||Single user||Multi users|
|Data confidentiality||No control||Verification of access privileges for each user.|
|Data size||-One table
-Some dozens of rows
-Several thousands of rows by table
|Processing of data||Quantitative||Qualitative and quantitative|
|Interrogations of data||Realized by specific procedures||"Universal" language : SQL|
The design of a Database, implies 3 fundamental steps:
- Analyzing documents: construction of the data dictionary.
- Structuring of the data dictionary: determination of entities and associations.
- Linking entities together: the diagram of data.
Problems with incorrect modeling
When modeling has not been done correctly, different problems may appear.