Databases Design

Databases design

This presentation is not a comprehensive course on database design, but rather a preliminary approach to clarifying the processes and concepts necessary to create coherent databases.

The approach and the vocabulary used here are intended for non-specialist readers, without removing the concepts essential to the design process: "Ce qui se conçoit bien s'énonce clairement, et les mots pour le dire arrivent aisément"... (Boileau)

The approach and tool (see below) are not 'toy' applications and are capable of building and managing real relational databases.

------------------------------------
Comments, in green, are specially dedicated to "database purists".
The design method, presented hereafter, is easy to use (by everybody) and operational. Easy doesn't mean it's simplistic. This method, like all relational database design methods, relies on the work of Codd (1970) and Date (1975) and follows the recommendations of the ANSI-SPARK workshop group (1975). Only the unnecessarily verbose aspect of the "traditional" methods has been intentionally withdrawn.
------------------------------------

IBEasy+, software developed by the author allows you to:

  1. Apply, practically, the database design stages presented in these pages
  2. Design and use "easily" relational databases

The reader, more interested in the formal aspects can see, in these pages, the item "Normalization of relations". See also the book of C. J. Date.

Fundamentals of Database

A database is a structured set of non-redundant information whose organization is governed by a data model.

The two key words that are used in defining a database are:
Data structuring (using the data model) and non-repetition (non-redundancy or minimal redundancy).
Building a database thus consists of grouping the data into "homogeneous" sets, the (entities, tables), each entity (table) being composed of a finite number of elementary data, the Attributes or Fields. The repetition (redundancy) of attributes must be minimal. The data model used in this presentation is the "entity-relationship" model.

Database Management System (DBMS)

A DBMS is software that serves as an interface between users and the Database.
A DBMS allows to describe, manipulate and query the data of a Database. It supports all issues related to concurrent access, data backup and recovery. It must also ensure the control, integrity and security of data.

Databases and DataBanks

A common mistake, not only done by beginners, is to call Databases any set of files or collection of disparate information. The definition of a Database corresponds to very specific criteria (cf. Above) and any set of unstructured information is not a Database. We then speak of a set of files or of databanks.

Database and spreadsheet

Another mistake, often made by novice users, is to confuse a database with a spreadsheet. The tables in a database are indeed presented in rows and columns, as in a spreadsheet, but the comparison stops there and the differences between the two approaches are important and numerous. The following table summarizes the main differences between a database and a spreadsheet.

Differences on... Spreadsheet Database
Main use Calculations Data management and processing
Structuring of data None Structuring and high coherency.
Integrity Controls of data None Strict control of the potential values of each data.
Access to data Single user Multi users
Data confidentiality No control Verification of access privileges for each user.
Data size -One table
-Some dozens of rows
-Several tables
-Several thousands of rows by table
Processing of data Quantitative Qualitative and quantitative
Interrogations of data Realized by specific procedures "Universal" language : SQL
Databases design

The design of a Database, implies 3 fundamental steps:

  1. Analyzing documents: construction of the data dictionary.
  2. Structuring of the data dictionary: determination of entities and associations.
  3. Linking entities together: the diagram of data. 
Problems with incorrect modeling

When modeling has not been done correctly, different problems may appear.