Liberty With Open Source: August 2010

The Primary Key Defined
The primary key is pretty easy to understand. It is simply the column or columns in a table that must be unique. In our example of the school schedule the database programmer is going to create two reference tables that have single-column primary keys and just a few rows each. Here they are, the table of schedule types and the table of periods:

SCHEDULE_TYPE | DESCRIPTION
------------------+--------------------------------
NORMAL | Normal Schedule
ALONG | Long Assembly
ASHORT | Short Assembly
ALUNCH | Short Assembly After Lunch
PERIOD | DESCRIPTION
-------------+-----------------------------

HR | Homeroom
1 | Period 1
2 | Period 2
3 | Period 3
4 | Period 4
LUNCH | Lunch
6 | Period 6
7 | Period 7
8 | Period 8
ASSM | Assembly
The first table has a primary key of SCHEDULE_TYPE. No two rows in that table can have the same value of SCHEDULE_TYPE. The second table has a primry key of PERIOD, so that no two rows in that table can have the same value of PERIOD.

A primary key is in fact very simple and does not really require a lot of discussion except for one further point: a primary key identifies something. For example, the value 'NORMAL' identifies a certain kind of schedule, and if I am given this value I can look up things about the schedule in the SCHEDULE_TYPES table.

Code and Data Decisions

These examples are so far very simple, so it may not be obvious that the tables SCHEDULE_TYPES and PERIODS are really necessary. After all, we can just put them into an associative array in some type of include file and not bother making tables, right?

Technically, you can of course put anything you want into an include file and save yourself some time at the front of a project. The drawbacks emerge over time and are the consequence of "trapping" data inside of code.

The first and most obvious drawback is that if the customer wants to change a description then a programmer must change a program file, instead of just having a user go to an admin screen. But worse, what if our program becomes a smash hit and schools all over the country are using it? What if they all want different descriptions, or what if they have different numbers of periods and kinds of schedules? If we start off by putting such things into tables then we will be well prepared for future demands, but if we "trap" the data in code then future success will produce choking demands on the programmers to change the files.

If the tables were more complicated we would have another issue. Databases were designed for efficient querying, but arrays were not. Querying the data that is trapped in code is much harder than querying tables in the database.

So we will take it as a given that data belongs in a database, even for simple two-column tables.

Liberty With Open Source

Wednesday, August 25, 2010

Database - Primary Key Specifications.