Several weeks ago I asked on Twitter for SQL schema naming conventions from DBA professionals. (I'm always interested in the generally-accepted practices of related professions; when I can, I try to make my work as compatible with theirs as possible.)

I got back only a handful of responses, representing MySQL, PostgreSQL, and DB2 administrators, really not enough for a statistically useful sample. Even so, I'm going to present their anonymized responses here, because they led me to work I had not previously considered at length.

My questions were:

  1. For table names, do you prefer plural (posts), singular (post), or something else?

  2. For primary key column names, do you prefer plural (posts_id), singular (post_id), just plain id, or something else?

  3. How do you name many-to-many association tables? For example, if many posts relate to many tags, do you prefer combining the table names in plural or singular? If so, do you separate them with an underscore? The examples would be posts_tags for plural, and post_tag for singular. Or do you prefer another approach?

The answers follow.

Table Names

  • "Table and columns are singular, so create table item, account and not items, accounts."

  • "Keep names singular. The reason behind that is that it was easy to reference column name with table name. Example: "user".first_name. The biggest challenge going with singular name is that most of the popular table names are considered keywords for the databases. Some of the examples: user, order, name, type etc."

  • "Table names should be plural. That's how I learned it, and it seems to make sense that a name for a collection of rows should be plural."

  • "I prefer plural table names."

  • "Plural - because it is a set of things."

Primary Key Names

  • "Every table must have an id primary key (surrogate) using a sequence (identity is ok sometimes)."

  • "I prefer singular names for column without any prefix or suffix."

  • "I would have said post_id but for the past several years I've switched to just id."

  • "I prefer primary key always id."

  • "Singular. For example, UserID."

Association Table Names

  • "If I follow singular table names, I use post_tag_mapping. I like to use _mapping suffix to explicitly identify such tables."

  • "We use plural_plural."

  • "I prefer mapping tables singular."

  • "I combine them as SingularPlural and generally have the dominant entity first as it owns things in the second entity. Ex: PostTags or UserRoles or StudentTests."

What Does This Tell Us?

Not a whole lot, it seems. We might say "there's no generally accepted practice" but with only 5 respondents that's not a reliable conclusion.

Havig said that, one respondent summed up what seemed to be a common sentiment this way: "Most people will probably agree it's about agreeing on a standard, and then being consistent with it." I think that's often the case with standards.

Another respondent noted, "Once upon a time you had production DBAs, and development ones that could do data modelling. These days it's just production DBAs, and we always inherit designs as we come in later." That certainly squares with my own experience. DBA professionals are generally hired much later as the business matures, and they're stuck with whatever non-DBA-professional decisions were made before their arrived. The pre-existing schemas bind their hands.

What Would Joe Celko Do (WWJCD) ?

However, more than one respondent referred to Joe Celko's SQL Programming Style, which I immediately ordered and read through.

I thought Celko's recommendations made a lot of sense. At first I thought I would have to copy the relevant sections here, but it turns out that Simon Holywell has already done so at his SQL Style Guide.

Celko's answers to the above questions appear to be:

  1. For tables: "Use a collective name or, less ideally, a plural form. For example (in order of preference) staff and employees." This one was especially interesting to me. The idea of using a collective name, not merely a plural name, makes a lot of sense to me, though it does not lend itself to automation.

  2. For primary key names: "Where possible avoid simply using id as the primary identifier for the table." I gather from other reading that the recommendation is to use a natural identifier as a prefix; in the case of a posts table, that would be post_id.

  3. For association tables: "Avoid, where possible, concatenating two table names together to create the name of a relationship table. Rather than cars_mechanics prefer services." On seeing it this way, it also makes sense to me, and I do not recall seeing it stated that way before.

Further, Celko lays out a series of uniform suffixes for column names. That by itself is pretty interesting.

Conclusion

If you're starting a project from scratch, and are interested in following the advice of at least one SQL and DBA professional giant, you may wish to review the recommendations at http://www.sqlstyle.guide and try them out. Even better, buy Celko's book. At the very least, by reading those recommendations, you'll have gained a greater range of options to choose from.

UPDATE: If it was not clear from the introduction, this exercise was about discovering generally-accepted practices of DBA/SQL professionals (i.e., people whose primary job is to administer a database and write SQL schemas), not the preferences of application developers who happen to use SQL databases.


Are you stuck with a legacy PHP application? You should buy my book because it gives you a step-by-step guide to improving you codebase, all while keeping it running the whole time.