SQL Schema Naming Conventions
Several weeks ago I asked on Twitter for SQL schema naming conventions from DBA professionals. (I'm always interested in the generally-accepted practices of related professions; when I can, I try to make my work as compatible with theirs as possible.)
I got back only a handful of responses, representing MySQL, PostgreSQL, and DB2 administrators, really not enough for a statistically useful sample. Even so, I'm going to present their anonymized responses here, because they led me to work I had not previously considered at length.
My questions were:
-
For table names, do you prefer plural (
posts
), singular (post
), or something else? -
For primary key column names, do you prefer plural (
posts_id
), singular (post_id
), just plainid
, or something else? -
How do you name many-to-many association tables? For example, if many
posts
relate to manytags
, do you prefer combining the table names in plural or singular? If so, do you separate them with an underscore? The examples would beposts_tags
for plural, andpost_tag
for singular. Or do you prefer another approach?
The answers follow.
Table Names
-
"Table and columns are singular, so create table
item
,account
and notitems
,accounts
." -
"Keep names singular. The reason behind that is that it was easy to reference column name with table name. Example:
"user".first_name
. The biggest challenge going with singular name is that most of the popular table names are considered keywords for the databases. Some of the examples:user
,order
,name
,type
etc." -
"Table names should be plural. That's how I learned it, and it seems to make sense that a name for a collection of rows should be plural."
-
"I prefer plural table names."
-
"Plural - because it is a set of things."
Primary Key Names
-
"Every table must have an
id
primary key (surrogate) using a sequence (identity is ok sometimes)." -
"I prefer singular names for column without any prefix or suffix."
-
"I would have said
post_id
but for the past several years I've switched to justid
." -
"I prefer primary key always
id
." -
"Singular. For example,
UserID
."
Association Table Names
-
"If I follow singular table names, I use
post_tag_mapping
. I like to use_mapping
suffix to explicitly identify such tables." -
"We use
plural_plural
." -
"I prefer mapping tables singular."
-
"I combine them as
SingularPlural
and generally have the dominant entity first as it owns things in the second entity. Ex:PostTags
orUserRoles
orStudentTests
."
What Does This Tell Us?
Not a whole lot, it seems. We might say "there's no generally accepted practice" but with only 5 respondents that's not a reliable conclusion.
Havig said that, one respondent summed up what seemed to be a common sentiment this way: "Most people will probably agree it's about agreeing on a standard, and then being consistent with it." I think that's often the case with standards.
Another respondent noted, "Once upon a time you had production DBAs, and development ones that could do data modelling. These days it's just production DBAs, and we always inherit designs as we come in later." That certainly squares with my own experience. DBA professionals are generally hired much later as the business matures, and they're stuck with whatever non-DBA-professional decisions were made before their arrived. The pre-existing schemas bind their hands.
What Would Joe Celko Do (WWJCD) ?
However, more than one respondent referred to Joe Celko's SQL Programming Style, which I immediately ordered and read through.
I thought Celko's recommendations made a lot of sense. At first I thought I would have to copy the relevant sections here, but it turns out that Simon Holywell has already done so at his SQL Style Guide.
Celko's answers to the above questions appear to be:
-
For tables: "Use a collective name or, less ideally, a plural form. For example (in order of preference)
staff
andemployees
." This one was especially interesting to me. The idea of using a collective name, not merely a plural name, makes a lot of sense to me, though it does not lend itself to automation. -
For primary key names: "Where possible avoid simply using
id
as the primary identifier for the table." I gather from other reading that the recommendation is to use a natural identifier as a prefix; in the case of aposts
table, that would bepost_id
. -
For association tables: "Avoid, where possible, concatenating two table names together to create the name of a relationship table. Rather than
cars_mechanics
preferservices
." On seeing it this way, it also makes sense to me, and I do not recall seeing it stated that way before.
Further, Celko lays out a series of uniform suffixes for column names. That by itself is pretty interesting.
Conclusion
If you're starting a project from scratch, and are interested in following the advice of at least one SQL and DBA professional giant, you may wish to review the recommendations at http://www.sqlstyle.guide and try them out. Even better, buy Celko's book. At the very least, by reading those recommendations, you'll have gained a greater range of options to choose from.
UPDATE: If it was not clear from the introduction, this exercise was about discovering generally-accepted practices of DBA/SQL professionals (i.e., people whose primary job is to administer a database and write SQL schemas), not the preferences of application developers who happen to use SQL databases.
Read the Reddit discussions about this post hereand here.