MySQL vs. PostgreSQL
Posted by Daniel Lyons Tue, 19 Dec 2006 06:33:15 GMT
A lot has been made of MySQL and PostgreSQL lately. Now, just to be bitchy, I thought I might chip in my two cents about the deeper problems.
One thing which consistently annoys me about MySQL is that it is a very leaky abstraction. PostgreSQL has three integers, defined in the SQL spec. MySQL has six, and the UNSIGNED modifier for 12, plus a ZEROFILL option. To me, and most database people, the whole point of databases is to abstract away complexity. MySQL encourages you to worry about low-level details.
Other features of MySQL seem to be added for convenience to C programmers or UNIX people. For example, it handles bash-style # comments, for no particular reason. UNSIGNED also strikes me as the C level bubbling on up to interfere with my life at the DBA level.
But it also reveals a general MySQL propensity to modify the grammar. For example, in SQL, there is no NULL modifier. But the MySQL guys added it for parity with NOT NULL. NULL is the default, so PostgreSQL doesn’t provide it. There are other bizarre additions to the syntax ostensibly for convenience, such as REPLACE INTO. Other aspects of the grammar feel very haphazard in MySQL. Compare MySQL’s INSERT syntax:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
VALUES ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE col_name=expr, ... ]
or
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name
SET col_name={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE col_name=expr, ... ]
or
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE col_name=expr, ... ]
To PostgreSQL’s:
INSERT INTO table [ ( column [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
[ RETURNING * | output_expression [ AS output_name ] [, ...] ]
It may amuse you to know that I picked INSERT arbitrarily, because I didn’t want to show all the weird crap going on in the CREATE statement. Note that in the PostgreSQL syntax above, the ”| query” has all the same expressive power as the SELECT portion of th e MySQL syntax. But also look at what isn’t there in the PostgreSQL syntax: ON DUPLICATE KEY, and the whole wacky priority thing. MySQL is full of strange things like that. All of MySQL seems to suffer from this, to me, outrageous level of complexity.
Another thing that gets me is the whole MyISAM / InnoDB difference. Supposing the difference weren’t being used to cheat at benchmarks, it’d still be annoying to have to decide whether one wants data integrity or speed for a whole collection of data. But the fact is, it’s also being used to cheat in comparisons to PostgreSQL, by comparing PostgreSQL’s speed to MyISAM and feature set to InnoDB. Either way, it’s the abstraction leaking up to the surface leading me to ask the client questions like, “So… how much data do you think you’re going to have to store?”
Then there are other features of MySQL that are just plain weird. For example, the TIMESTAMP type changes behind your back to the last time a row is updated, even if you don’t modify the column containing the TIMESTAMP. They offer a DATETIME that doesn’t do this but contains the same amount of information. They say, if you don’t want this to happen on a particular UPDATE, you can set the column to … itself. This is all very fishy to someone accustomed to databases with triggers. Another example: if you use a function in the DEFAULT of a column definiton, MySQL executes it once, when you create the table. PostgreSQL executes it when you insert a row without that column defined. Again, fishy.
Anyway, I recommend people use PostgreSQL. Getting used to it is a little weird at first but, like most well-engineered things, it becomes clear over time how it all fits together. One never has that experience with MySQL.
