Saturday, May 26, 2007

Why not MySQL

1. MySQL is not scalable: there is no table partitioning. As your data grows, and your use of the database with it, you'll find your options for scalability are limited to whatever you can "hack" together on your own. Other RDBMS solutions like Oracle, MSSQL, and yes, even Postgres offer anywhere from decent to excellent scalability.

2. Advertised "enterprise" features are hacked into the stagnant and very monolithic MySQL codebase and frequently do not deliver as advertised. Some examples: sub-selects are poorly optimized, if at all; an index cannot be used more than once in the same query to help with optimization; combining triggers and stored procedures within transactions in a medium- to high-usage environment results in crashes that even MySQL cannot explain; row-level locking only works if the table has a primary key and no other indexed columns that you need to update, and you are updating by that primary key; MVCC is not truly supported, since even with InnoDB your selects may block updates and vice versa; replication imposes further limits on concurrency. The list goes on and on. None of these is a problem to any noticeable extent in any of the other "enterprise" RDBMSes.

3. More on replication: even though MySQL has a somewhat elegant solution for replication, it limits concurrency (as already mentioned), introduces serialization, and poses additional tricky, fundamental problems. For example, it is nearly impossible to implement a true multi-master replicated environment.

4. No HA solution: Oracle and MSSQL (the latter more recently and in a more limited form) offer true HA solutions that increase database availability in case of failure and, within the HA environment, guarantee that every transaction the application was led to believe had succeeded is actually preserved. This cannot be achieved with heartbeat and replication using MySQL, or even Postgres for that matter.

5. Critical bugs dealing with data consistency: This is not a statistical analysis, but MySQL has had, and still has, a lot of critical bugs in the most critical part of an RDBMS: the data. For example, you cannot rely on an RDBMS to store your data if, when queried under certain circumstances, it returns NULLs where it should return the correct data. It is hard to comprehend how a product released as stable (such as version 5.0) can still have so many critical data-related bugs.

6. Horrible codebase: If you are at least a decent programmer, please have a look at the MySQL code. It is monolithic: one main file with a succession of countless if blocks for parsing, optimizing, and running queries, and features such as triggers, stored procedures, and replication visibly hacked into the existing "bad" design. There is very little abstraction, which can leave data files in an inconsistent and unreadable state in the event of a server crash (mostly with MyISAM). And then, just for kicks, please have a look at the Postgres source code: well organized and separated into well-designed components for parsing, planning, optimization, execution, and other functions, which you'll get acquainted with with a certain satisfaction. The code is well commented and, as a programmer, you will find a certain comfort in dealing with it. This is a very important point. It demonstrates why Postgres, for example, having a solid foundation, can implement advanced features (such as transaction savepoints) with very few critical issues, and why half of MySQL's features only "work" in certain circumstances and are subject to critical bugs after the stable release. If you don't see a night-and-day difference between the two, smack me.

7. No RBAC (role-based access control).
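
To make the sub-select complaint in point 2 a bit more concrete, here is a minimal sketch of the kind of rewrite people usually end up "hacking" in: replacing an IN (sub-select) with an equivalent join. Everything in it is an assumption for illustration only. The customers and orders tables, their columns, and the helper names are made up, and it runs against Python's built-in sqlite3 module rather than MySQL, so it only shows that the two query shapes return the same rows, not how either engine optimizes them.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            total INTEGER
        );
        INSERT INTO customers VALUES (1, 'US'), (2, 'DE'), (3, 'US');
        INSERT INTO orders VALUES (10, 1, 50), (11, 2, 75), (12, 3, 20), (13, 1, 5);
        """
    )
    return conn


# The "natural" way to write the query: filter orders with a sub-select.
SUBSELECT_SQL = """
    SELECT o.id
    FROM orders o
    WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.country = 'US')
    ORDER BY o.id
"""

# The manual rewrite as a join. It is equivalent here only because
# customers.id is unique, so the join cannot duplicate order rows.
JOIN_SQL = """
    SELECT o.id
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.country = 'US'
    ORDER BY o.id
"""


def order_ids(conn, sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(order_ids(conn, SUBSELECT_SQL))
    print(order_ids(conn, JOIN_SQL))
```

Both queries return the same order ids. The point is simply that with a decent optimizer you should not have to care which form you write.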