Fast MySQL InnoDB count. Really fast
Last night I was pairing with John on a feature for an upcoming release. I wanted to count the number of rows in a table so we could run analytics and track performance. I <3 metrics, ya know?
"Tim, this is simple. Why are you writing about this?"
And it is simple. If this were MyISAM. See, MyISAM always stores the number of rows on the table header. So, whenever we ask "how many rows are there?", it can just grab the count and return it. Not InnoDB.
In InnoDB (for internal reasons), the number of rows has to be counted. Every single time. One of the tables we were counting was over 1.2 million rows. On a small EC2 instance with no other queries/major processes, this takes 1 minute and 20 seconds. This is unacceptable.
We looked at another command that will estimate the number of rows, but according to MySQL, it can be off by up to 50%
Being off by up to 50% is unacceptable, even if it is fast. So we kept looking. Most of the recommendations were "make a table that stores the count, and add to/subtract from it every time you do a write/delete from the actual table". Also not helpful. We've got over a million rows, and this is a prototype that's only been online for a few weeks. I refuse to double the number of writes to our DB.
John points out that since our table is so huge, we've already got indices on it. Why not make the query use one of them?
Win. The trick is to hint it at a specific index. So, if you're getting poor response times from InnoDB count, it means you've got lots of rows that have to be counted one at a time. And since you've got a lot of rows, you have at least one index. Just pick an index that will never contain a NULL value, and tell MySQL to 'use index (index_name)'. Also, thanks to Wallace, who commented on the MySQL Performance Blog