Optimizing MySQL Queries with Spring’s JdbcTemplate
Last week I migrated Stack Hunter’s back-end from MongoDB to MySQL. (There was no problem MongoDB, it’s just easier for most users to connect to their existing MySQL database than install something new.) In the process I spent some time optimizing the new database queries and indexes to make things snappy. I started by loading up 100,000 exceptions into the database, then went to work.
Query Execution Plan
MySQL has a great tool for understanding your queries. Just prefix your SQL statement with “EXPLAIN” and it spits out a bunch of useful information.
EXPLAIN tells you which indexes MySQL considered for your query, which index it choose if any, whether it had to use a temporary table, and so on. You can get a complete list and description of the output fields in MySQL’s Reference Manual (https://dev.mysql.com/doc/refman/5.5/en/explain-output.html). You can also check out this old, but informative slide on the topic (http://www.slideshare.net/phpcodemonkey/mysql-explain-explained).
MySQL Workbench
One way to use EXPLAIN is by adding it to your queries in MySQL Workbench’s SQL editor. This approach works very well for running quick tests or if you have a small number of queries. In my case Stack Hunter has a couple dozen select statements implemented as JDBC prepared statements.
If I was going to use EXPLAIN, I’d have to copy the queries from my Java classes to the SQL editor, replace the question mark placeholders with literal values, then hope I didn’t screw anything up. As you can see, the workbench approach doesn’t work well for a large set of prepared statements. It’s just too manual and error prone. It’s also not something that can be reused easily with new queries or on future projects.
Instrumenting JdbcTemplate
Unless you’re doing something fairly specialized, no one should be using the JDBC APIs on their own today. At a minimum, you should be using a framework like Spring’s JdbcTemplate that handles all the boilerplate resource management code.
1 2 3 4 5 6 7 8 9 |
@Override public List<ExceptionEntry> getExceptions(long appId, int startDay, int endDay, int offset, int count) { return queryForList("SELECT * FROM sh_exception " + "WHERE app_id=? " + "AND created_on_day BETWEEN ? AND ? " + "LIMIT ?, ?", new ExceptionRowMapper(), appId, startDay, endDay, offset, count); } |
Since all of Stack Hunter’s SQL statements run through JdbcTemplate, that’s a natural place to call EXPLAIN on slow running queries. By wrapping JdbcOperations (JdbcTemplate’s interface) with a dynamic proxy, we’re able to intercept each query along with any arguments.
1 |
this.db = ExplainInvocationHandler.wrap(new JdbcTemplate(ds)); |
The wrapper allowed me to run the original query unchanged, then run it again with the “EXPLAIN ” prefix using the supplied arguments. Initially I ran EXPLAIN on all queries, but later I set it to only run for statements that took more than 10 milliseconds. This conditional execution is hard-coded, but you can make it configurable in your own app.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
@Override public Object invoke(Object proxy, Method method, Object[] args) throws Throwable { long start = System.currentTimeMillis(); try { return method.invoke(db, args); } catch (InvocationTargetException e) { throw e.getCause(); } catch (Throwable e) { throw e; } finally { try { long duration = System.currentTimeMillis() - start; if (duration >= EXPLAIN_THRESHOLD) { explain(method, args, duration); } } catch (Throwable e) { log.warn(e, e); } } } |
Explaining Queries
With the hooks in place, the explain method retrieves each query’s execution plan and logs it for later analysis. I prefer outputting tab-separated-values (TSV) since they can be pasted into Excel without any extra work.
Once I had the data in Excel it wasn’t hard to see where Stack Hunter needed indexes or improved queries. Fortunately, no structural changes were needed, but that could also have been an option.
Your database is going to be different, but keep any eye out for these basic optimization candidates.
Field | Value | Meaning |
---|---|---|
Key | Null | No appropriate index was found to speed up your query. |
Extra | Using temporary | A temporary table was created to hold your results. |
Extra | Using filesort | Sorting could not be done using the index alone. |
Joins are also a good place to improve performance. Coming from a NoSQL, document database, Stack Hunter’s queries were all simple, no-join selects. If you are using joins, you’ll want to run them all through EXPLAIN.
The Final Results
Outside of the text-based searches, all queries are now running in the tens of milliseconds or less on 100k exceptions. This is more than adequate for the downloadable version of Stack Hunter. Of course, I’ll need to test into the millions for the SaaS version, but that’s for another time.
Good guide, Dele. I have to admit that one of my main web projects definitely need MySQL optimization, I got a feeling that this will lead to huge speed improvement.
Thanks and glad I could help 🙂