The Best Way to Diagnose Performance
Problems
When it comes to overall performance,
it's important to remember that the #1 contributor is always a good database
design. The second highest contributor to performance is well-tuned SQL code.
Some try and switch the two in priority, but this is a mistake because a bad
design has the potential to simply drown even the best-tuned SQL code (e.g. you
can't get index access in your SQL queries if the physical index design is
poorly done). But make no mistake - bad SQL code that's introduced into an
otherwise well-running database can make a real train wreck of things.
So how do you go about analyzing
database performance? There are three forms of performance analysis that are
used to troubleshoot and tune database systems:
1.
Bottleneck analysis - focuses on
answering the questions: What is my database server waiting on; what is a user
connection waiting on; what is a piece of SQL code waiting on?
2.
Workload analysis - examines the server
and who is logged on to determine the resource usage and activity of each.
3. Ratio-based
analysis - utilizes a number of rule-of-thumb ratios to gauge performance of a
database, user connection, or piece of code.
Of the three, bottleneck analysis is the
most efficient approach in terms of fast problem resolution. By determining
where the server, a user connection, or set of SQL code is spending its time,
you can then work to eliminate the discovered blockages, increase throughput,
and reduce overall response times. Unfortunately, this hasn't always been easy
in MySQL because the server hasn't traditionally exposed the type of wait-based
and resource statistics to quickly uncover bottlenecks.
But a better day has come along in MySQL
5.0.37 with the inclusion of the SQL Profiling utility. This interactive
diagnostic aid allows a developer or DBA to gain insight into where a set of
SQL code spends its time and the resources it is using. The best way to see the
power of this new utility is to walk through some examples of how it's used and
see the clear benefits it supplies, so let's do just that.
Getting started with the SQL Profiler
The SQL Profiler is built into the
database server and can be dynamically enabled/disabled via the MySQL client
utility. To begin profiling one or more SQL queries, simply issue the following
command:
mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)
Two things happen once you issue this
command. First, any query you issue from this point on will be traced by the
server with various performance diagnostics being created and attached to each
distinct query. Second, a memory table named
Now,
simply execute a SQL query:profiling
is
created in the INFORMATION_SCHEMA database for your particular session (not
viewable by any other MySQL session) that stores all the SQL diagnostic
results. This table remains persistent until you disconnect from MySQL at which
point it is destroyed.mysql> select count(*) from client where broker_id=2;
+----------+
| count(*) |
+----------+
| 200 |
+----------+
1 row in set (0.00 sec)
Once the query completes, you can issue
the following command to view the SQL profiles that have currently been stored
for you:
mysql> show profiles;
+----------+------------+-----------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------------------------------+
| 0 | 0.00007300 | set profiling=1 |
| 1 | 0.00044700 | select count(*) from client where broker_id=2 |
+----------+------------+-----------------------------------------------+
2 rows in set (0.00 sec)
You get a quick summary of all your
captured SQL plus the total duration that the query took to complete. To get the
same diagnostic info, you can also query the memory table that holds your
statistical information:
mysql> select sum(duration) from information_schema.profiling where query_id=1;
+---------------+
| sum(duration) |
+---------------+
| 0.000447 |
+---------------+
1 row in set (0.00 sec)
I'll show you why it's good to be able
to query the memory-based profiling table later in this article, but for now,
let's concentrate on other SHOW commands that you can use to get more detailed
diagnostic info about one or more queries that you've profiled. The most basic
command is one that lists the steps a profiled query went through to satisfy
your SQL request, along with each step's time:
mysql> show profile for query 1;
+--------------------+------------+
| Status | Duration |
+--------------------+------------+
| (initialization) | 0.00006300 |
| Opening tables | 0.00001400 |
| System lock | 0.00000600 |
| Table lock | 0.00001000 |
| init | 0.00002200 |
| optimizing | 0.00001100 |
| statistics | 0.00009300 |
| preparing | 0.00001700 |
| executing | 0.00000700 |
| Sending data | 0.00016800 |
| end | 0.00000700 |
| query end | 0.00000500 |
| freeing items | 0.00001200 |
| closing tables | 0.00000800 |
| logging slow query | 0.00000400 |
+--------------------+------------+
15 rows in set (0.00 sec)
You can also just issue
SHOW
PROFILE
and exclude the identification of a specific profile number to see the very
last profile you created.
Regardless of how you obtain the output,
you can see this data is extremely valuable in that it allows you to see where
your query spends its time during execution. This is done for all types of
queries, and not just
SELECT
statements as
this example shows:mysql> alter table t engine=myisam;
Query OK, 112050 rows affected (0.64 sec)
Records: 112050 Duplicates: 0 Warnings: 0
mysql> show profiles;
+----------+------------+-----------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------------------------------+
| 0 | 0.00007300 | set profiling=1 |
| 1 | 0.00044700 | select count(*) from client where broker_id=2 |
| 2 | 0.00003400 | set profiling=0 |
| 3 | 0.00007400 | set profiling=1 |
| 4 | 0.63789700 | alter table t engine=myisam |
| 5 | 0.00004000 | set profiling=0 |
+----------+------------+-----------------------------------------------+
6 rows in set (0.00 sec)
mysql> show profile for query 4;
+----------------------+------------+
| Status | Duration |
+----------------------+------------+
| (initialization) | 0.00002900 |
| checking permissions | 0.00000800 |
| init | 0.00004000 |
| Opening table | 0.00009400 |
| System lock | 0.00000500 |
| Table lock | 0.00000700 |
| setup | 0.00004200 |
| creating table | 0.00195800 |
| After create | 0.00010900 |
| copy to tmp table | 0.52264500 |
| rename result table | 0.11289400 |
| end | 0.00004600 |
| query end | 0.00000700 |
| freeing items | 0.00001300 |
+----------------------+------------+
14 rows in set (0.00 sec)
So as you can see in the above profile,
the
ALTER
TABLE
statement spends the bulk of its time in the temporary table copy step. Armed
with this type of information, you have more insight into the hoops your query
is jumping through from start to finish, and therefore, you can then work to
tune your queries to help eliminate any identified bottlenecks.
There's more information than just
duration that you can get from your profiles - for example, CPU usage (which,
unfortunately, is not available on all platforms; the below comes from Linux):
mysql> show profile cpu for query 4;
+----------------------+------------+------------+------------+
| Status | Duration | CPU_user | CPU_system |
+----------------------+------------+------------+------------+
| (initialization) | 0.00002900 | 0.00000000 | 0.00000000 |
| checking permissions | 0.00000800 | 0.00000000 | 0.00000000 |
| init | 0.00004000 | 0.00000000 | 0.00000000 |
| Opening table | 0.00009400 | 0.00100000 | 0.00000000 |
| System lock | 0.00000500 | 0.00000000 | 0.00000000 |
| Table lock | 0.00000700 | 0.00000000 | 0.00000000 |
| setup | 0.00004200 | 0.00000000 | 0.00000000 |
| creating table | 0.00195800 | 0.00000000 | 0.00100000 |
| After create | 0.00010900 | 0.00000000 | 0.00000000 |
| copy to tmp table | 0.52264500 | 0.55591600 | 0.04199300 |
| rename result table | 0.11289400 | 0.00199900 | 0.00000000 |
| end | 0.00004600 | 0.00000000 | 0.00000000 |
| query end | 0.00000700 | 0.00000000 | 0.00000000 |
| freeing items | 0.00001300 | 0.00000000 | 0.00000000 |
+----------------------+------------+------------+------------+
14 rows in set (0.00 sec)
Other parameters that can be passed to the SHOW PROFILE
command include:
·
ALL - displays all information
·
BLOCK
IO - displays counts for block input and
output operations
·
CONTEXT
SWITCHES - displays counts for voluntary and
involuntary context switches
·
IPC - displays counts for messages sent and
received
·
MEMORY - is not currently implemented
·
PAGE
FAULTS - displays counts for major and minor page
faults
·
SOURCE - displays the names of functions from the
source code, together with the name and line number of the file in which the
function occurs
·
SWAPS - displays swap counts
Using the Profiler for Tuning Queries
Let's
take a look at a quick example of how the new profiler can be a help to tune an
inefficient query. I don't know about you, but I've always been just so-so at
being able to really use EXPLAIN
's to
troubleshoot queries. But the new profiler makes things pretty nice and easy.I have a MySQL database that's used to track investment activity for a small brokerage house. There's a view in the database that's used to get a quick report of client accounts over a million dollars:
mysql> select * from v_client_portfolio_high;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
| 5 | ABNER | ROSSELLETT | 1252115.50 |
| 500 | CANDICE | BARTLETT | 1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (3.73 sec)
Now
a query running under four seconds isn't necessarily a bad thing, but I wonder
if things could be made a little bit better. So let's profile the view and see
what we can discover:mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from v_client_portfolio_high;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
| 5 | ABNER | ROSSELLETT | 1252115.50 |
| 500 | CANDICE | BARTLETT | 1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (4.01 sec)
mysql> set profiling=0;
Query OK, 0 rows affected (0.00 sec)
mysql> show profiles;
+----------+------------+-----------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------------------------------+
| 0 | 0.00007300 | set profiling=1 |
| 1 | 0.00044700 | select count(*) from client where broker_id=2 |
| 2 | 0.00003400 | set profiling=0 |
| 3 | 0.00007400 | set profiling=1 |
| 4 | 0.63789700 | alter table t engine=myisam |
| 5 | 0.00004000 | set profiling=0 |
| 6 | 0.00007600 | set profiling=1 |
| 7 | 4.01965600 | select * from v_client_portfolio_high |
| 8 | 0.00003500 | set profiling=0 |
+----------+------------+-----------------------------------------------+
Now
at first, I'm tempted to issue a SHOW PROFILE
for
query 7, which is my view query, but instead let's see just how many lines of
diagnostic data the profiler has collected for me:mysql> select count(*) from information_schema.profiling where query_id=7;
+----------+
| count(*) |
+----------+
| 74734 |
+----------+
Hello! The profiler generated over
74,000 lines of output for me, which might be hard to wade through using the
normal
SHOW
PROFILE
command (even though a LIMIT option is provided). Instead, let's use a SQL
query against the profiling table to see what our query did:mysql> select min(seq) seq,state,count(*) numb_ops,
-> round(sum(duration),5) sum_dur, round(avg(duration),5) avg_dur,
-> round(sum(cpu_user),5) sum_cpu, round(avg(cpu_user),5) avg_cpu
-> from information_schema.profiling
-> where query_id = 7
-> group by state
-> order by seq;
+-------+----------------------+----------+---------+---------+---------+---------+
| seq | state | numb_ops | sum_dur | avg_dur | sum_cpu | avg_cpu |
+-------+----------------------+----------+---------+---------+---------+---------+
| 0 | (initialization) | 1 | 0.00004 | 0.00004 | 0.00000 | 0.00000 |
| 1 | Opening tables | 1 | 0.00023 | 0.00023 | 0.00000 | 0.00000 |
| 2 | System lock | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 3 | Table lock | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 4 | checking permissions | 1 | 0.00010 | 0.00010 | 0.00000 | 0.00000 |
| 5 | optimizing | 4 | 0.00004 | 0.00001 | 0.00000 | 0.00000 |
| 6 | statistics | 4 | 0.00007 | 0.00002 | 0.00100 | 0.00025 |
| 7 | preparing | 4 | 0.00005 | 0.00001 | 0.00000 | 0.00000 |
| 8 | Creating tmp table | 1 | 0.00003 | 0.00003 | 0.00000 | 0.00000 |
| 9 | executing | 37352 | 0.16631 | 0.00000 | 0.05899 | 0.00000 |
| 10 | Copying to tmp table | 1 | 0.00006 | 0.00006 | 0.00000 | 0.00000 |
| 15 | Sending data | 37353 | 3.85151 | 0.00010 | 3.72943 | 0.00010 |
| 74717 | Sorting result | 1 | 0.00112 | 0.00112 | 0.00100 | 0.00100 |
| 74719 | removing tmp table | 2 | 0.00003 | 0.00001 | 0.00000 | 0.00000 |
| 74721 | init | 1 | 0.00002 | 0.00002 | 0.00000 | 0.00000 |
| 74727 | end | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74728 | query end | 1 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
| 74729 | freeing items | 1 | 0.00002 | 0.00002 | 0.00000 | 0.00000 |
| 74730 | closing tables | 2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 74733 | logging slow query | 1 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
+-------+----------------------+----------+---------+---------+---------+---------+
Looking at the diagnostics above, I can
see that my view query spends the bulk of its time in the
Sending
data
step, with that step and the executing
step
experiencing over 37,000 operations. Why would this be the case? Let's take a look
at the underlying view definition and see what the actual query looks like:create view v_client_portfolio_high
(
client_id,
client_first_name,
client_last_name,
portfolio_value
)
as
select b.client_id,
b.client_first_name,
b.client_last_name,
(select sum(number_of_units * price)
from client_transaction c
where c.action = 'buy' and
c.client_id = b.client_id) -
(select sum(number_of_units * price)
from client_transaction d
where d.action = 'sell' and
d.client_id = b.client_id) portfolio_value
from client_transaction a,
client b
where a.client_id = b.client_id
group by b.client_id,
b.client_first_name,
b.client_last_name
having portfolio_value > 1000000
The view does some normal column grabs
from the underlying base tables, but then uses some inline views to calculate a
client's overall portfolio value. The inline views/subselects are obviously
what's causing the large number of data sends and executions. Let's eliminate
just one of the subselects by pulling it up into the main query and if that
makes a difference. We'll drop and recreate the view (with a little clean up on
NULL values that may affect the computed outcome), profile the new view's
performance, and then examine the diagnostic output:
mysql> drop view v_client_portfolio_high;
Query OK, 0 rows affected (0.00 sec)
mysql> create view v_client_portfolio_high
-> (
-> client_id,
-> client_first_name,
-> client_last_name,
-> portfolio_value
-> )
-> as
-> select b.client_id,
-> b.client_first_name,
-> b.client_last_name,
-> sum(number_of_units * price) -
-> case
-> (select sum(number_of_units * price)
-> from client_transaction d
-> where d.action = 'sell' and
-> d.client_id = b.client_id)
-> when NULL then 0
-> else
-> (select sum(number_of_units * price)
-> from client_transaction d
-> where d.action = 'sell' and
-> d.client_id = b.client_id)
-> end portfolio_value
-> from client_transaction a,
-> client b
-> where a.client_id = b.client_id and
-> action = 'buy'
-> group by b.client_id,
-> b.client_first_name,
-> b.client_last_name
-> having portfolio_value > 1000000;
Query OK, 0 rows affected (0.00 sec)
mysql> set profiling=1;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from v_client_portfolio_high;
+-----------+-------------------+------------------+-----------------+
| client_id | client_first_name | client_last_name | portfolio_value |
+-----------+-------------------+------------------+-----------------+
| 5 | ABNER | ROSSELLETT | 1252115.50 |
| 500 | CANDICE | BARTLETT | 1384877.50 |
+-----------+-------------------+------------------+-----------------+
2 rows in set (0.47 sec)
mysql> set profiling=0;
Query OK, 0 rows affected (0.00 sec)
mysql> show profiles;
+----------+------------+-----------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------------------------------+
| 0 | 0.00007300 | set profiling=1 |
| 1 | 0.00044700 | select count(*) from client where broker_id=2 |
| 2 | 0.00003400 | set profiling=0 |
| 3 | 0.00007400 | set profiling=1 |
| 4 | 0.63789700 | alter table t engine=myisam |
| 5 | 0.00004000 | set profiling=0 |
| 6 | 0.00007600 | set profiling=1 |
| 7 | 4.01965600 | select * from v_client_portfolio_high |
| 8 | 0.00003500 | set profiling=0 |
| 9 | 0.00007500 | set profiling=1 |
| 10 | 0.45292700 | select * from v_client_portfolio_high |
| 11 | 0.00003800 | set profiling=0 |
+----------+------------+-----------------------------------------------+
12 rows in set (0.00 sec)
mysql> select min(seq) seq,state,count(*) numb_ops,
-> round(sum(duration),5) sum_dur, round(avg(duration),5) avg_dur,
-> round(sum(cpu_user),5) sum_cpu, round(avg(cpu_user),5) avg_cpu
-> from information_schema.profiling
-> where query_id = 10
-> group by state
-> order by seq;
+-------+----------------------+----------+---------+---------+---------+---------+
| seq | state | numb_ops | sum_dur | avg_dur | sum_cpu | avg_cpu |
+-------+----------------------+----------+---------+---------+---------+---------+
| 0 | (initialization) | 1 | 0.00004 | 0.00004 | 0.00000 | 0.00000 |
| 1 | Opening tables | 1 | 0.00024 | 0.00024 | 0.00000 | 0.00000 |
| 2 | System lock | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 3 | Table lock | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 4 | checking permissions | 1 | 0.00012 | 0.00012 | 0.00100 | 0.00100 |
| 5 | optimizing | 4 | 0.00005 | 0.00001 | 0.00000 | 0.00000 |
| 6 | statistics | 4 | 0.00007 | 0.00002 | 0.00000 | 0.00000 |
| 7 | preparing | 4 | 0.00005 | 0.00001 | 0.00000 | 0.00000 |
| 8 | Creating tmp table | 1 | 0.00003 | 0.00003 | 0.00000 | 0.00000 |
| 9 | executing | 11194 | 0.04983 | 0.00000 | 0.01800 | 0.00000 |
| 10 | Copying to tmp table | 1 | 0.00008 | 0.00008 | 0.00000 | 0.00000 |
| 15 | Sending data | 11195 | 0.39853 | 0.00004 | 0.36794 | 0.00003 |
| 22401 | Sorting result | 1 | 0.00375 | 0.00375 | 0.00400 | 0.00400 |
| 22403 | removing tmp table | 2 | 0.00005 | 0.00002 | 0.00000 | 0.00000 |
| 22405 | init | 1 | 0.00002 | 0.00002 | 0.00000 | 0.00000 |
| 22411 | end | 1 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 22412 | query end | 1 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
| 22413 | freeing items | 1 | 0.00002 | 0.00002 | 0.00000 | 0.00000 |
| 22414 | closing tables | 2 | 0.00001 | 0.00001 | 0.00000 | 0.00000 |
| 22417 | logging slow query | 1 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
+-------+----------------------+----------+---------+---------+---------+---------+
20 rows in set (0.44 sec)
Much
better! Although more tuning on the view can likely be done, this simple change
has knocked the Sending data and the executing step counts down by over
two-thirds, with the end result being an 88% reduction in overall response
time. The profiler was very useful in this case as this particular query shows
that even though an individual step in the SQL execution process may be handled
very quickly (the average duration column in the above query), having that same
step executed over and over again can put a major dent in total query performance.
By default, you can store 15
different query profiles for your session. You can increase this up to 100 by
setting the
profiling_history_size
session variable. Certain diagnostics rely on operating system support for the getrusage()
system call, so you may see NULL values for some statistics if you're on a
platform that doesn't support this function.
No comments:
Post a Comment