Master of Query Performance
Add Query Analyzer, SQL Profiler, Performance Monitor and good query-writing
habits to your bag of tricks, and you may never have to worry about performance
complaints again.
- By Anil Desai
- March 01, 2001
Performance
optimization is an important part of any IT pro's career. It's simply
not enough that your servers remain available and your data is backed up.
Those are the prerequisites for a smoothly running environment. What's
just as important is understanding performance issues related to your
database servers. You should be able to pinpoint performance problems
and find useful solutions.
If you've
worked with SQL Server, you know that Microsoft has provided a great set
of administrative tools. However, it's generally up to you — the
database developer or administrator — to make the best of them. In
this article, I'll discuss the sometimes-mysterious art of troubleshooting
database-related performance problems.
The full
topic of database performance optimization can fill (and has filled) many
books. It's outside the scope of this article to dive into every aspect
of the topic. Instead, my goal is to provide a high-level overview of the
tools and techniques at your disposal for measuring and optimizing SQL
Server-related performance.
For further
information and step-by-step details for various sections of this article,
be sure to consult the extremely useful SQL Server 2000 Books Online (installed,
by default, with SQL Server 2000). This article is intended for those
who work with SQL Server and know the basics. I'll focus on SQL Server
2000, but the vast majority of information in this article will pertain
to SQL Server 7.0, as well.
Understanding Database Performance Issues
Microsoft's SQL Server, like all modern relational database systems,
uses several techniques to ensure that queries can be processed as quickly
as possible. There are two main goals: response time (generally, the amount
of time that it takes to return the first few records) and throughput
(the time it takes to return the entire result set).
The specific
requirements for real-world applications will vary. For example, if you're
supporting an online e-commerce site, it might be critical for users to
be able to view up-to-the-minute pricing and availability information
for products. Should they choose to buy something, the "checkout" process
must be quick. Contrast that scenario with a decision-support system.
Suppose your executive management team wants to compare numbers for Q1
and Q2 of this year. In that case, accuracy of information may be the
most important goal, and the extra time that it takes to generate the
requested data will be well spent.
The first
and, arguably, most important step in performance optimization is to understand
both your application and the business problems it's trying to solve.
You should know whether a 200-millisecond access time is reasonable and
how long your users are willing to wait to receive the information they
requested. Armed with this information, it's time to look at how you can
optimize the performance of your own applications. Let's start by discussing
the extremely important topic of SQL joins.
Optimizing SQL Joins
Although the simplest SQL queries might retrieve information from
only one table, most real-world queries combine information from multiple
tables to get what they need. For example, you might want a list of customers
(stored in the Customer table) and all of the orders they've placed in
the last two weeks (stored in the Orders table). The solution is to use
SQL joins — something any database developer is familiar with. However,
although joins are extremely important for getting the information your
users need, they can be the Achilles' heel of an otherwise well-designed
system.
When performing
queries that use multiple tables, you must always join all of the tables
to each other to preserve the relationships between them. If you don't,
you may get a result that's correct (or, worse yet, just seems correct)
but takes a long time to process. A typical mistake is to then hide the
duplicate rows with a SELECT DISTINCT clause, which masks the missing join
and makes the problem worse. To illustrate this, consider three tables:
- Customer
(350 rows)
- Employee
(50 rows)
- Order
(500 rows)
Suppose you run a query such as the following:
SELECT e.FirstName, e.LastName
FROM Customer c, Employee e, [Order] o
WHERE e.FirstName = 'Anil'
Because the tables aren't explicitly joined to each other, the result is
a Cartesian product. A Cartesian join (or "cross product") pairs every row
of each table with every row of every other table. In this case, the
intermediate result would contain 8,750,000 rows (350 x 50 x 500)! Add
another table with just 10 rows, and the product grows to more than 87
million rows. Processing that would soak up a lot of time and system
resources (RAM, CPU and disk space). (By the way, the brackets around
Order are required because ORDER is a reserved word in Transact-SQL.)
The correct way to write this would be:
SELECT e.FirstName, e.LastName
FROM Customer c, Employee e, [Order] o
WHERE e.FirstName = 'Anil'
AND c.ID = o.CustomerID
AND o.SalesPersonID = e.ID
Note that
all of the tables are joined together in this example. The join conditions
dramatically filter the result set during processing and allow SQL Server
to return the requested results quickly. In other words, users get their data
quickly, and SQL Server doesn't break a sweat!
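By the way, the same query can be written with SQL Server's explicit INNER
JOIN syntax. Because each ON clause sits right next to the table it joins,
a forgotten join condition is much harder to miss:
SELECT e.FirstName, e.LastName
FROM Customer c
INNER JOIN [Order] o ON o.CustomerID = c.ID
INNER JOIN Employee e ON e.ID = o.SalesPersonID
WHERE e.FirstName = 'Anil'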
Measuring Server, Database and Query Performance
Since performance monitoring and optimization are important concerns
in real-world applications, Microsoft has included several tools that
can be used to measure database performance. In this section, we'll cover
the most important ones included with SQL Server 2000. Keep in mind that
each method of performance monitoring focuses on a different aspect of
system performance (with some overlap). However, information from all
of the tools can be useful in pinpointing performance problems.
Measuring Server Statistics with Performance Monitor
An old favorite of experienced Windows NT/2000 systems administrators
is the Performance tool, which gives you a great overall collection of
information about how your existing server resources are being used. Points
of interest might include memory use, CPU use, network usage information
and, of course, statistics related to SQL Server.
Figure 1
shows an example of some standard SQL Server 2000 performance counters
and objects, many of which may be of interest to database administrators
and developers. For example, you might want to start by viewing overall
CPU and memory usage. Is the number of pages/second high? If so, perhaps
you're running some memory-intensive queries on the server. This is often
the case for large databases, which have queries that return large result
sets. Or, if you're seeing long, sustained CPU spikes, it might be time
to look at upgrading processors (or adding more of them, if your hardware
supports SMP).
Figure 1. Viewing server performance information in the Windows 2000 Performance tool.
Fortunately,
when you're adding a performance-related counter to the performance charts
and reports, you can click on the Explain button to get some additional
information about what you're measuring. And, if that's not enough, you
can find additional details within SQL Books Online. Now that you have
a method for collecting high-level performance data, let's drill down
one step.
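Incidentally, if you'd rather grab SQL Server's counters from a query
window than from the Performance tool, many of the same statistics are
exposed through the sysperfinfo system table in the master database. A
quick sketch (the exact counter names available depend on your
installation):
-- View SQL Server's performance counters directly from Transact-SQL
SELECT object_name, counter_name, cntr_value
FROM master.dbo.sysperfinfo
WHERE counter_name LIKE '%cache hit ratio%'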
Monitoring Queries with SQL Profiler
Regardless of how much planning and testing you perform, the best
test of any application is how it's used in the real world. The SQL Profiler
tool was designed to provide you with just that type of information. You
can monitor any or all queries that are executing on your database servers,
including dozens of parameters for each operation executed by SQL Server.
For example, you can measure how many CPU cycles and IO operations a specific
query required. Or, you can find out which user called a long-running
query. Furthermore, you can filter the results to those items of interest.
Perhaps you know that Monica in Accounting has a tendency to write some
extremely long-running ad-hoc queries. In that case, you can simply monitor
the queries that she's running to have some hard evidence of this!
SQL Profiler
was designed to be an easy and intuitive tool (both are good requirements
for any product that systems administrators are expected to use). To use
SQL Profiler, you begin by designing a trace definition, which specifies
exactly what you want to monitor. Figure 2 shows the options
at your disposal for storing collected information. Note that the data
can be stored to a trace file (generally faster, but more difficult to
analyze) or to a database table (which allows much more powerful analysis
through SQL queries, but requires more system resources to store information).
For ease of administration, you can develop trace templates for creating
new traces in the future. If you choose to save your trace information,
you'll be able to take advantage of SQL Server's Index Tuning Wizard (described
later in this article).
Figure 2. Viewing the General settings for a new trace file.
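As a sketch of the kind of analysis a trace table makes possible, suppose
you saved a trace that includes the TextData, Duration, CPU, Reads and
Writes columns to a hypothetical table named SlowQueryTrace. Ordinary
Transact-SQL then finds the worst offenders (Duration is reported in
milliseconds):
-- List the recorded queries that ran longer than 15 seconds, slowest first
SELECT TextData, Duration, CPU, Reads, Writes
FROM SlowQueryTrace
WHERE Duration > 15000
ORDER BY Duration DESC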
As you might
have already guessed, running Profiler does come at a cost. Although in
some cases the performance effects of logging queries might be negligible,
on larger systems the performance impact could be huge. Therefore, you
want to make sure to collect only the data of interest. For example, suppose
you want to isolate poorly performing queries. You can tell Profiler to
record information about queries that take longer than 15 seconds to complete.
Figure 3 shows the options you can enable to specify exactly what information
is recorded.
Figure 3. Choosing filters to restrict the types of events that are recorded.
Once you've
told Profiler what information you want to record, you're ready to begin
the trace. Figure 4 shows a sample of the types of information you might
choose to collect. In this case, I chose to record all queries that take
more than a second to execute. I'm saving the text of the queries, how
long they took to run and other server performance statistics for the
operation.
Figure 4. Recording query information using SQL Profiler.
SQL Server Profiler also lets you replay the trace data you've collected.
This can be useful in determining the effects of any changes you might
have made. A typical scenario might begin when a user complains of slow
performance. To isolate and troubleshoot the problem, you might decide
to create a trace of the specific queries being run. Then you go into
the system and make changes to some queries. To verify that overall
performance has improved, you can replay the operations you recorded and
measure how long they take on the server.
Thus far,
we've looked at server- and database-level monitoring. Now, let's drill down
to the level of the SQL query itself.
Optimizing Query Logic using SQL Query Analyzer
Any developer who has worked with SQL Server 2000 is familiar with
the Query Analyzer tool. Query Analyzer goes far beyond just color-coding
queries. It allows for complex analysis of the steps that SQL Server has
to go through when you ask it to run a SQL query. Figure 5 shows the various
options available for viewing query-related information.
Figure 5. Viewing the available options in the Query menu of Query Analyzer.
Of particular
interest is the option to Show Execution Plan. When this option is enabled,
Query Analyzer displays the steps SQL Server performed to complete each
query you run. Figure 6 provides an example for a somewhat complicated
query. Note that you can hover your mouse over any part of the diagram to
view detailed statistics about the operation that was performed.
Figure 6. Examining query execution plan information in Query Analyzer.
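If you prefer text to pictures, the same plan information is available
through a couple of SET options you can toggle right in Query Analyzer.
SET SHOWPLAN_TEXT ON returns the plan instead of executing the query,
while SET STATISTICS IO ON and SET STATISTICS TIME ON run the query and
report per-table reads and elapsed times. For example:
-- Show the execution plan as text instead of running the statement
-- (the SET SHOWPLAN_TEXT statement must be alone in its batch)
SET SHOWPLAN_TEXT ON
GO
SELECT e.FirstName, e.LastName
FROM Employee e
WHERE e.FirstName = 'Anil'
GO
SET SHOWPLAN_TEXT OFF
GO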
By using
the Query Analyzer, you can pinpoint what operations within a query might
be taking a lot of time and server resources. For example, you might have
a large table that's missing an index, causing SQL Server to perform a
table scan every time you join to it (trust me, that's generally a bad
thing). In this case, Query Analyzer might show that 45 percent of the
time it took to complete the query was spent performing the table scan.
You can right-click on the offending operation and choose "Manage Indexes."
A few well-placed mouse clicks, and you've instantly indexed the table!
Overall, Query Analyzer's additional features can save hours of headaches
(and customer support calls from users who are tired of getting up for
coffee every time they run a query).
Dealing with Caching
When optimizing queries, it's important to make sure your measurements are
reliable and repeatable. That means understanding caching.
When you're
testing queries, you might run something once and notice that it takes
a long time. However, you run it again, and things look much better. Or
you run the query on one system and it seems to be going great, but the
same query takes forever on another machine. One reason for this is the
way that SQL Server caches query information. Therefore, anytime you want
to get consistent results on query performance, use the following two
commands:
- DBCC DROPCLEANBUFFERS: Clears all cached data pages out of the buffer
pool, forcing subsequent queries to read their data from disk.
- DBCC FREEPROCCACHE: Clears the procedure cache, which may contain
execution plans compiled during previous runs of your queries.
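Putting those together, a timing run on a test server might look like the
following sketch (don't do this on a production box, since these commands
flush the caches for the entire server):
-- Flush cached data pages and compiled plans so the next run starts cold
CHECKPOINT -- write out dirty pages so they become clean and can be dropped
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
-- Time the query under test from a consistent starting point
SET STATISTICS TIME ON
SELECT e.FirstName, e.LastName
FROM Customer c, Employee e, [Order] o
WHERE e.FirstName = 'Anil'
AND c.ID = o.CustomerID
AND o.SalesPersonID = e.ID
SET STATISTICS TIME OFF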
Building Indexes with the Index Tuning Wizard
The efficient creation of optimal indexes is an art form that often
results in large paychecks for DBAs. The presence of optimal indexes can
dramatically improve query performance. However, it's possible to have
too much of a good thing. Indexing comes at a cost (disk space and the
resources required to build the index), so it's important that you choose
wisely. The trick becomes choosing what indexes are most worthwhile. In
the real world, this can be a huge undertaking for all but the smallest
applications. Fortunately, SQL Server's Index Tuning Wizard can dramatically
reduce the amount of effort it takes to find missing or desirable indexes.
The Index
Tuning Wizard can take a Profiler trace file as input and iteratively
execute the queries within it to determine the optimal indexing solution.
It can even consider the use of indexed views, SQL Server 2000 Enterprise
Edition's flavor of materialized views.
Wizard in action. You can access the Index Tuning Wizard from within Enterprise
Manager, Query Analyzer or SQL Profiler. It just doesn't get much easier
than this!
Figure 7. Using the Index Tuning Wizard to find index recommendations.
Figure 8. Viewing details about expected performance improvements based on suggestions from the Index Tuning Wizard.
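Under the covers, the wizard's recommendations boil down to ordinary
CREATE INDEX statements that you can review, script out and schedule. A
hypothetical recommendation for the earlier Customer/Order example (the
index name and column choice are mine, not the wizard's):
-- Support the Customer-to-Order join with a nonclustered index
CREATE NONCLUSTERED INDEX IX_Order_CustomerID
ON [Order] (CustomerID)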
Choosing the Right Tool
As you've just seen, each of the above tools is useful for measuring
a specific aspect of overall database server performance. For example,
you might use Performance Monitor to find out whether your server would
benefit from a hardware upgrade (additional RAM or more CPUs, for
example). Suppose you notice that load is high on one of your database
servers and the database server supports many databases. How can you find
out which databases are seeing the most load (so you can move the database
or target it for performance optimizations)? SQL Profiler is the right
tool for that job. You can choose to simply record a sample set of queries
during the day. You can store this information in a database and then
create your own SQL queries that pinpoint potential performance hotspots.
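As a sketch, assuming the trace includes the DatabaseID and Duration
columns and was saved to a hypothetical table named DayTrace, a query like
this ranks the databases by total recorded query time:
-- Rank databases by the total query time captured in the trace
SELECT d.name AS DatabaseName,
    COUNT(*) AS QueryCount,
    SUM(t.Duration) AS TotalDurationMs
FROM DayTrace t
INNER JOIN master.dbo.sysdatabases d ON d.dbid = t.DatabaseID
GROUP BY d.name
ORDER BY SUM(t.Duration) DESC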
Finally,
suppose you isolate the problem to a few misbehaving queries. Now, it's
time to put on your development hat and try to optimize performance at
the code level. Query Analyzer offers many different features that can
assist in this otherwise unpleasant task. It can provide you with the
types of information you need to modify the query for increased performance.
If you determine that the problem is index-related, you might want to
run it through the Index Tuning Wizard's wringer to see what shakes out.
Once you get proficient with these steps, you'll find that you can isolate,
measure and optimize performance issues very quickly (you're optimizing
your own performance, so to speak).
The Bottom Line
If you've been paying any attention at all, you're probably already
aware that SQL Server 2000 provides database administrators and developers
with a wide array of performance monitoring and optimization tools. Although
the tools are easy to use, it's up to you to understand how to make the
best of them. Add Query Analyzer, SQL Profiler, Performance Monitor and
good query-writing habits to your bag of tricks, and you may never have
to worry about performance complaints again. Good luck and good speed!