Troubleshooting Performance Problems in SQL Server 2008 :
Goals :
The primary goal of this paper is to provide a general methodology for diagnosing and troubleshooting SQL Server performance problems in common customer scenarios by using publicly available tools.
SQL Server 2008 has made great strides in supportability. New dynamic management views (DMVs) have been added, like sys.dm_os_memory_brokers, sys.dm_os_memory_nodes, and sys.dm_exec_procedure_stats. Existing DMVs such as sys._dm_os_sys_info, sys.dm_exec_requests, and sys.dm_exec_requests have been enriched with additional information. You can use DMVs and existing tools, like SQL Server Profiler and Performance Monitor, to collect performance related data for analysis.
The secondary goal of this paper is to introduce new troubleshooting tools and features in SQL Server 2008, including Extended Events and the data collector.
Methodology :
There can be many reasons for a slowdown in SQL Server. We use the following three key symptoms to start diagnosing problems:
* Resource bottlenecks: CPU, memory, and I/O bottlenecks are covered in this paper. We do not consider network issues. For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging that ultimately impacts performance.
* tempdb bottlenecks: Because there is only one tempdb for each SQL Server instance, it can be a performance and a disk space bottleneck. An application can overload tempdb through excessive DDL or DML operations and by taking too much space. This can cause unrelated applications running on the server to slow down or fail.
* A slow-running user query: The performance of an existing query might regress, or a new query might appear to be taking longer than expected. There can be many reasons for this. For example:
Resource Bottlenecks
Tools for Resolving
Resource Bottlenecks
General Troubleshooting Steps in Case of Memory
Errors:
Memory Errors
Goals :
The primary goal of this paper is to provide a general methodology for diagnosing and troubleshooting SQL Server performance problems in common customer scenarios by using publicly available tools.
SQL Server 2008 has made great strides in supportability. New dynamic management views (DMVs) have been added, like sys.dm_os_memory_brokers, sys.dm_os_memory_nodes, and sys.dm_exec_procedure_stats. Existing DMVs such as sys._dm_os_sys_info, sys.dm_exec_requests, and sys.dm_exec_requests have been enriched with additional information. You can use DMVs and existing tools, like SQL Server Profiler and Performance Monitor, to collect performance related data for analysis.
The secondary goal of this paper is to introduce new troubleshooting tools and features in SQL Server 2008, including Extended Events and the data collector.
Methodology :
There can be many reasons for a slowdown in SQL Server. We use the following three key symptoms to start diagnosing problems:
* Resource bottlenecks: CPU, memory, and I/O bottlenecks are covered in this paper. We do not consider network issues. For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging that ultimately impacts performance.
* tempdb bottlenecks: Because there is only one tempdb for each SQL Server instance, it can be a performance and a disk space bottleneck. An application can overload tempdb through excessive DDL or DML operations and by taking too much space. This can cause unrelated applications running on the server to slow down or fail.
* A slow-running user query: The performance of an existing query might regress, or a new query might appear to be taking longer than expected. There can be many reasons for this. For example:
o Changes
in statistical information can lead to a poor query plan for an existing query.
o Missing
indexes can force table scans and slow down the query.
o An
application can slow down due to blocking even if resource utilization is
normal.
o
Excessive blocking can be
due to poor application or schema design or the choice of an improper
isolation level for the transaction.
The causes of these symptoms are not necessarily independent
of each other. The poor choice of a query plan can tax system resources and
cause an overall slowdown of the workload. So, if a large table is missing a
useful index, or if the query optimizer decides not to use it, the query can
slow down; these conditions also put heavy pressure on the I/O subsystem to
read the unnecessary data pages and on the memory (buffer pool) to store these
pages in the cache. Similarly, excessive recompilation of a frequently-run
query can put pressure on the CPU.
New Performance Tools in SQL Server 2008
SQL Server 2008 introduced new features and tools that you can use to
monitor and troubleshoot performance problems. We’ll discuss two features:
Extended Events and the data collector.
Resource Bottlenecks
The next sections of this paper discuss CPU, memory, and I/O
subsystem resources and how these can become bottlenecks. (Network issues are
outside of the scope of this paper.) For each resource bottleneck, we describe
how to identify the problem and then iterate through the possible causes. For
example, a memory bottleneck can lead to excessive paging, which can ultimately
impact performance.
Before you can determine whether you have a resource bottleneck,
you need to know how resources are used under normal circumstances. You can use
the methods outlined in this paper to collect baseline information about the
use of the resource (at a time when you are not having performance problems).
You might find that the problem is a resource that is running
near capacity and that SQL Server cannot support the workload in its
current configuration. To address this issue, you may need to add more processing
power or memory, or you may need to increase the bandwidth of your I/O or
network channel. However, before you take that step, it is useful to understand
some common causes of resource bottlenecks. Some solutions, such as
reconfiguration, do not require the addition of more resources.
Tools for Resolving
Resource Bottlenecks
One or more of the following tools can be used to resolve a
particular resource bottleneck:
·
Performance
Monitor: This tool is available as part of the Windows® operating system. For
more information, see your Windows documentation.
·
SQL
Server Profiler: See SQL Server
Profiler in the Performance Tools
group in the SQL Server 2008
program group.
·
DBCC
commands:
·
DMVs:
·
Extended
Events:
·
Data
collector and the management data warehouse (MDW):
General Troubleshooting Steps in Case of Memory
Errors:
The following list outlines general steps that will help you
troubleshoot memory errors.
1. Verify
that the server is operating under external memory pressure. If external
pressure is present, try resolving it first, and then see whether the problem
or errors still exist.
2. Collect
and compare performance counters as outlined in the previous sections.
3. Verify
the memory configuration parameters (sp_configure),
min memory per query, min/max server memory, awe enabled, and the Lock Pages in Memory privilege. Watch
for unusual settings. Correct them as necessary.
4. Check
for any nondefault sp_configure
parameters that might indirectly affect the server.
5. Check
for internal memory pressures.
6. Observe
DBCC MEMORYSTATUS output and the way it changes when you see memory error
messages.
7. Check
the workload (number of concurrent sessions, currently executing queries).
Memory Errors
701 - There
is insufficient system memory in resource pool 'pool_name' to run this query.
Causes
This is generic out-of-memory error for the server. It indicates
a failed memory allocation. It can be due to a variety of reasons, including
hitting memory limits on the current workload. With increased memory
requirements for SQL Server 2008 and SQL Server 2005 and certain
configuration settings (such as the max
server memory option and Resource Governor configuration settings), users
are more likely to see this error. Usually the transaction that failed is not
the cause of this error. Check the out_of_memory_count
column of the sys.dm_resource_governor_resource_pools DMV. If
this count is localized to a particular resource pool, Resource Governor
configuration is the most likely reason.
Troubleshooting
Regardless of whether the error is consistent and repeatable
(that is, it stays in the same state) or random (that is, it appears at random
times with different states), you should investigate server memory distribution
during the time you see this error. When this error is present, it is possible
that the diagnostic queries will fail. When you see this error, the best place
to start investigation is the error log. It should contain output that looks
something like this.
2009-01-28 04:27:15.43 spid51 Failed allocate pages:
FAIL_PAGE_ALLOCATION 1
or
2009-01-28 04:27:15.43 spid51 Failed Virtual Allocate Bytes:
FAIL_VIRTUAL_RESERVE 65536
The possible failures are:
·
FAIL_PAGE_ALLOCATION followed by the number of
pages attempted to allocate.
·
FAIL_VIRTUAL_RESERVE followed by the number of
bytes attempted to reserve.
·
FAIL_VIRTUAL_COMMIT followed by the number of
bytes attempted to commit.
Usually the task that first encountered the out-of-memory error
is not the task that caused the condition. Most likely it is a cumulative
effect of multiple tasks running. For the very common case of a single page
allocation failure, your investigation should take the global picture into
account.
The next piece of information from error log is the memory status
output. Depending on the failure, you should look for single page, multipage,
virtual reserved or committed numbers for individual memory clerks. Identifying
the biggest memory consumers is key to proceeding with investigation. You may
find that the biggest consumers are of the following type:
·
MEMORYCLERK_* means that the server
configuration or workload requires so much memory to be allocated. The
offending workload can sometimes be identified just by the memory clerks, but
more often you will have to drill further into the memory objects associated
with the clerks in order to find out what causes such memory consumption.
·
CACHESTORE_*, USERSTORE_*, OBJECTSTORE_* are the
types of caches. Big consumption by a cache may mean the following:
o Memory
is allocated out of the cache but is not inserted yet as an entry that can be
evicted. This is very similar to the MEMORYCLERK case discussed earlier.
o All
cache entries are in use so they cannot be evicted. You can confirm this by
looking at the sys.dm_os_memory_cache_counters
DMV and comparing the entries_count
and entries_in_use_count columns.
o Most
cache entries are not in use. This case most likely indicates a bug in the
server.
·
MEMORYCLERK_SQLQERESERVATIONS shows how much
memory has been reserved by the query execution (QE) to run queries with
sorts/joins.
The memory status output in the error log also
shows which Resource Governor resource pool memory is exhausted. The memory
brokers for every pool show the memory distribution between stolen
(compilation), cached, and reserved (granted) memory. The numbers for the three
brokers correspond to the three bullet points in the previous list.
Unfortunately there is no way to find out how much memory is allocated for a
pool from a given clerk or memory object. The sys.dm_os_memory_cache_entries DMV is extended to show the pool_id each entry is associated with
Possible solutions include the following:
·
Remove external memory pressure.
·
Increase the max server memory setting, and then adjust the MIN_MEMORY_PERCENT
and MAX_MEMORY_PERCENT settings for the resource pool.
·
Free caches by using one of the following
commands: DBCC FREESYSTEMCACHE, DBCC FREESESSIONCACHE, or DBCC FREEPROCCACHE.
If the problem reappears, reduce the workload.
802 - There is insufficient memory available in the buffer
pool.
Causes
This error does not necessarily indicate an out-of-memory
condition. It might indicate that the buffer pool memory is used by someone
else. In SQL Server 2008 and SQL Server 2005, this error should be
relatively rare.
Troubleshooting
Use the general troubleshooting steps and recommendations
outlined for the 701 error.
8628 - A time out occurred while waiting to optimize the
query. Rerun the query.
Causes
This error indicates that a query compilation process failed
because it was unable to obtain the amount of memory required to complete the
process. As a query undergoes through the compilation process, which includes
parsing, algebraization, and optimization, its memory requirements may
increase. Thus the query competes for memory resources with other queries. If
the query exceeds a predefined time-out (which increases as the memory
consumption for the query increases) while waiting for resources, this error is
returned. The most likely reason for this is the presence of a number of large
query compilations on the server.
Troubleshooting
·
Follow general troubleshooting steps to see
whether the server memory consumption is affected in general.
·
Check the workload. Verify the amounts of memory
consumed by different components. (For more information, see Internal Physical Memory Pressure earlier
in this paper.)
·
Check the output of DBCC MEMORYSTATUS for the
number of waiters at each gateway (this information will tell you whether there
are other queries running that consume significant amounts of memory).
Small Gateway Value
------------------------------ --------------------
Configured Units 8
Available Units 8
Acquires 0
Waiters 0
Threshold Factor 250000
Threshold 250000
(6 row(s) affected)
Medium Gateway Value
------------------------------ --------------------
Configured Units 2
Available Units 2
Acquires 0
Waiters 0
Threshold Factor 12
(5 row(s) affected)
Big Gateway Value
------------------------------ --------------------
Configured Units 1
Available Units 1
Acquires 0
Waiters 0
Threshold Factor 8
·
Check for Resource Governor configuration.
·
Reduce workload if possible.
8645 - A timeout occurred while waiting for memory
resources to execute the query in resource pool 'pool_name' (pool_id).
Rerun the query.
Causes
This error indicates that many concurrent memory-intensive
queries are being executed on the server. Queries that use sorts (ORDER BY) and
joins can consume significant amounts of memory during execution. Query memory
requirements are significantly increased if a high degree of parallelism is
enabled or if a query operates on a partitioned table with nonaligned indexes.
A query that cannot access the memory resources it requires within the
predefined time-out (by default, the time-out is 25 times the estimated query
cost, the sp_configure ‘query wait’
amount if it is set, or the Resource Governor workload group setting request_memory_grant_timeout_sec)
receives this error. Usually, the query that receives the error is not the one
that is consuming the memory.
Troubleshooting
·
Follow general steps to assess server memory
condition.
·
Identify problematic queries: Check to see
whether a significant number of queries operate on partitioned tables, check to
see whether they use nonaligned indexes, and check to see whether there are
many queries involving joins and/or sorts.
·
Check the sp_configure
parameters degree of parallelism and
min memory per query. Try reducing
the degree of parallelism and verify that min
memory per query is not set to a high value. If it is set to a high value,
even small queries will acquire the specified amount of memory.
·
Find out whether queries are waiting on
RESOURCE_SEMAPHORE. For more information, see Blocking
later in this paper.
·
Check Resource Governor configuration.
8651 - Could not perform the operation because the
requested memory grant was not available in resource pool '%ls' (%ld). Rerun the query, reduce the query load, or
check resource governor configuration setting.
Causes
Causes in part are similar to the 8645 error; this may also be an
indication of generally low memory conditions on the server. A min memory per query option setting
that is too high can also generate this error.
Troubleshooting
·
Follow general memory error troubleshooting
steps.
·
Verify that the sp_configure min memory per query option setting is not too high.
·
Check Resource Governor configuration settings.
No comments:
Post a Comment