Introduction
When a
sort operation is too large to fit in memory, Oracle allocates space in a
temporary tablespace in order to write data off to disk. Temporary space is a
resource shared by multiple sessions on the database, and quotas cannot be set
to limit how much temporary space can be used by any individual user or
session. If a user submits a query with an incomplete WHERE clause, an enormous
Cartesian product may result. That is all it takes to fill the temporary
tablespace and possibly impact many other users on the system. When the
temporary tablespace fills, any statement that requires additional temporary
space will fail with an “ORA-1652: unable to extend temp segment” error.
In
this paper, we will first briefly review how Oracle manages sorting operations.
Next we’ll discuss how a database administrator can determine if any statements
on the database have failed because the temporary tablespace ran out of space.
We’ll also present two techniques a DBA can use to understand how space in the
temporary tablespace is being used and how users are being impacted by a full
temporary tablespace.
The first technique we’ll look at includes how to direct
Oracle to log every statement that fails for lack of temporary space.
The
second technique provides a set of queries a DBA can run at any time to capture
in real time how temporary space is currently being used on a per-session or
per-statement basis.
These techniques can help DBAs address chronic or
intermittent shortages of temporary space.
Oracle Sorting Basics
Many
different circumstances can cause Oracle to sort data. For example, Oracle
sorts data when creating an index and when processing most queries that include
an ORDER BY or GROUP BY clause. Oracle sessions begin sorting data in memory.
If the amount of data being sorted is small enough, the entire sort will be
completed in memory with no intermediate data written to disk. When Oracle
needs to store data in a global temporary table or build a hash table for a
hash join, Oracle also starts the operation in memory and completes the task
without writing to disk if the amount of data involved is small enough. While
populating a global temporary table or building a hash is not a sorting
operation, we will lump all of these activities together in this paper because
they are handled in a similar way by Oracle.
If an
operation uses up a threshold amount of memory, then Oracle breaks the
operation into smaller ones that can each be performed in memory. Partial
results are written to disk in a temporary tablespace. The threshold for how
much memory may be used by any one session is controlled by instance
parameters. If the workarea_size_policy parameter is set to AUTO, then the
pga_aggregate_target parameter indicates how much memory can be used
collectively by all sessions for activities such as sorting and hashing. Oracle
will automatically assess and decide how much of this memory any individual
session should be allowed to use. If the workarea_size_policy parameter is set
to MANUAL, then instance parameters such as sort_area_size, hash_area_size, and
bitmap_merge_area_size dictate how much memory each session can use for these
operations.
Each
database user has a temporary tablespace (or temporary tablespace group in
Oracle 10g) designated in their user definition. Whenever a sort operation
grows too large to be performed entirely in memory, Oracle will allocate space
in the temporary tablespace designated for the user performing the operation.
You can see a user’s temporary tablespace designation by querying the dba_users
view.
Temporary
segments in temporary tablespaces—which we will call “sort segments”—are owned
by the SYS user, not the database user performing a sort operation. There
typically is just one sort segment per temporary tablespace, because multiple
sessions can share space in one sort segment. Users do not need to have quota
on the temporary tablespace in order to perform sorts on disk. In fact, quotas
on temporary tablespaces are ignored by Oracle.
Temporary
tablespaces can only hold sort segments. Oracle’s internal behavior is
optimized for this fact. For example, writes to a sort segment do not generate
redo or undo. Also, allocations of sort segment blocks to a specific session do
not need to be recorded in the data dictionary or a file allocation bitmap.
Why? Because data in a temporary tablespace does not need to persist beyond the
life of the database session that created it.
One
SQL statement can cause multiple sort operations, and one database session can
have multiple SQL statements active at the same time—each potentially with
multiple sorts to disk. When the results of a sort to disk are no longer
needed, its blocks in the sort segment are marked as no longer in use and can
be allocated to another sort operation.
A sort
operation will fail if a sort to disk needs more disk space and there are 1.)
no unused blocks in the sort segment, and 2.) no space available in the
temporary tablespace for the sort segment to allocate an additional extent.
This will most likely cause the statement that prompted the sort to fail with
the Oracle error, “ORA-1652: unable to extend temp segment.” This error message
also gets logged in the alert log for the instance.
It is
important to note that not all ORA-1652 errors indicate temporary tablespace
issues. For example, moving a table to a different tablespace with the ALTER
TABLE…MOVE statement will cause an ORA-1652 error if the target tablespace does
not have enough space for the table.
Identifying SQL Statements that Fail Due to Lack of
Temporary Space
It is
helpful that Oracle logs ORA-1652 errors to the instance alert log as it
informs a database administrator that there is a space issue. The error message
includes the name of the tablespace in which the lack of space occurred, and a
DBA can use this information to determine if the problem is related to sort
segments in a temporary tablespace or if there is a different kind of space
allocation problem.
Unfortunately,
Oracle does not identify the text of the SQL statement that failed. Thus we are
informed that a problem has occurred but we are not given tools with which to
identify the cause of the problem nor measure the user impact of the statement
failure.
However,
Oracle does have a diagnostic event mechanism that can be used to give us more
information whenever an ORA-1652 error occurs by causing Oracle server
processes to write to a trace file. This trace file will contain a wealth of
information, including the exact text of the SQL statement that was being
processed at the time that the ORA-1652 error occurred. This diagnostic event
imposes very little overhead on the system, because Oracle only writes
information to the trace file when an ORA-1652 error occurs.
You
can set a diagnostic event for the ORA-1652 error in your individual database
session with the following statement:
ALTER SESSION SET EVENTS '1652 trace name
errorstack';
You
can set the diagnostic event instance-wide with the following statement:
ALTER SYSTEM SET EVENTS '1652 trace name
errorstack';
The
above statement will affect the current instance only and will not edit the
server parameter file. That is to say, if you stop and restart the instance,
the diagnostic event setting will no longer be active. I don’t recommend
setting this diagnostic event on a permanent basis, but if you want to edit
your server parameter file, you could use a statement like the following:
ALTER SYSTEM SET EVENT = '1652 trace name
errorstack' SCOPE = SPFILE;
You
can also set diagnostic events in another session (without affecting all
sessions instance-wide) by using the “oradebug event” command in SQL*Plus.
You
can deactivate the ORA-1652 diagnostic event or remove all diagnostic event
settings from the server parameter file with statements such as the following:
ALTER SESSION SET EVENTS '1652 trace name context
off';
ALTER SYSTEM SET EVENTS '1652 trace name context off';
ALTER SYSTEM RESET EVENT SCOPE = SPFILE SID = '*';
ALTER SYSTEM SET EVENTS '1652 trace name context off';
ALTER SYSTEM RESET EVENT SCOPE = SPFILE SID = '*';
If a
SQL statement fails due to lack of space in the temporary tablespace and the
ORA-1652 diagnostic event has been activated, then the Oracle server process
that encountered the error will write a trace file to the directory specified
by the user_dump_dest instance parameter. The entry in the instance alert log
that indicates an ORA-1652 error occurred will also indicate that a trace file
was written. An entry in the instance alert log will look like this:
Tue Jan 2
17:21:14 2007
Errors in file /u01/app/oracle/admin/rpkprod/udump/rpkprod_ora_10847.trc:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
Errors in file /u01/app/oracle/admin/rpkprod/udump/rpkprod_ora_10847.trc:
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
The
top portion of a sample trace file is as follows:
Oracle Database 10g Release 10.2.0.2.0 - 64bit
Production
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_2
System name: SunOS
Node name: rpk
Release: 5.8
Version: Generic_108528-27
Machine: sun4u
Instance name: rpkprod
Redo thread mounted by this instance: 1
Oracle process number: 18
Unix process pid: 10847, image: oracle@rpk (TNS V1-V3)
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_2
System name: SunOS
Node name: rpk
Release: 5.8
Version: Generic_108528-27
Machine: sun4u
Instance name: rpkprod
Redo thread mounted by this instance: 1
Oracle process number: 18
Unix process pid: 10847, image: oracle@rpk (TNS V1-V3)
*** ACTION NAME:() 2007-01-02
17:21:14.871
*** MODULE NAME:(SQL*Plus) 2007-01-02 17:21:14.871
*** SERVICE NAME:(SYS$USERS) 2007-01-02 17:21:14.871
*** SESSION ID:(130.13512) 2007-01-02 17:21:14.871
*** 2007-01-02 17:21:14.871
ksedmp: internal or fatal error
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
Current SQL statement for this session:
SELECT "A1"."INVOICE_ID", "A1"."INVOICE_NUMBER", "A1"."INVOICE_DAT
E", "A1"."CUSTOMER_ID", "A1"."CUSTOMER_NAME", "A1"."INVOICE_AMOUNT",
"A1"."PAYMENT_TERMS", "A1"."OPEN_STATUS", "A1"."GL_DATE", "A1"."ITE
M_COUNT", "A1"."PAYMENTS_TOTAL"
FROM "INVOICE_SUMMARY_VIEW" "A1"
ORDER BY "A1"."CUSTOMER_NAME", "A1"."INVOICE_NUMBER"
----- Call Stack Trace -----
*** MODULE NAME:(SQL*Plus) 2007-01-02 17:21:14.871
*** SERVICE NAME:(SYS$USERS) 2007-01-02 17:21:14.871
*** SESSION ID:(130.13512) 2007-01-02 17:21:14.871
*** 2007-01-02 17:21:14.871
ksedmp: internal or fatal error
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
Current SQL statement for this session:
SELECT "A1"."INVOICE_ID", "A1"."INVOICE_NUMBER", "A1"."INVOICE_DAT
E", "A1"."CUSTOMER_ID", "A1"."CUSTOMER_NAME", "A1"."INVOICE_AMOUNT",
"A1"."PAYMENT_TERMS", "A1"."OPEN_STATUS", "A1"."GL_DATE", "A1"."ITE
M_COUNT", "A1"."PAYMENTS_TOTAL"
FROM "INVOICE_SUMMARY_VIEW" "A1"
ORDER BY "A1"."CUSTOMER_NAME", "A1"."INVOICE_NUMBER"
----- Call Stack Trace -----
From
the trace file you can clearly see the full text of the SQL statement that
failed. You can also see when it failed along with attributes of the database
session such as module, action, and service name. It is important to note that
the statements captured in trace files with this method may not themselves be
the cause of space issues in the temporary tablespace. For example, one query
could run successfully and consume 99.9% of the temporary tablespace due to a
Cartesian product, while a second query fails when trying to allocate just a
small amount of sort space. The second query is the one that will get captured
in a trace file, while the first query is more likely to be the root cause of
the problem.
The
trace file will contain additional information, including a call stack trace
and a binary stack dump. This information is not likely to be useful, unless
perhaps you want to learn more about Oracle internals.
The
diagnostic event facility has been built into the Oracle database product for a
very long time, but it is not widely documented. Oracle Support’s position
appears to be that you should not use this facility unless directed to do so by
Oracle Support. There are certain widely-known diagnostic events such as 10046
for extended SQL trace and 10053 for tracing the cost-based optimizer, and
there are certain events that can alter Oracle’s behavior significantly. In
general, you absolutely should not try setting diagnostic events in a
production database unless you have a very good idea of what they do.
Although
I am not aware of an Oracle Support document that officially blesses setting
diagnostic event 1652 for identifying SQL statements that fail due to lack of
sort space, there are bulletins on Metalink that do show how to set events to
dump an error stack for basic Oracle errors. Metalink document 217274.1, for
example, shows how to set a diagnostic event for the ORA-942 (“table or view
does not exist”) error. We are doing the exact same thing here for the ORA-1652
error, and therefore it seems like a relatively safe thing to do.
Like
most debugging or diagnostic facilities, you should only use the ORA-1652
diagnostic event to the extent you really need to. If you regularly get
ORA-1652 errors in one batch job and you can add an ALTER SESSION statement to
the beginning of the batch job, then doing so would be preferable to setting
the diagnostic event at the instance-level. Typically there shouldn’t be a need
to set this diagnostic event at the instance level on a permanent basis or in
the server parameter file.
Monitoring Temporary Space Usage
Instead
of waiting for a temporary tablespace to fill and for statements to fail, you
can monitor temporary space usage in the database in real time. At any given
time, Oracle can tell you about all of the database’s temporary tablespaces,
sort space usage on a session basis, and sort space usage on a statement basis.
All of this information is available from v$ views, and the queries shown in
this section can be run by any database user with DBA privileges.
Temporary Segments
The
following query displays information about all sort segments in the database.
(As a reminder, we use the term “sort segment” to refer to a temporary segment
in a temporary tablespace.) Typically, Oracle will create a new sort segment
the very first time a sort to disk occurs in a new temporary tablespace. The
sort segment will grow as needed, but it will not shrink and will not go away
after all sorts to disk are completed. A database with one temporary tablespace
will typically have just one sort segment.
SELECT
A.tablespace_name tablespace, D.mb_total,
SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_used,
D.mb_total - SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_free
FROM v$sort_segment A,
(
SELECT B.name, C.block_size, SUM (C.bytes) / 1024 / 1024 mb_total
FROM v$tablespace B, v$tempfile C
WHERE B.ts#= C.ts#
GROUP BY B.name, C.block_size
) D
WHERE A.tablespace_name = D.name
GROUP by A.tablespace_name, D.mb_total;
SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_used,
D.mb_total - SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_free
FROM v$sort_segment A,
(
SELECT B.name, C.block_size, SUM (C.bytes) / 1024 / 1024 mb_total
FROM v$tablespace B, v$tempfile C
WHERE B.ts#= C.ts#
GROUP BY B.name, C.block_size
) D
WHERE A.tablespace_name = D.name
GROUP by A.tablespace_name, D.mb_total;
The
query displays for each sort segment in the database the tablespace the segment
resides in, the size of the tablespace, the amount of space within the sort
segment that is currently in use, and the amount of space available. Sample
output from this query is as follows:
TABLESPACE MB_TOTAL MB_USED
MB_FREE
------------------------------- ---------- ---------- ----------
TEMP 10000 9 9991
------------------------------- ---------- ---------- ----------
TEMP 10000 9 9991
This
example shows that there is one sort segment in a 10,000 Mb tablespace called
TEMP. Right now, 9 Mb of the sort segment is in use, leaving a total of 9,991
Mb available for additional sort operations. (Note that the available space may
consist of unused blocks within the sort segment, unallocated extents in the
TEMP tablespace, or a combination of the two.)
Sort Space Usage by Session
The
following query displays information about each database session that is using
space in a sort segment. Although one session may have many sort operations
active at once, this query summarizes the information by session. This query
will need slight modification to run on Oracle 8i databases, since the
dba_tablespaces view did not have a block_size column in Oracle 8i.
SELECT S.sid
|| ',' || S.serial# sid_serial, S.username, S.osuser, P.spid,
S.module,
S.program, SUM (T.blocks) * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
COUNT(*) sort_ops
FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P
WHERE T.session_addr = S.saddr
AND S.paddr = P.addr
AND T.tablespace = TBS.tablespace_name
GROUP BY S.sid, S.serial#, S.username, S.osuser, P.spid, S.module,
S.program, TBS.block_size, T.tablespace
ORDER BY sid_serial;
S.program, SUM (T.blocks) * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
COUNT(*) sort_ops
FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P
WHERE T.session_addr = S.saddr
AND S.paddr = P.addr
AND T.tablespace = TBS.tablespace_name
GROUP BY S.sid, S.serial#, S.username, S.osuser, P.spid, S.module,
S.program, TBS.block_size, T.tablespace
ORDER BY sid_serial;
The
query displays information about each database session that is using space in a
sort segment, along with the amount of sort space and the temporary tablespace
being used, and the number of sort operations in that session that are using
sort space. Sample output from this query is as follows:
SID_SERIAL USERNAME OSUSER SPID MODULE PROGRAM MB_USED TABLESPACE SORT_OPS
---------- -------- ------ ---- ------ --------- ------- ---------- --------
33,16998 RPK_APP rpk 3061 inv httpd@db1 9 TEMP 2
---------- -------- ------ ---- ------ --------- ------- ---------- --------
33,16998 RPK_APP rpk 3061 inv httpd@db1 9 TEMP 2
This
example shows that there is one database session using sort segment space.
Session 33 with serial number 16998 is connected to the database as the RPK_APP
user. The connection was initiated by the httpd@db1 process running under the
rpk operating system user, and the Oracle server process has operating system
process ID 3061. The application has identified itself to the database as
module “inv.” The session has two active sort operations that are using a total
of 9 Mb of sort segment space in the TEMP tablespace.
Sort Space Usage by Statement
The
following query displays information about each statement that is using space
in a sort segment. This query will need slight modification to run on Oracle 8i
databases, since the dba_tablespaces view did not have a block_size column in
Oracle 8i.
SELECT S.sid
|| ',' || S.serial# sid_serial, S.username,
T.blocks * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
T.sqladdr address, Q.hash_value, Q.sql_text
FROM v$sort_usage T, v$session S, v$sqlarea Q, dba_tablespaces TBS
WHERE T.session_addr = S.saddr
AND T.sqladdr = Q.address (+)
AND T.tablespace = TBS.tablespace_name
ORDER BY S.sid;
T.blocks * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
T.sqladdr address, Q.hash_value, Q.sql_text
FROM v$sort_usage T, v$session S, v$sqlarea Q, dba_tablespaces TBS
WHERE T.session_addr = S.saddr
AND T.sqladdr = Q.address (+)
AND T.tablespace = TBS.tablespace_name
ORDER BY S.sid;
The
query displays information about each statement using space in a sort segment,
including information about the database session that issued the statement and
the temporary tablespace and amount of sort space being used. Sample output
from this query is as follows:
SID_SERIAL USERNAME MB_USED TABLESPACE ADDRESS HASH_VALUE
---------- -------- ------- ---------- ---------------- ----------
SQL_TEXT
--------------------------------------------------------------------------------
33,16998 RPK_APP 8 TEMP 000000038865B058 3641290170
SELECT * FROM NOTIFY_MESSAGES NM WHERE NM.AWAITING_SENDING = 'y' AND NOT EXISTS
( SELECT 1 FROM NOTIFY_MESSAGE_GROUPS NMG WHERE NMG.MESSAGE_GROUP_ID = NM.MESSAG
E_GROUP_ID AND NMG.INCOMPLETE = 'y' ) ORDER BY NM.NOTIFY_MESSAGE_ID
---------- -------- ------- ---------- ---------------- ----------
SQL_TEXT
--------------------------------------------------------------------------------
33,16998 RPK_APP 8 TEMP 000000038865B058 3641290170
SELECT * FROM NOTIFY_MESSAGES NM WHERE NM.AWAITING_SENDING = 'y' AND NOT EXISTS
( SELECT 1 FROM NOTIFY_MESSAGE_GROUPS NMG WHERE NMG.MESSAGE_GROUP_ID = NM.MESSAG
E_GROUP_ID AND NMG.INCOMPLETE = 'y' ) ORDER BY NM.NOTIFY_MESSAGE_ID
33,16998 RPK_APP 1 TEMP 00000003839FFE20 1874671316
select * from rpk_stat where sample_group_id = :b1 order by stat#, seq#
select * from rpk_stat where sample_group_id = :b1 order by stat#, seq#
This
example shows that session 33 with serial number 16998, connected to the
database as the RPK_APP user, has two statements currently using sort segment
space in the TEMP tablespace. One statement is currently using 8 Mb of sort
segment space, while the other is using 1 Mb. The text of each statement, along
with its hash value and address in the shared SQL area are also displayed.
Conclusion
When
an operation such as a sort, hash, or global temporary table instantiation is
too large to fit in memory, Oracle allocates space in a temporary tablespace
for intermediate data to be written to disk. Temporary tablespaces are a shared
resource in the database, and you can’t set quotas to limit temporary space
used by one session or database user. If a sort operation runs out of space,
the statement initiating the sort will fail. It may only take one query missing
part of its WHERE clause to fill an entire temporary tablespace and cause many
users to encounter failure because the temporary tablespace is full.
It is
easy to detect when failures have occurred in the database due to a lack of
temporary space. With the setting of a simple diagnostic event, it is also easy
to see the exact text of each statement that fails for this reason. There are
also v$ views that DBAs can query at any time to monitor temporary tablespace
usage in real time. These views make it possible to identify usage at the
database, session, and even statement level.
Oracle
DBAs can use the techniques outlined in this paper to diagnose temporary
tablespace problems and monitor sorting activity in a proactive way. These
tactics can be helpful for addressing both chronic and intermittent shortages
of temporary space.
----------------All of the blogs are for my own reference only---------------------
No comments:
Post a Comment