Oracle Magazine Issue Archive
2011
September 2011
TECHNOLOGY: Support
Troubleshooting Internal ErrorsBy Tamzin Oscroft
A guide to assessing and resolving ORA-600 and ORA-7445 errors If you’re an Oracle DBA, you’re likely to have come across an error message in your Oracle Database alert.log files prefixed by either ORA-600 or ORA-7445, such as Thu Jan 20 13:35:52 2011 Errors in file /DATA/oracle/admin/ prod/udump/prod_ora_2131.trc: ORA-00600: internal error code, arguments: [ktfbtgex-7], [1015817], [1024], [1015816], [], [], [], [] This column explains what you can do to assess some ORA-600 or ORA-7445 errors and identify solutions. ORA-600 or ORA-7445: What Is the Difference?ORA-600 is a catchall message that indicates an error internal to the database code. The key point to note about an ORA-600 error is that it is signaled when a code check fails within the database. At points throughout the code, Oracle Database performs checks to confirm that the information being used in internal processing is healthy, that the variables being used are within a valid range, that changes are being made to a consistent structure, and that a change won’t put a structure into an unstable state. If a check fails, Oracle Database signals an ORA-600 error and, if necessary, terminates the operation to protect the health of the database. The first argument to the ORA-600 error message indicates the location in the code where the check is performed; in the example above, that is ktfbtgex-7 (which indicates that the error occurred at a particular point during tablespace handling). The subsequent arguments have different meanings, depending on the particular check. An ORA-7445 error, on the other hand, traps a notification the operating system has sent to a process and returns that notification to the user. Unlike the ORA-600 error, the ORA-7445 error is an unexpected failure rather than a handled failure. The Oracle function in which that notification signal is received is usually, from Oracle Database 10g onward, contained in the ORA-7445 error message itself. For example, in the error message ORA-07445: exception encountered: core dump [kocgor()+96] [SIGSEGV] [ADDR:0xF000000104] [PC:0x861B7EC] [Address not mapped to object] [] Both ORA-600 and ORA-7445 errors will
Often you will see multiple errors reported within the space of a few minutes, typically starting with an ORA-600. It is usually, but not always, the case that the first is the significant error and the others are side effects. Can I Resolve These Errors Myself?For some ORA-600 and ORA-7445 errors, you can either identify the cause and resolve the error on your own or find ways to avoid the error. The information provided in this section will help you resolve or work around some of the more common errors. ORA-600 [729]. The first argument to this ORA-600 error message, 729, indicates a memory-handling issue. The error message text will always include the words space leak, but the number after 729 will vary: ORA-00600: internal error code, arguments: [729], [800], [space leak], [], [], You cannot determine the cause of the space leak by checking your application code, because the error is internal to Oracle Database. You can, however, safely avoid the error by setting an event in the initialization file for your database in this form: event="10262 trace name context forever, level xxxx" or by executing the following: SQL>alter system set events '10262 trace name context forever, level xxxx' scope=spfile; Replace xxxx with a number greater than the value in the second set of brackets in the ORA-600 [729] error message. In the example above, you could set the number to 1000, in which case the event instructs the database to ignore all user space leaks that are smaller than 1000 bytes. ORA-600 [kddummy_blkchk]. The kddummy_blkchk argument indicates that checks on the physical structure of a block have failed. This error is reported with three additional arguments: the file number, the block number, and an internal code indicating the type of issue with that block. The following is an alert.log excerpt for an ORA-600 [kddummy_blkchk] error: Errors in file/u01/oracle/admin/ PROD/bdump/prod_ora_11345.trc: ORA-600: internal error code, arguments: [kddummy_blkchk], [2], [21940], [6110], [], [], [], [] The ORA-600 [kddummy_blkchk] error message usually reports a corruption, so to identify the object involved, first use the file number (&afn) and the block number (&bn) reported in the error message in the SQL query: select * from dba_extents where file_id=&afn and &bn between block_id and block_id + blocks -1; If the query returns a table, confirm the corruption by executing SQL>analyze table <tablename> validate structure; SQL>analyze index <indexname> validate structure; ORA 1498 "block check failure - see trace file" ORA-600 [6033]. This error is reported with no additional arguments, as shown in the following alert.log file excerpt: Errors in file/u01/oracle/admin/PROD/ bdump/prod_ora_2367574.trc: ORA-600: internal error code, arguments: [6033], [], [], [], [], [], [], [] There are two possible ways to identify the table on which the affected index is built:
------------------------------------------------------------------------------- |Id |Operation |Name |Rows |Bytes |Cost |Time | ------------------------------------------------------------------------------- |0 |SELECT STATEMENT | | | |883K | | |1 | WINDOW NOSORT | |2506K | 318M |883K |04:31:56| |2 | SORT GROUP BY | |2506K | 318M |883K |04:31:56| |3 | HASH JOIN RIGHT OUTER | |2506K | 318M |837K |03:20:05| |4 | VIEW |index$_join$_006| 8777 | 257K | 35 |00:00:01| |5 | HASH JOIN | | | | | | |6 | INDEX FAST FULL SCAN |XC179S1 | 8777 | 257K | 18 |00:00:01| |7 | INDEX FAST FULL SCAN |XC179P0 | 8777 | 257K | 25 |00:00:01| |8 | VIEW | |2506K | 245M |837K |03:20:04| |9 | HASH JOIN OUTER | |2506K | 296M |837K |03:20:04| |10 | HASH JOIN | |2506K | 184M |454K |02:49:32| |11 | TABLE ACCESS FULL |TESTTAB1 |2484K | 102M |361K |01:26:13| |12 | TABLE ACCESS FULL |TESTTAB2 | 21M | 688M | 49K |00:12:37| |13 | TABLE ACCESS FULL |TESTTAB3 | 94M |4326M |146K |00:35:51| For each of the tables used by the SQL statement that was executing at the time of the error, execute the following statement: SQL>analyze table <tablename> validate structure cascade; This will check to ensure that every value in the index is also in the table, and vice versa. If it finds a mismatch, it will report ORA-1499 table/Index Cross Reference Failure - see trace file The trace file will be in the location indicated by the user_dump_dest or diagnostic_dest initialization parameter and will contain information similar to row not found in index tsn: 8 rdba: 0x04d01348 You can then find the index with by using the query in Listing 2. Replace the &rdba and &tsn values in Listing 2 with the appropriate values. For this example, the &rdba value is the rdba from the trace file with the 0x portion removed and &tsn is the tablespace number (tsn) from the trace file. (&rdba in this case would be 04d01348, and &tsn would be 8.) Code Listing 2: Find the index
SELECT owner, segment_name, segment_type, partition_name
FROM DBA_SEGMENTS
WHERE header_file = (SELECT file#
FROM v$datafile
WHERE rfile# = dbms_utility.data_block_address_file(to_
number('&rdba','XXXXXXXX'))
AND ts#= &tsn)
AND header_block = dbms_utility.data_block_address_block(to_
number('&rdba','XXXXXXXX'));
Once you have identified the index, drop and re-create it. It is important to drop and re-create the index rather than rebuilding it online, because only re-creating it will reinitialize the values in the index. ORA-7445 [xxxxxx] [SIGBUS] [OBJECT SPECIFIC HARDWARE ERROR]. This ORA-7445 error can occur with many different functions (in place of xxxxxx). For example, the following alert.log excerpt shows the failing function as ksxmcln. /u01/app/oracle/admin/prod/bdump/ prod_smon_8201.trc: ORA-7445: exception encountered: core dump [ksxmcln()+0] [SIGBUS] [object specific hardware error] [6822760] [] [] The important part of this error is the ”object specific hardware error” argument, which indicates that there were insufficient operating system resources to complete the action. The most common resources involved are swap and memory. To diagnose the cause of an ORA-7445 error, you should first check the operating system error log; for example, in Linux this error log is /var/log/messages. Within the error log, look for information with the same time stamp as the ORA-7445 error (this will be in the alert.log next to the error message). You will often find an error message similar to Jun 9 19:005:05 PRODmach1 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 9632
Once you’ve identified the resource that affects the running of the statement, increase the amount of that resource available to Oracle Database. SummaryBy following the instructions in this article, you should be able to resolve some errors that are caused by underlying physical issues such as file corruption or insufficient swap space. But because ORA-600 and ORA-7445 errors are internal, many cannot be resolved by user-led troubleshooting. For those Oracle Database users with Oracle support contracts, however, additional knowledge content is available via My Oracle Support. Most notably, the ORA-600/ORA-7445 lookup tool [Knowledge Article 153788.1], shown in Figure 1, enables you to enter the first argument to an ORA-600 or ORA-7445 error message and use that information to identify known defects, workarounds, and other knowledge targeted specifically to that error/argument combination.
Figure 1: ORA-600/ORA-7445 lookup tool interface Tamzin Oscroft is a senior principal support engineer in Oracle Database Support. She has worked for Oracle Global Software Support for 16 years. |