Translate

Friday, March 14, 2008

RE-ENGINEERING YOUR ASM RAC ARCHITECTURE FOR COST SAVINGS

If you have been an ASM pioneer under Oracle10g, by now you have probably dealt with some of the performance tuning and performance optimization issues that a DBA can find in a production environment. For instance, if you are implemented an architecture to handle large objects, but chose an inappropriate block size for your cluster, set up a tiny sort area size, and unintentionally sized your shared pool, java pool, and large pool or have not established an overall Automatic Storage Memory Management (ASMM) sizing policy, you could be dealing now not only with uncontrolled temporary and undo segment management. Unfortunately, for the pioneers, this is extremely risky, especially when you chose to use the EXTERNAL REDUNDANCY option at a time when your database has reached a performance threshold that allows both of your database architecture and instance configuration to do its utmost for the last few days. So at this point your backup policy is on the edge of collapse. I say unfortunately, because the best and possibly the unique option a DBA may have is to use Oracle ASM SQL API. So you want to redesign your sorting strategy to deal with range index scanning, aggregation and other BI querying, and ordering, and now you need to plan for downtime to modify your SORT_AREA_SIZE parameter. So, your production database need to show your ASM disk groups as MOUNTED, but they just seem show to be listed as CONNECTED in your ASM views, such as V$ASM_DISK, V$ASM_DISK_STAT, V$ASM_DISKGROUP and so on. Your cluster database at great risk, so your Enterprise Manager Grid Control frequently reports a large number of related errors, and so does your other monitoring tool… But at this point fixing all these errors is significantly risky, since you are handling a very large number of gigabytes of temporary and undo segments in both types of tablespaces that go beyond the few hundreds altogether. This perhaps means that the undo auto-tuning column in your V$UNDOSTAT view, could show the equivalent of several days when in fact you do not use any FLASHBACK options and do not require such an amount of UNDO tablespace. In summary, you need to consider:

  • Moving from EXTERNAL to NORMAL redundancy, and persuade your IT manager to make the investment to mitigate the risk of a major production downtime.
  • Establishing an improved architecture to improve performance on queries requiring extensive aggregation, indexed range scanning, and sorting, by property sizing your sort area size, in particular, when LOBs are involved in the process.
  • That, by following the previous option, your new architecture can avoid massive usage of temporary segments, which translate in significant reduction in temporary tablespace, better usage of swap (virtual) memory, and subsequently improved automatic undo management, reduction in I/O and latch contention, among other benefits.
  • Migrating to Oracle11g better sooner than later.

If your temporary and undo tablespace keep growing exponentially, in spite of your one-quarter behind patching (no so bad, just one quarter behind), the ASM SQL API is useful. This is an example of you could do to build an improved architecture.

Start with your temporary segment and sorting strategy. For instance, use the following commands, namely:

ALTER SYSTEM SET SORT_AREA_SIZE = 10485760 DEFERRED;

CREATE TEMPORARY TABLESPACE TEMP2
TEMPFILE ‘+DGADN1’ SIZE 8192M
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 2M
SEGMENT SPACE MANAGEMENT AUTO;

ALTER DATABASE DEFAULT TEMPORARY TABLESPACE TEMP2;

After some time you could either destroy your old temporary tablespace for good at this time. You could play with this and the next command to maintain the name using an AUX_TEMP tablespace name to regain the original temporary tablespace name or use the RENAME TABLESPACE option, if applicable to your version.

DROP TABLESPACE old_temp INCLUDING CONTENTS AND DATAFILES;

Or you could attempt to create a temporary tablespace group, which is quite advisable in a cluster database environment.

Similarly, apply the following command:

CREATE UNDO TABLESPACE UNDOTBS2
DATAFILE ‘+DGADN1’ SIZE 8192M
EXTENT MANAGEMENT LOCAL
SEGMENT SPACE MANAGEMENT AUTO;

ALTER SYSTEM SET UNDO_TABLESPACE=UNDOTBS3;

Note that the undo tablespace is associated with the instance while the temporary tablespace is associated with the database itself, and it will be listed among the database properties in the DATABASE_PROPERTIES view, as the DEFAULT_TEMPORARY_TABLESPACE in conjunction with the default user’s tablespace.
You could repeat this step to create a new undo tablespace for each instance in your ASM cluster database, and recycle each one of the badly design undo pattern because even physical factor (e.g., how your physical segments are mapped to your tablespaces) causing unbalance access to those segments among different instances. Or you could try swapping the ownership of the existing segments seeking to balance the outcome at least.

If you are not using FLASHACK (Oracle11g, Oracle10g, or Oracle9) or RAT/Database Replay (Oracle 11g only) you need to wait for at least the equivalent of your undo retention and apply the following command. It is best if you wait for the time equivalent to the TUNED_UNDORETENTION column in the v$UNDOSTAT view, which is given in seconds.

DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;

If Automatic Undo Management is not working as desirable, start a pro-active planning of your database architecture re-engineering renewal rather than waiting for a possible database collapse in the short-term, one that could prove to be costly either whether business continuity is fully affected or not, simply making your environment exceedingly expensive from storage point of view.

For some DBAs, these options could be rather innovative and seldom used before, but they will definitely prevent further investment in unnecessary storage space.

Remember that, in spite of the unlimited functionality and usability provided by your Enterprise Management Grid Control interface, an interesting and challenging aspect of ASM technology is that scenarios as the one presented here entice the usage of Oracle’s SQL API as a unique option rather than any Operating System capability, which in most instances is rather useless.

N.B. The opinions, comments, and experience presented in this article are solely those of the author.

No comments: