
Wednesday, March 23, 2011

Performance Tuning for the Storage Layer

Intelligent Storage for High-Performance Computing (HPC)


The following is an excerpt of my intelligent storage paper using ASM technologies, involving ASM Cluster File System (ACFS), ASM Dynamic Volume Manager (ADVM), and Intelligent Data Placement (IDP).

The article proposes an array of recommendations and practical strategies to optimize server performance from the storage layer up. The discussion focuses on the usage of traditional hard disk drives (HDDs) and solid state drives (SSDs).


Implementation on Traditional HDD Disk Arrays


This is done by creating a custom template, applying it to the appropriate ASM data file, and subsequently determining its relevance to logical objects. The SQL API is shown in the example below:

ALTER DISKGROUP adndata1 ADD TEMPLATE datafile_hot
ATTRIBUTE ( HOT MIRRORHOT);

ALTER DISKGROUP adndata1 MODIFY FILE '+DATA/ADN3/DATAFILE/tools.256.765689507'
ATTRIBUTE ( HOT MIRRORHOT);
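
A tablespace can then be created directly on a data file that picks up the hot template by referencing the template in the file specification. The following is a minimal sketch; the tablespace name and size are assumptions:

CREATE TABLESPACE hot_ts DATAFILE '+ADNDATA1(datafile_hot)' SIZE 10G;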

Dual Pairing Drive (Hybrid) Solution

This solution seeks the following goals:

SSD for performance

Whenever SSDs become part of the storage performance tuning strategy, it is necessary to plan for more consistent space management, meaning that overprovisioning is a critical requirement to attain optimal storage elasticity. This is primarily due to the asymmetry of read/write speeds on this technology. Depending on the disk controller used by the vendor, an SSD can exhibit a normalized read/write ratio of roughly 10:1 in experimental tests conducted by various vendors, where read operations represent about 65% of the entire transactional workload. Overprovisioning can also reduce the need for manual rebalancing tasks, such as moving an ASM file from one disk group to another via PL/SQL packages or RMAN, which could disrupt production activity. Similarly, overprovisioning can reduce the need to move logical objects across the database to attain better performance.

Thus, both DRAM SSDs and NAND SSDs offer much better cost per IOPS than high-performance hard disks (HDDs). In addition, Flash SSDs are very well suited to read-only and mobility applications, and they are highly cost-effective in industries such as media, where UFS (Universal File System) is widely utilized as the default storage format.

In turn, ASM can reciprocally help SSD technology by ensuring that writes are spread evenly across the storage space over time. Likewise, to facilitate wear leveling, SSD controllers implement a virtual-to-physical mapping of Logical Block Addressing (LBA) to Physical Block Addressing (PBA) on the Flash media, along with sophisticated error correction codes.

HDD for capacity

Traditional hard disk drives (HDDs) can be used to implement low-cost, high-capacity volumes, since their cost per gigabyte is significantly better than that of solid state drives.

Hybrid Pools: Implementation on HHDD

In this approach, disk groups are transparently built on a mix of these technologies.

In this scenario, it is important to establish the following considerations:

Establish the hybrid ASM preferred read failure groups:

SQL> ALTER SYSTEM SET ASM_PREFERRED_READ_FAILURE_GROUPS= 'HYBRID.HHDD';

Decide how to best use the SSD/Flash technology for the speed and performance of the database online redo logs, and set the logs on pure SSD drives. In doing so, the Oracle DBA architect might need to create at least one disk group, or a set of disk groups, solely on SSD drives.
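
A minimal sketch of this approach is shown below; the disk group name, device paths, redundancy level, and log file size are assumptions:

CREATE DISKGROUP ssdredo EXTERNAL REDUNDANCY
DISK '/dev/ssd1', '/dev/ssd2';

ALTER DATABASE ADD LOGFILE GROUP 4 ('+SSDREDO') SIZE 512M;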

Consistently, tables can be associated with tablespaces built on ASM data files that have already been assigned mirror hot or mirror cold templates. These maintenance activities can occur as a baseline, for instance, when the database is created (first loaded), migrated, or upgraded, and systematically whenever a relevant event occurs. For instance, in a partitioned table that benefits from partition pruning, an interval or time-range boundary will make a different or new partition the most active segment, which in turn may need to be placed in a mirror hot sector for optimal performance. In general, business algorithms can be customized, implemented, and scheduled for periodic ASM storage maintenance in order to attain optimal performance continuously.
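
Conversely, when a segment cools down, the corresponding ASM data file can be demoted by applying cold placement attributes; the file name below is an assumption:

ALTER DISKGROUP adndata1 MODIFY FILE '+DATA/ADN3/DATAFILE/users.260.765689600'
ATTRIBUTE ( COLD MIRRORCOLD);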

Introducing Best Practices for Oracle ASM IDP and Dynamic Storage Tuning

The models presented here can easily be enabled in conjunction with practical industry cloud storage implementations, which apply to both the Open Cloud Computing Interface (OCCI) and Cloud Data Management Interface (CDMI) standards. For Oracle practitioners, it is important to use and apply existing environment knowledge, such as that derived from Real Application Testing (RAT), prior to the actual implementation.


The Hot-or-Cached Storage Optimization Model (HoCSOM)

In this model, the purpose is to explicitly customize access and optimize performance for specific database objects, in particular tables, indexes, materialized views, and all types of object partitions (table, index, or materialized view partitions) that have critical current relevance directly impacting performance tuning.

The following rules are reasonably practical for attaining adaptive optimal tuning:

Classify the set of database objects into at least three hierarchical tiers based on segment size and segment utilization levels (see the classification sketch after this list).

Implement operating procedures to attain smart database object allocation in the appropriate physical and logical space, from a dynamic Oracle Automatic Storage Management perspective, as follows:

Tier 1 should include a small number of objects, in comparison to the total database object population, which require special storage handling due to their size and activity.

Tier 2 will include tables of average size and usage, large tables with low levels of usage, and small tables with average activity.

Tier 3 will include smaller and generally inactive tables.
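
As a starting point for this classification, segments can be grouped by size from the data dictionary; the size thresholds and schema name below are illustrative assumptions, and in practice segment utilization metrics (e.g., from AWR or segment statistics) would refine the tiers:

SELECT owner, segment_name, segment_type,
       ROUND(bytes/1024/1024) AS size_mb,
       CASE
         WHEN bytes >= 10*1024*1024*1024 THEN 'TIER 1'
         WHEN bytes >=  1*1024*1024*1024 THEN 'TIER 2'
         ELSE 'TIER 3'
       END AS candidate_tier
FROM   dba_segments
WHERE  owner = 'ADN'
ORDER  BY bytes DESC;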

Tier-Based Storage Optimization Model (TBSOM)

This model utilizes a combination of SSD (NAND/Flash) and traditional HDD disk arrays in order to implement the relevant object-storage devices (OSD), i.e., Oracle ASM disks and disk groups.

The Flash capability serves partly as a way to overprovision the SSD technology when block updates become expensive, as they require that the entire block first be erased and then rewritten in full. The Flash acts as a mechanism that buffers transactions to avoid SSD latency. In some scenarios, reducing the usable size of the SSD presented to ASM is useful both for ASM failgroup mirroring and for overprovisioning, with the purpose of diminishing the latency caused by writing to SSD devices, in particular for updates, deletes, and inserts on used blocks. It is also important to remember that in most SSD technologies the hot sectors of the drives or volumes are automatically cached; in Oracle technology, the corresponding data would also be present in the buffer cache or the shared pool.

This model requires setting ASM_PREFERRED_READ_FAILURE_GROUPS accordingly, but it does not require the implementation and application of customized MIRRORHOT templates.
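
In a clustered configuration, for example, each ASM instance can be pointed at its own (e.g., SSD-based) failure group; the disk group, failure group, and instance names below are assumptions:

ALTER SYSTEM SET ASM_PREFERRED_READ_FAILURE_GROUPS = 'DATA1.SSDFG1' SID = '+ASM1';
ALTER SYSTEM SET ASM_PREFERRED_READ_FAILURE_GROUPS = 'DATA1.SSDFG2' SID = '+ASM2';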

Combining both Models: Hybrid Tuning Storage Optimization Model (HTSOM)

The two models can be combined, since in general there is no constraint preventing it. However, this should be accomplished following the recommendations of Oracle Sun or the relevant third-party vendor.
This model will require both the setting of the ASM_PREFERRED_READ_FAILURE_GROUPS and the implementation and application of customized MIRRORHOT templates, as needed for optimized database object allocation and storage performance. When using the combined model, the goal is to set the MIRRORHOT data files on the ASM Preferred Read Failure Groups to implement an optimal storage configuration.

Scenarios and Strategies

The following scenarios are applicable to OLTP and data warehousing databases, as well as hybrid databases.

Working with Different Disk Array Pool Types

When this scenario occurs, it is important to establish the ASM preferred read disk groups, which provide a transparent way to establish storage hierarchies.

Working with Large Logical Database Objects

The models presented here allow for tuning the physical model at the storage level in a highly customized fashion, enabling optimal allocation. Thus, the Hot-or-Cached Storage Optimization Model allows a large logical database object to have the best residence for optimal performance.

Working with Oracle Partitioning

When using partitioning technology, the following operations are possible or likely to occur (see the sketch after this list):

Creating a new partition in a tablespace that uses a MIRRORHOT ASM data file.

Moving a partition to a hot tablespace, i.e., onto an ASM data file previously associated with a MIRRORHOT template.

Exchanging a partition, when it becomes current, so that it is associated with the appropriate MIRRORHOT tablespace and ASM data file.
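
A minimal sketch of these operations is shown below; the table, partition, and tablespace names are assumptions, hot_ts is presumed to reside on a MIRRORHOT data file, and sales_stage is presumed to have been created in that same tablespace:

ALTER TABLE sales ADD PARTITION sales_2011_q2
VALUES LESS THAN (TO_DATE('01-07-2011','DD-MM-YYYY'))
TABLESPACE hot_ts;

ALTER TABLE sales MOVE PARTITION sales_2011_q1 TABLESPACE hot_ts;

ALTER TABLE sales EXCHANGE PARTITION sales_2011_q2
WITH TABLE sales_stage INCLUDING INDEXES;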

Working with RMAN Backup and Recovery

RMAN can serve to manage ASM data files and move them between disk groups, as needed, to place these files onto Oracle ASM Preferred Read Disk Groups or onto a MIRRORHOT ASM data file.

Using Oracle ASM ACFS and ADVM

The usage of these technologies should be enabled in order to apply the Intelligent Data Placement (IDP) API and attain optimal storage performance tuning. This is done via the disk group compatibility attributes COMPATIBLE.RDBMS, COMPATIBLE.ASM, and COMPATIBLE.ADVM. By setting these attributes to 11.2 or later, the Oracle ASM Intelligent Data Placement API becomes fully available for dynamic usage, either manually or in an automated fashion. The compatibility can be explicitly set when creating or altering a disk group via the SQL*Plus command line interface. Consequently, ASM volumes built on specific disk groups will inherit the settings and properties established for those disk groups, and the principles presented can be extended and applied at the ACFS and ADVM levels.

CREATE DISKGROUP DATA1
NORMAL REDUNDANCY
FAILGROUP ADNFG1
DISK '/dev/sda1', '/dev/sda2','/dev/sda3', '/dev/sda4'
FAILGROUP ADNFG2
DISK '/dev/sdb1', '/dev/sdb2','/dev/sdb3', '/dev/sdb4'
ATTRIBUTE 'compatible.rdbms' = '11.2',
'compatible.asm' = '11.2';

ALTER DISKGROUP DATA1 SET ATTRIBUTE 'compatible.asm' = '11.2';


The compatible attributes can only be advanced to a more recent version.
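
Once compatibility is in place (COMPATIBLE.ADVM would also need to be 11.2 or later), an ADVM volume can be added on the disk group and subsequently formatted as an ACFS file system; the volume name and size below are assumptions:

ALTER DISKGROUP DATA1 SET ATTRIBUTE 'compatible.advm' = '11.2';

ALTER DISKGROUP DATA1 ADD VOLUME adnvol1 SIZE 100G;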

Maintenance Operations

Maintenance operations on ASM IDP for storage optimization can either maximize or optimize performance through the usage of the IDP API. For instance, activating a new large table partition by placing the newly created partition on an Oracle ASM data file residing on a hot sector will have a significantly positive performance tuning effect on the physical model and on the top active SQL statements.

Heating Up Database Logical Objects

Database logical objects can be heated up at creation time by specifying the appropriate hot tablespace, but they can also be altered and moved to the appropriate tablespace using a MIRRORHOT ASM data file. Likewise, this can happen when using object partitions at creation, exchange, or move time.
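
For example, an object can be created in, or moved to, a hot tablespace; the object and tablespace names below are assumptions, with hot_ts presumed to be backed by a MIRRORHOT ASM data file:

CREATE TABLE orders_hot (order_id NUMBER, order_date DATE) TABLESPACE hot_ts;

ALTER TABLE legacy_orders MOVE TABLESPACE hot_ts;

ALTER INDEX legacy_orders_pk REBUILD TABLESPACE hot_ts;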

Changing Logical Objects Tiers

This can happen, for instance, when the utilization level of a highly active table or index partition changes due to an event, such as the start of a new fiscal year, which implies that the current partition will no longer be used as often and that the next current partition will gain higher relevance. Therefore, an object partition, such as a table partition, a local or global index partition, or a materialized view partition, could be downgraded to a lower performance tier when it is affected by such an event. This takes advantage of partition pruning so that only the hot sectors are used when appropriate.

Moving ASM Data Files

Moving ASM data files across ASM disk groups is usually a costly task and is normally not recommended unless strictly necessary. However, it can be accomplished using the ASMCMD command-line utility, the RMAN COPY DATAFILE command, or PL/SQL, utilizing the DBMS_FILE_TRANSFER.COPY_FILE and DBMS_FILE_TRANSFER.PUT_FILE procedures or the DBMS_DISKGROUP package.
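
A minimal PL/SQL sketch using DBMS_FILE_TRANSFER.COPY_FILE is shown below; the directory object names are assumptions, and the data file must be offline or read-only while it is being copied:

CREATE DIRECTORY src_dg AS '+DATA/ADN3/DATAFILE';
CREATE DIRECTORY dst_dg AS '+DATA1/ADN7/DATAFILE';

BEGIN
  DBMS_FILE_TRANSFER.COPY_FILE(
    source_directory_object      => 'SRC_DG',
    source_file_name             => 'tools_01.dbf',
    destination_directory_object => 'DST_DG',
    destination_file_name        => 'tools_01.dbf');
END;
/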



First, take the data file offline:

RMAN> SQL 'ALTER DATABASE DATAFILE 7 OFFLINE';

Then, for Oracle Managed Files (OMF), the DBA can copy the data file with a command similar to the one shown below:

RMAN> COPY DATAFILE
'+DATA/ADN3/DATAFILE/tools.256.765689507'
TO '+DATA/ADN7';


For non-Oracle managed files, the DBA must include both the directory and file name when declaring the target in the COPY DATAFILE clause:


RMAN> COPY DATAFILE
'+DATA/ADN3/DATAFILE/tools_01.dbf'
TO '+DATA1/ADN7/DATAFILE/tools_01.dbf';


Regardless of the method used to move the file, the DBA will need to rename it accordingly, as shown below:

Using RMAN
RMAN> SWITCH DATAFILE
'+DATA/ADN3/DATAFILE/tools.256.765689507'
TO COPY;

Using SQL

SQL> ALTER DATABASE RENAME FILE
2 '+DATA/ADN3/DATAFILE/tools.256.765689507'
3 TO '+DATA1/ADN7/DATAFILE/tools.264.765695815';


When applicable, perform the appropriate recovery:

Recover the new ASM database file:

SQL> RECOVER DATAFILE '+DATA1/ADN7/DATAFILE/tools.264.765695815';
Bring the new ASM database file online:


SQL> ALTER DATABASE DATAFILE '+DATA1/ADN7/DATAFILE/tools.264.765695815' ONLINE;

Delete the old ASM database file from its original location

SQL> ALTER DISKGROUP DATA DROP FILE '+DATA/ADN3/DATAFILE/tools.256.765689507';

Overprovisioning SSD Drives

Overprovisioning is normally already built into the SSD technology to compensate for the lack of symmetry between reads and writes. Statistics show that under a typical workload of 65% reads and 35% writes, there is a trend toward a 10:1 read/write performance ratio, meaning that reads are about 10 times faster than writes. In many database scenarios, the latency is caused because a block involved in a transaction must be completely erased before it can be rewritten. Performance on an SSD drive varies based on the drive's level of utilization: it will be excellent Fresh Out of Box (FOB), it will decay into a transient state (Transition) depending on the manufacturer, and it will attain a Steady State with maturity. Therefore, when building pools of dual drives, i.e., hybrid arrays, the architect should follow the vendor's specific recommendations and not assume that the pros and cons apply in exactly the same way for all vendors.

Overprovisioning the drive means that it will have a cache allowing the buffering (holding of old blocks and writing of new blocks) needed to minimize SSD latency. This cache can be implemented in Flash technology or using the same NAND semiconductor technology as the SSD itself.

Overprovisioning on Hybrid Array Pools and SSD Pools

Overprovisioning is usually accomplished by adding a Flash cache to the SSD pool to overcome the latency of rewriting a block, which is much slower than reading it. This cache is usually part of the controller cache. In these architectures, flash memory is used by the controller to assist in garbage collection and to alleviate some SSD write-related latency issues.

Overprovisioning on HHDD (Flash and HDD)


In this scenario, a flash cache is used to buffer a certain amount of transactional data as it is written. This is built into many appliances, such as EMC and NetApp storage appliances, and into some systems such as the newest Apple Macs and Oracle Sun Exadata database machines.

Integration with Other Technologies

The following technologies could be associated to further improve performance in conjunction with the storage optimization models presented, namely:
  • Integration with Cache Management
  • Using Multiple-block-size Database
  • Using Coherence (Oracle Data Grid)
  • Using Oracle TimesTen
Recently, Hitachi-LG introduced the HyDrive, a dual (hybrid) drive using an Optical Disk Drive (ODD) with Flash memory as part of its cache component.

Expected Results

Studies based on historic measurements, from Jim Gray's five-minute rule up to today's trends, show not only that a 25% improvement can be attained with ASM disk groups based on conventional HDD arrays, but also that appliance-driven solutions using dynamic storage management over hybrid solutions have reached at least a 62% improvement, and up to 1000%, i.e., 10 times better performance, based on 100% solid state disk groups. The study was conducted by IBM on OLTP databases using appliance-driven storage management and custom tuning algorithms. That study did not yet utilize the Hot-or-Cached Storage Optimization Model and was fully guided by hot-block-level tuning. Therefore, the possibilities are endless and the market is open to mix, match, and customize every possible solution.

There will also be a significant cost reduction from lower power consumption with the usage of SSD technology, along with more optimal IOPS utilization.

Concluding Remarks
With Oracle 11g Release 2, enterprise storage can now benefit from ASM elasticity and the ability to customize optimal storage performance for a specific database, by engineering ASM production maintenance practices that optimize object allocation onto hot sectors, which use tablespaces related to ASM data files utilizing MIRRORHOT templates.

At the time this paper is written, for economic reasons, hybrid models using both SSD and HDD technologies are more practical, very convenient, and highly cost-effective for most industries and enterprise sizes. While SSD technology can reduce the costs associated with energy due to various factors, until the price of solid state technologies drops significantly, the hybrid dual disk drive, using SSD/Flash for performance and HDD for capacity, will remain the de facto standard for implementing ASM-based optimal storage architectures and infrastructures. Although there are some industry-wide reliability pros and cons to SSD disks, it is expected that sometime in the future, when economies of scale allow, SSDs will become the predominant storage technology used with ASM.

In general, Oracle ASM IDP can improve I/O performance with dynamic load balancing. Oracle DBAs and database architects should plan any reorganization tasks for optimal performance, giving greater relevance to the object level rather than the ASM file level, while also considering preferred read disk groups as a transparent method to establish storage hierarchies.

N.B. The entire paper can be read and retrieved from my new website, currently under construction, namely:

http://www.adnresearch.com/resources/2011_726_noriega_ppr.pdf