Operating System Tuning

The Windows NT operating system version 4.0 provides a rich set of objects and counters that provide detailed information regarding performance from the operating system perspective. The objects and counter are accessible through the NT Performance Monitor application, and are the key to analyzing and identifying operating system-level performance bottlenecks. Hundreds of objects and counters are provided, and this mass quantity of information can be overwhelming at first glance. This section will identify the objects and counters that are useful for performance analysis for a database server, and detail a reasonably simple approach to formulating tuning initiatives based upon this information.

General settings

NT provides settings that allow the O/S to "tune itself" to favor particular types of applications and usage. Since NT supports a wide range of network services (such as file serving, web-serving, application services, etc.), these settings allow the administrator to instruct the operating system to allocate resources to optimize their usage for a particular type of service. For function as an ORACLE database server, the following settings should be verified or enabled:

1) Optimize for network throughput. This setting instructs NT to allocate resources such that they are more available and useable to application processes that execute locally on the server. This setting defaults to "Optimize for file sharing", and modifying this setting will typically generate a performance improvement of 5-10% in the ORACLE instance. Note that this setting requires a reboot before it takes effect. This setting can be found in the following location:

START -> Settings -> Control Panel -> Network -> Services -> Server -> Properties -> Maximize throughput for network applications.

2) Minimize foreground boost. This setting allows NT Servers that perform additional duties as user workstations to allocate more resources to interactive sessions that are logged into the console, and defaults to 50%. Assuming this box is a dedicated database server, then this setting should be set such that interactive logins are not given priorities higher than the ORACLE services. Due to NT's architecture, certain administrative tasks may require console login, and it is undesirable for the database end-users to experience a performance degradation each time an interactive login occurs. This setting essentially minimizes a 5-10% degradation during interactive login sessions. Note that this setting requires a reboot before it takes effect. This setting can be found in the following location:

START -> Settings -> Control Panel -> System -> Performance -> Foreground boost = none.

3) DiskPerf. This setting causes disk performance objects and counters to be loaded at boot time. Without this setting toggled on, examining disk performance with NT Performance Monitor will result in all disk statistics being reported as zero. This setting incurs approximately a 5-10% performance overhead, but the information made available by enabling this setting will allow tuning analysis that it is otherwise not possible. If the disk performance information is used for bottleneck resolution, then having the information available to perform the analysis more than offsets the resulting performance loss, in the author's opinion. Note that this setting requires a reboot before it takes effect. Enable this setting from the command line by typing the following command:

diskperf -y (enables the objects and counters for logical and physical disks) or

diskperf -ye (enables the objects and counters for logical and physical disks plus individual disk drives in a stripped array)

4) Disk format. NT allows disk to be formatted as FAT partitions, NTFS partitions, or to remain unformatted (ie. raw disk). FAT is typically not used due to it's age and less robust features. NTFS is typically the default choice, although the choice of raw disk holds certain performance gains and manageability losses. An ORACLE instance can use raw disk on NT for database files, and the typical performance increase is approximately 15% (plus or minus 5%). However, raw disk under NT requires extremely careful space management and a detailed knowledge of disk level attributes (such as number and size of cylinders, etc.), and these tasks become even more complicated if the physical disks are contained in a disk array. Also, backups can become much more difficult to perform correctly and verify, often requiring specialized utilities. Given that disk drives are becoming more and more inexpensive, and that simply adding disk can provide greater than a 15% performance improvement in disk throughput and workload capacity, the author's opinion is that raw disk on NT should be avoided in favor of easier to manage NTFS formatting. Recalling Pareto's Rule, the required micro-management of space consumes DBA time that could be spent on other areas of tuning that are much more beneficial.

Identifying Operating System Bottlenecks

As stated previously, there's essentially only three areas of the operating system to check for performance bottlenecks: CPU, memory, and disk. NT Performance Monitor provides easy access the objects and counters storing the necessary information. Note that updating the GUI interface provided by PM is a fairly expensive process, although polling the object and counter information itself is reasonably inexpensive. For this reason, adopt one of the two following practices:

1) Run NT Performance Monitor from a remote workstation rather than at the console of the database server. PM can easily access performance information on a remote server, so remove the graphical workload from the database server to prevent the reported values from being over-inflated by the expensive graphical updates.

2) Install the DATALOG.EXE service on the database server, and control the service using the MONITOR.EXE command. The DATALOG.EXE (included in the NT Resource Kit) is the service-version of PM without the graphical components. It appears as "Monitor service" on the Control Panel-Services applet, it's memory and processor footprint are fairly light, and it accepts a performance monitor PMW file as it's data collection configuration information. The configuration information includes the objects and counters to poll, the polling interval, and the PM logfile in which to record the performance data. This logfile of information can then be analyzed offline. Performance data regularly collected with this method also proves useful for constructing usage growth profiles to anticipate future requirements.

The following approach is organized in order of resources easiest to analyze and fix addressed first, and will provide effective results. Check CPU first, since that resource is the easiest to monitor, and typically will be the least costly to rectify. If CPU is not the bottleneck, then examine memory. Although a bit more complicated to analyze, a memory shortage is still reasonably inexpensive to fix. Memory should be examined before disk because a memory shortage will inflate the volume of disk access. If memory is not the bottleneck, then examine disk. If none of these resources appear to be the bottleneck, then instance level initialization parameters should be carefully examined, with assistance from ORACLE support if needed.

1) Check CPU utilization.

Object: System

Counter: % Total Processor Time

Scrutinize average and maximum utilization. When the maximum utilitization reaches 100%, the CPU's are the bottleneck. If the CPU's consisently reach a maximum utilization of 100%, then perform additional analysis to determine what percentage of the time they are pegged at 100%. If they reach 100% only 1% of the time, but are below the 100% mark 99% of the time, then they are the bottleneck only 1% of the time. If CPU's are at or near 100% any portion of the time, then the DBA should plan ahead and budget the purchase of additional CPU resources before they are actually needed. If the CPU's never reach a max utilization of 100%, CPU is not the current bottleneck. Options for fixing a CPU bottleneck include:

- Add more CPU's

- Replace existing CPU's with processors that are faster and have larger processor caches.

Note that replacing 4 - 200Mhz Pentium Pro CPU's with 4 - 400Mhz Pentium Xeon processors is likely to result in greater overall improvement than adding 4 additional 200Mhz Pentium Pro processors, given NT's scaleability characteristics.

2) Check memory.

Object: Memory

Counter: Pages/sec

The simplest measurement of memory on NT is the rate of paging. Average and maximum paging should be examined, with additional analysis such as calculating the percentile divisions for particular time windows if maximum paging is excessive. Hit ratios for SGA memory structures (such as buffer cache) may be within the target range that should yield adequate performance (typically 80-95%, depending on a particular organization's requirements and expectations), but checking only SGA hit ratios may be misleading. NT virtual memory works such that the total memory available to an application (up to the maximum O/S limit) is the combination of physical memory and paged memory (ie. memory contents temporarily written to disk). Buffer cache hit ratio could read 95%, but if physical memory is being paged hundreds of times per second, a severe performance degradation will be experienced. Typically, the best balance on NT is 40-45% of the physical memory available on the server allocated to the SGA. A rate of memory paging greater than approximately 5 pages/second will result in exponential performance degradation. If SGA hit ratios are low, then increase the memory allocated to the necessary cache. If SGA hit ratios are high but the O/S is paging more than the suggested minimum, then add more physical memory to the server. Ideally, the DBA should plan ahead and budget the purchase of additional memory before hit ratios and paging becomes a significant bottleneck. If the SGA hit ratios are within the target range and O/S paging is minimal, then memory is not the bottleneck. Note that it is possible to allocate too much memory to the SGA: in this scenario, the SGA hit ratios will be excellent, but paging will occur at a rate than more than offsets the better hit ratios, thus making overall performance slower than it would be with lower SGA hit ratios. The objective for memory should be balancing between SGA hit ratios and O/S paging. Options for fixing a memory shortage include:

- Adjusting SGA memory allocations to balance the ORACLE hit ratios and O/S paging.

- Add physical memory to the server.

3) Check disk.

Object: Logical disk

Counters (for each logical disk): Disk transfers/sec and Avg Disk sec/transfer

Examine the workload imposed upon the disks used by the ORACLE instance. Logical disks, whether single physical disks or disk arrays, have a workload threshold that they can sustain without performance degradation; after that workload threshold is exceeded, read and write performance will degrade exponentially. Throughput is defined as the rate of data transfer, while workload is defined as the rate of I/O requests. In practice, the rate that data is read or written is relatively constant, given that the disk rotates at a constant speed. Because of the characteristic mechanical and rotational latencies, finding the data on disk can take as long or longer than performing the actual transfer, particularly in cases (like database systems) where the nature of disk I/O is more random than sequential. With this attribute in mind, it should be noted that workload is the most likely candidate to overload a disk's performance capacity. As the workload threshold (ie. transfers/second) of the disk or disk array is exceeded the amount of time to complete each request (seconds/transfer) will increase exponentially. Most hot-swappable SCSI disk is capable of sustaining around 60-80 transfers/second, depending upon the particular model and vendor. (Contact your hardware vendor to obtain these attributes specific to your particular disk drive models.) With this information available for each disk, the approximate workload threshold can be calculated for disk arrays visible to the O/S as single logical disks. (Please see the article on RAID in this section for the equations to approximate this workload threshold.) For instance, if we have a disk array that has a workload threshold of 180 transfers/second, the seconds/transfer should be in the millisecond range when the actual workload is below this limit. When this workload threshold becomes drastically exceeded, the seconds/transfer will increase into the tens or even hundreds of milliseconds. Unfortunately, I/O-bound situations are typically the most difficult and costly to fix. Options include:

- Balancing the disk I/O over multiple disks or disk arrays/controllers using information obtained from the ORACLE instance's BSTAT/ESTAT reports (please see the article on instance tuning for more information).

- Add physical disks individually or to arrays to increase the workload threshold.

- Add more independent disk arrays and controllers to the database server.

- Change the selected RAID technology (RAID5 to RAID1 or RAID10, non-RAID to RAID0, etc.).

- Replace disks or arrays with faster disk and/or array controllers.

- If SGA hit ratios are below ideal, add a significant volume of memory to reduce disk usage. Note that the higher the SGA hit ratios already are, the less effective this approach will be.

- Perform additional application tuning to minimize disk I/O requirements. Note that accessing a table via an index, for instance, will typically require much less disk I/O than access via a full-table scan.