RAID and other fault-tolerance considerations
Purely electronic components such as CPUs, memory chips, and array controller cards will typically fail during their first few weeks of operation (the "burn-in" period) if they are going to fail at all. Failures in these types of devices are normally due to manufacturing defects, although heat or moisture can contribute if the equipment is not operated in a controlled environment. By far the most likely components to fail in any computing system are the disk drives, followed by the power supplies.
Many vendors now provide options for redundant, hot-swappable power supplies, even though power supplies do not have an extremely high probability of failure. When a non-redundant power supply does fail, however, the result is always a system outage. In addition, precisely because power supplies fail infrequently, a replacement for your particular model of server may have an ordering lead time of several days or possibly even weeks. For these reasons, purchasing equipment with redundant power supplies, or at least purchasing a few replacement power supplies ahead of time, can minimize the interruption in service if such a failure does occur.
Unlike power supplies, disk drives are practically guaranteed to fail at some point in their operation. The failure may occur in the first two weeks of operation, or it may occur ten years after the purchase date, but the probability that a drive will fail at some point is very near 100%. Luckily, RAID technologies for most NT/Intel-based server hardware are readily available. Normally, the best solutions are array controller cards that handle the RAID manipulations on-board (relieving the operating system of performing this work with CPU cycles) and present each array to the operating system as a single logical disk drive. This approach is typically referred to as "hardware RAID", even though many vendors now provide reasonably user-friendly NT-based utilities for configuring the arrays. Many vendors also provide hot-swappable disk drives, meaning that a failed drive can be removed and a new drive installed without power-down or other interruption in service. RAID configurations that provide fault-tolerance include RAID1 (disk mirroring), RAID5 (striping with parity), and RAID10 (mirrored stripe sets). Please see the performance section of this site for a detailed discussion of RAID and its impact on performance.
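To make "striping with parity" more concrete, the following is a minimal sketch of the XOR parity idea that RAID5 is built on, not a description of how any particular array controller implements it. The function names (xor_blocks, make_stripe, rebuild_block) and the sample data are hypothetical and chosen only for illustration. Real RAID5 arrays also rotate the parity block across the drives from stripe to stripe, which this sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (illustrative layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_block(stripe, failed_index):
    """Recompute the block at failed_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data "drives" plus one parity "drive" (sample data is made up).
stripe = make_stripe([b"ORCL", b"DATA", b"FILE"])

# Pretend the second drive has failed and rebuild its contents.
rebuilt = rebuild_block(stripe, 1)
print(rebuilt == stripe[1])
```

Because the parity block is the XOR of all the data blocks, XOR-ing every surviving block reproduces whichever single block is missing. This is also why an array with a failed drive keeps running but more slowly: every read of the missing block requires reading all of the surviving drives in that stripe.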
Although recovery from disk failures is certainly possible without RAID (given that backup and recovery procedures are implemented), the failure of a drive containing database files (other than a mirrored copy of the redo logs) will generate downtime. Depending on the database files affected, the interruption may affect only a single application or it may affect the entire database instance. If these database files are stored on fault-tolerant disk arrays, the failure of a single drive will typically not generate downtime, although performance will typically degrade because of the lost workload capacity and the rebuild of the information originally stored on the failed drive. Use of fault-tolerant disk arrays for database files (with the exception of multiplexed redo logs), operating system files, and ORACLE software will prevent the downtime involved in exchanging a failed drive, retrieving backup tapes (which may be stored off-site), restoring database files, and performing database recovery operations. Finally, note that use of fault-tolerant RAID arrays is not a substitute for proven backup and recovery procedures.