Lew wrote:
> Tom Anderson wrote:
>> Databases typically store their data in a file [1]. ...
>> [1] Okay, so seriously heavyweight ones use disk extents/partitions
>> and bypass the filesystem; how much of a difference does that make?
>
> Anecdotally, on a large-scale test, above five-to-one.
Note the filesystem speed is one reason why a swap partition
is preferred to a swap file. (Windows uses a swap file,
C:\pagefile.sys).
The poor performance most people notice is when using
a DB on a filesystem and using default types and settings.
I haven't tested this myself, but from what I've read Lew's
anecdote is conservative. A poorly tuned system can be much,
much slower than a properly tuned one!
For example using Oracle on a FAT32 or ext3 filesystems
with RAID-1, when Oracle uses different block and stripe
sizes, journals writes to its files, and possibly mirrors
the tablespaces, is not going to be fast!
Performance also depends on the amount of writes versus reads
for your application. Filesystems are typically designed have
reads as fast as possible, and the difference can be very
noticeable. Other factors include caching, read-ahead, and RAID.
All can have dramatic affects on the read and especially write
speed. If using NAS or SAN network congestion is another factor.
Setting the DB cache sizes is im****tant as well, or the files
used for the DB will end up swapped to disk and the frequent
dirty page writes will slow down the whole system.
Filesystems can also be tuned to applications. For example
setting the block size equal to a DB block, and a stripe
size as a multiple of that, can greatly increase DB
performance that uses a filesystem. Creating a separate
partition and filesystem just for the DB files works better
than creating those files in (for example) the root file
system with lots of non-DB files. Using separate filesystems
for indexes and table data can also greatly improve
performance, especially if the different filesystems are
on different disk spindles.
Different types of filesystems work better with DBs than
others. Using any FAT filesystem for a serious DB will cause
poor performance. Using something like JFS or ext2 can
improve performance at the cost of safety (not all writes are
journaled).
Using a journaling filesystem or RAID-1 (or similar) works
best for DBs that don't do that internally anyway (but most do).
Either use a faster filesystem type and RAID-0 and let
the DBMS handle those issues, or tune the DBMS to not bother
journaling and/or mirroring.
I guess the bottom line is, for learning or for web site
DBs that are mostly read-only, a filesystem based DB should
be fine. For large OLTP systems you need to have the
DBA and the system administrator work together to tune
the disk layout and fileystem types used, and the DBMS.
If using some quality enterprise DBMS such as Oracle that
handles much of what the filesystem and RAID system can
do itself, using a raw disk volume (and properly tuned DB)
will result in the greatest performance for OLTP systems.
Finally keep in mind monitoring and backups: If using
a filesystem, the DB gets backed up (and disk space gets
monitored) by your normal filesystem tools and procedures.
If using raw disk volumes then you must use your DBMS
system to monitor and backup the data using a separate
procedure (often a monitor process must be kept running).
-Wayne


|