Thursday, May 25, 2006

Ext3 and full data journaling

Ext3 is a stable and mature file system, offering a good balance of speed and reliability. But what many people do not realise is that by default only metadata is journaled, not file data. Here is the relevant section from 'man tune2fs', which describes the three journaling modes:

journal_data
When the filesystem is mounted with journalling
enabled, all data (not just metadata) is committed
into the journal prior to being written into the
main filesystem.

journal_data_ordered
When the filesystem is mounted with journalling
enabled, all data is forced directly out to the main
file system prior to its metadata being committed to
the journal.

journal_data_writeback
When the filesystem is mounted with journalling
enabled, data may be written into the main filesystem
after its metadata has been committed to the
journal. This may increase throughput, however, it
may allow old data to appear in files after a crash
and journal recovery.

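So the default mode is "journal_data_ordered": file data is written out before its metadata is committed, but only the metadata passes through the journal. This is a reasonable compromise between speed and safety, but it is not full data journaling, so data being written at the time of a power outage or crash may not be recoverable. You can look at many of the tunable parameters with 'tune2fs -l /dev/hdx' or in my case as I'm using LVM: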

# tune2fs -l /dev/mapper/VolGroup00-LogVol00
tune2fs 1.38 (30-Jun-2005)
Filesystem volume name:
Last mounted on:
Filesystem UUID: b0c69d9c-234f-444d-ba95-f979a4902f4d
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options:
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 19005440
Block count: 19005440
Reserved block count: 950272
Free blocks: 5362926
Free inodes: 18381437
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 32768
Inode blocks per group: 1024
Filesystem created: Wed Jul 6 20:23:44 2005
Last mount time: Thu May 25 10:03:33 2006
Last write time: Thu May 25 10:03:33 2006
Mount count: 226
Maximum mount count: -1
Last checked: Wed Jul 6 20:23:44 2005
Check interval: 0 ()
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
First orphan inode: 695882
Default directory hash: tea
Directory Hash Seed: 11dc53e2-545c-4880-a6a6-792557a40a3d
Journal backup: inode blocks

The value 'Default mount options:' is empty, meaning no default data mode is stored in the superblock, so the file system is mounted with the kernel default, which journals metadata only. To set a new value here we run:

# tune2fs -o journal_data /dev/mapper/VolGroup00-LogVol00

NOTE: I've run this command on a mounted file system (in fact on the root file system / ) with no ill effects. However, if you are concerned about your data (and I suggest you always have backups) then only run this command on file systems after they are unmounted; either boot in rescue mode or from a bootable CD like Knoppix.
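To confirm that the new default was stored, you can re-run 'tune2fs -l' and look at the 'Default mount options' line. The small helper below is just a sketch of that check; the function name default_mount_opts is my own invention, and it only parses text piped into it. Depending on the e2fsprogs version, an unset value may show up as empty or as a placeholder such as "(none)".

```shell
# Print the value of the "Default mount options" field from `tune2fs -l`
# output read on stdin. Prints an empty line if the field has no value.
# (default_mount_opts is a hypothetical helper name, not part of e2fsprogs.)
default_mount_opts() {
    awk -F: '/^Default mount options:/ {
        value = $2
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
        print value
    }'
}

# Usage (as root; the device path is the one used in this post):
#   tune2fs -l /dev/mapper/VolGroup00-LogVol00 | default_mount_opts
```

After the tune2fs -o command above, this should print journal_data.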

Also, we can edit our /etc/fstab to set the default mount option there by adding the "data=journal" option:

/dev/VolGroup00/LogVol00 / ext3 defaults,noatime,data=journal 1 1
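Note that the names differ between the two tools: tune2fs uses "journal_data" while mount and /etc/fstab use "data=journal". As a memory aid, here is a minimal sketch of the mapping for the three modes; the function name to_mount_opt is hypothetical and exists only for illustration.

```shell
# Map a tune2fs default-mount-option name to the matching mount/fstab option.
# (to_mount_opt is a made-up helper name for this illustration.)
to_mount_opt() {
    case "$1" in
        journal_data)           echo "data=journal" ;;
        journal_data_ordered)   echo "data=ordered" ;;
        journal_data_writeback) echo "data=writeback" ;;
        *) echo "unknown option: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   to_mount_opt journal_data
```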

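That's it. We now need to reboot the system (for / ) or unmount and mount again (for any other file system) to begin taking advantage of full data journaling. A plain 'mount -o remount' is not enough, as ext3 will not change the data mode of a file system that is already mounted.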
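Once the file system has been mounted again, it is worth checking which mode is actually in effect. The sketch below assumes your kernel lists a data= entry among the options in /proc/mounts; not every kernel does, in which case the ext3 message in 'dmesg' ("mounted filesystem with journal data mode") is the place to look. The function name data_mode is my own, and the optional second argument exists only so the function can be tried against a sample file.

```shell
# Print the data= journaling mode for a mount point, or "unknown" if the
# mounts table has no data= option for it.
#   $1 = mount point
#   $2 = mounts table to read (defaults to /proc/mounts)
# (data_mode is a hypothetical helper name.)
data_mode() {
    awk -v mp="$1" '$2 == mp {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^data=/) {
                sub(/^data=/, "", opts[i])
                mode = opts[i]           # a later matching entry overrides an earlier one
            }
    }
    END { if (mode != "") print mode; else print "unknown" }' "${2:-/proc/mounts}"
}

# Usage:
#   data_mode /
```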

I've not noticed any performance degradation with "journal_data" and have heard reports that it is actually faster in some circumstances.
