Saturday, April 14, 2007

A fast way to get users' disk usage on ext2/ext3 filesystems

The simplest way to get per-user disk usage is to mount the filesystem with quota accounting enabled (see "man 8 mount"). This is the most reliable way to be sure that the quota limits will be respected, since the accounting and the checks are performed synchronously by the filesystem itself. Unfortunately, this adds a little overhead across the whole filesystem.

Another approach is to check the disk usage periodically with a script (typically run as a cron job). The script should sum the size of each file and directory in the filesystem, grouping the totals by owner.
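
To make the script approach concrete, here is a minimal sketch in Python. It is not part of e2fsusage; the function name usage_by_owner is made up for illustration. It assumes a POSIX system where st_blocks is reported in 512-byte units, counts hard-linked files once, and makes no attempt to stay on a single filesystem.

```python
import os
import sys
from collections import defaultdict


def usage_by_owner(root):
    """Return {uid: allocated bytes} for root and everything below it.

    Hypothetical helper for illustration only, not taken from e2fsusage.
    """
    totals = defaultdict(int)
    seen = set()  # (device, inode) pairs, so hard links are counted once

    def account(path):
        try:
            st = os.lstat(path)  # lstat: do not follow symbolic links
        except OSError:
            return  # entry vanished or is unreadable, skip it
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return
        seen.add(key)
        # Assumption: st_blocks counts 512-byte units, as on Linux.
        totals[st.st_uid] += st.st_blocks * 512

    account(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            account(os.path.join(dirpath, name))
    return dict(totals)


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for uid, nbytes in sorted(usage_by_owner(root).items()):
        print(uid, nbytes)
```

Every entry costs a directory read plus an lstat() call, which is why this kind of walk gets slow on filesystems with many files.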

This program (e2fsusage) uses the second approach, but it doesn't read the files and directories directly; instead it analyzes the filesystem metadata, performing a sequential scan of all the inodes.

In this way it bypasses the translation of file and directory names into their respective inodes, which strongly reduces the total time needed to scan the entire filesystem. Moreover, since it counts the blocks actually allocated in the filesystem, using i_blocks instead of i_size (see the struct ext2_inode in /usr/include/ext2fs/ext2_fs.h), it is able to detect the true space occupied by each user (read it as: it is able to correctly handle sparse files).
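
The difference between the two fields can be seen from user space, where stat() exposes them as st_size and st_blocks. The following is a small sketch, not code from e2fsusage: the helper names are made up, and it assumes the underlying filesystem supports sparse files and reports st_blocks in 512-byte units.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes):
    """Create a file with a hole of hole_bytes followed by one data byte."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # skip ahead without writing anything
        f.write(b"x")


def size_vs_allocated(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.bin")
        make_sparse_file(path, 10 * 1024 * 1024)
        apparent, allocated = size_vs_allocated(path)
        print("apparent size:", apparent)
        print("allocated    :", allocated)
```

On a filesystem that supports holes, the allocated figure should come out far smaller than the apparent size, so a tool that summed i_size would overcharge the owner of such a file.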
