Directory sizes and filesystems
Monday, 03. 19. 2012 – Category: sw
kyero@fs02 ~/log $ du -hs .
1.7M
kyero@fs02 ~/log $ ls | wc -l
24
kyero@fs02 ~/log $ ls -ild .
909313 drwxr-xr-x 2 kyero kyero 1630208 Mar 19 05:25 .
This log directory was allowed to grow indefinitely (ahem) until I trimmed it this morning. Now it is mostly empty (1.7MB), but the directory entry is still huge (at 1.6MB, compared to the default 4kB). I had a vague idea of why this happens but realised I was unclear on the specifics so it was time for The Learning.
On traditional filesystems (here, FFS on FreeBSD and ext2 on GNU/Linux) directories look much like files, just with special content. They have an inode that points to either blocks on disk or to indirection blocks that themselves point to blocks (or to other indirection blocks, up to three levels of indirection). The contents of these directory blocks are reasonably straightforward: they’re a sequence of dirent structures that generally contain a filename and that file’s inode number along with some small housekeeping data.
As a directory fills, this sequence of dirents grows to fill all the direct blocks, then starts on the indirect blocks, and so on. Each time the sequence crosses a block boundary the directory’s own size, rather than the total of the files it contains, grows by another block. Since filenames have variable length, a single block will contain a variable number of dirents.
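To watch the block-by-block growth described above, here is a minimal sketch in POSIX sh. The function names (dir_own_size, watch_growth) are made up for illustration, and it assumes the fifth field of `ls -ld` is the directory's own size, which holds for the listings in this post but not if a group name contains spaces. The step size you see depends entirely on the filesystem.

```sh
# Print the size of the directory entry itself (field 5 of `ls -ld`),
# not the total size of the files inside it.
dir_own_size() {
  ls -ld "$1" | awk '{ print $5 }'
}

# Add N empty files to DIR one at a time and print a line each time
# the directory's own size changes.
watch_growth() {
  dir=$1
  n=$2
  last=$(dir_own_size "$dir")
  echo "0 files: $last bytes"
  i=1
  while [ "$i" -le "$n" ]; do
    : > "$dir/f$i"
    now=$(dir_own_size "$dir")
    if [ "$now" != "$last" ]; then
      echo "$i files: $now bytes"
      last=$now
    fi
    i=$((i + 1))
  done
}
```

Something like `watch_growth "$(mktemp -d)" 2000` should be enough to see a few steps on a block-based filesystem. It runs `ls` once per file, so it is not meant for million-file runs.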
So far so good. What happens when you delete a file from a directory? Other than freeing the actual file’s inode and respective blocks, something must happen to its dirent entry in the directory’s data. What doesn’t happen is that the dirent is removed and all the ones after it get bumped towards the front of the list, which would lead to fewer overall directory blocks given enough deletions and ultimately smaller directories.
Why this doesn’t happen partly relates to the guarantees that telldir and friends make to programs with the directory open: a file’s relative position in the sequence of dirents will not change for any process that has the directory open. The deletion process can’t compact the directory itself without breaking that guarantee. Instead, both FFS and ext2 blank out the file’s inode in the dirent record (ext2 with zero, FFS with a special value). Subsequent directory operations will ignore this blanked dirent, and processes which knew its sequence number via telldir will error if they attempt operations on it.
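The blanking idea can be sketched as a toy model in POSIX sh. This is not real filesystem code and assumes nothing about on-disk layouts: the "directory" is just a text file with one "inode name" slot per line, and the helper names (dir_add, dir_remove, dir_list, dir_slots) are hypothetical.

```sh
# Toy model only: each line of the directory file is one slot, "inode name".

# dir_add DIRFILE INODE NAME: append a new slot at the end.
dir_add() {
  printf '%s %s\n' "$2" "$3" >> "$1"
}

# dir_remove DIRFILE NAME: zero the inode in place and keep the slot,
# so every later slot stays at the same position.
dir_remove() {
  awk -v n="$2" '$2 == n { $1 = 0 } { print }' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# dir_list DIRFILE: list names, skipping blanked slots (inode 0).
dir_list() {
  awk '$1 != 0 { print $2 }' "$1"
}

# dir_slots DIRFILE: number of slots, live or blanked.
dir_slots() {
  awk 'END { print NR }' "$1"
}
```

After adding three names and removing the middle one, dir_list shows two names but dir_slots still reports three: the listing shrinks while the storage does not, which is the effect seen in the log directory at the top of this post.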
To recover the disk space used by a directory that has ballooned and then contracted, it’s simplest to delete it entirely and recreate it. ext2’s filesystem check (e2fsck) also has an option (-D) to compact directories offline.
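For a directory that still holds files you want to keep, the delete-and-recreate approach might look like the sketch below. The function name rebuild_dir and the temporary ".rebuild.$$" suffix are made up for illustration. It assumes nothing else is writing to the directory while it runs, that the directory is not a mount point, and that your find supports -mindepth and -maxdepth (GNU and BSD find do). It does not carry over the old directory's ownership or permissions.

```sh
# Rebuild DIR so that its entries are written into a fresh, compact directory.
rebuild_dir() {
  old=$1
  new="$old.rebuild.$$"          # sibling path, so it is on the same filesystem
  mkdir "$new" || return 1
  # Move every entry, dotfiles included; find avoids "argument list too long".
  find "$old" -mindepth 1 -maxdepth 1 -exec mv {} "$new"/ \; || return 1
  # rmdir refuses to remove a non-empty directory, so nothing is lost
  # if a move failed; the swap only happens when the old one is empty.
  rmdir "$old" && mv "$new" "$old"
}
```

Called as `rebuild_dir ~/log`, this leaves the same files under the same path, but `ls -ild` will show a new inode number for the directory, so any process holding the old one open needs restarting.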
Interestingly this behaviour isn’t present on all filesystems, particularly those that use indexes instead of special files for directories: for example ZFS and HFS+. They can remove a file’s entry in the directory index yet still keep the remaining files appearing at their same positions by informed updates to the index.
Here are these four filesystems compared by:
- Creating an empty directory
- Filling it with a million files
- Removing all but one of those files
The comparison shows how on ext2 and FFS the directory itself remains at its peak size after the test files are removed, whereas ZFS and HFS+ contract.
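The transcripts below use seq on GNU/Linux and jot on the BSDs. If you want to repeat the test with one script on either, a rough portable sketch follows. The function name churn_test is made up, awk generates the filenames instead of seq or jot, and the file count should be at least 2 so there is something to remove.

```sh
# churn_test DIR N: create DIR, fill it with N empty files, remove all but
# the last one, and print the directory's own size after each step.
churn_test() {
  dir=$1
  n=$2
  mkdir -p "$dir" || return 1
  echo "empty:   $(ls -ld "$dir" | awk '{ print $5 }')"
  # Names f000000 .. f<N-1>, generated by awk for portability.
  (cd "$dir" &&
    awk -v n="$n" 'BEGIN { for (i = 0; i < n; i++) printf "f%06d\n", i }' |
    xargs touch)
  echo "filled:  $(ls -ld "$dir" | awk '{ print $5 }')"
  # Remove every file except the last one.
  (cd "$dir" &&
    awk -v n="$n" 'BEGIN { for (i = 0; i < n - 1; i++) printf "f%06d\n", i }' |
    xargs rm)
  echo "trimmed: $(ls -ld "$dir" | awk '{ print $5 }')"
}
```

`churn_test /mnt/tmp/t 1000000` would mirror the runs below; start with a smaller count, since a million files can take a long time on some filesystems (note the timestamps in the transcripts).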
ext2
[build@centos-x86-64-build tmp]$ df -ih .
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb 1.3M 11 1.3M 1% /mnt/tmp
[build@centos-x86-64-build tmp]$ ls -ild .
2 drwxr-xr-x. 3 build build 4096 Mar 19 08:58 .
[build@centos-x86-64-build tmp]$ seq --format 'f%06g' 0 999999 | xargs touch
[build@centos-x86-64-build tmp]$ df -ih .
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb 1.3M 977K 304K 77% /mnt/tmp
[build@centos-x86-64-build tmp]$ ls -ild .
2 drwxr-xr-x. 3 build build 23224320 Mar 19 08:59 .
[build@centos-x86-64-build tmp]$ seq --format 'f%06g' 0 999998 | xargs rm
[build@centos-x86-64-build tmp]$ df -ih .
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb 1.3M 12 1.3M 1% /mnt/tmp
[build@centos-x86-64-build tmp]$ ls -ild .
2 drwxr-xr-x. 3 build build 23224320 Mar 19 09:02 .
FFS
[lemon@zest ~/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/mirror/gm0s1a 19G 10G 7.5G 58% 587k 2.1M 22% /
[lemon@zest ~/tmp]$ ls -ild .
521826 drwxr-xr-x 2 lemon lemon 512 Mar 19 14:50 .
[lemon@zest ~/tmp]$ jot -w "f%06d" 1000000 0 | xargs touch
[lemon@zest ~/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/mirror/gm0s1a 19G 10G 7.5G 58% 1.6M 1.1M 60% /
[lemon@zest ~/tmp]$ ls -ild .
521826 drwxr-xr-x 2 lemon lemon 16000512 Mar 20 01:10 .
[lemon@zest ~/tmp]$ jot -w "f%06d" 999999 0 | xargs rm
[lemon@zest ~/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/mirror/gm0s1a 19G 10G 7.5G 58% 587k 2.1M 22% /
[lemon@zest ~/tmp]$ ls -ild .
521826 drwxr-xr-x 2 lemon lemon 16000512 Mar 20 01:44 .
ZFS
[lemon@zest /mnt/ark/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
ark 207G 686M 206G 0% 58k 432M 0% /mnt/ark
[lemon@zest /mnt/ark/tmp]$ ls -ild
5 drwxrwsr-x 2 root wheel 2 Mar 20 06:18 .
[lemon@zest /mnt/ark/tmp]$ jot -w "f%06d" 1000000 0 | xargs touch
[lemon@zest /mnt/ark/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
ark 207G 859M 206G 0% 1.1M 432M 0% /mnt/ark
[lemon@zest /mnt/ark/tmp]$ ls -ild .
5 drwxrwsr-x 2 root wheel 1000002 Mar 20 07:53 .
[lemon@zest /mnt/ark/tmp]$ jot -w "f%06d" 999999 0 | xargs rm
[lemon@zest /mnt/ark/tmp]$ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
ark 207G 727M 206G 0% 58k 432M 0% /mnt/ark
[lemon@zest /mnt/ark/tmp]$ ls -ild
5 drwxrwsr-x 2 root wheel 3 Mar 20 09:27 .
HFS+
[lemon@further tmp] 0 $ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 233Gi 185Gi 48Gi 80% 48506927 12484688 80% /
[lemon@further tmp] 0 $ ls -ild .
12149971 drwxr-xr-x 2 lemon staff 68 19 Mar 16:06 .
[lemon@further tmp] 0 $ jot -w "f%06d" 1000000 0 | xargs touch
[lemon@further tmp] 0 $ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 233Gi 185Gi 48Gi 80% 48492686 12498929 80% /
[lemon@further tmp] 0 $ ls -ild .
12149971 drwxr-xr-x 2 lemon staff 34000068 19 Mar 17:32 .
[lemon@further tmp] 0 $ jot -w "f%06d" 999999 0 | xargs rm
[lemon@further tmp] 0 $ df -ih .
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 233Gi 185Gi 48Gi 80% 48524492 12467123 80% /
[lemon@further tmp] 0 $ ls -ild .
12149971 drwxr-xr-x 2 lemon staff 102 19 Mar 19:08 .