I want to get an alert when my disk is full (parsing df output).
Sadly, parsing the output of df really is the most reliable way to determine how full a disk is, on most operating systems. However, please note that this is a "least bad" answer, not a "best" answer. Parsing any command-line reporting tool's output in a program is never pretty. The purpose of this FAQ is to try to describe all the problems this approach is known to encounter, and work around them.
The first, biggest problem with df is that it doesn't work the same way on all operating systems. Unix is divided largely into two families -- System V and BSD. On BSD-like systems (including Linux, in this case), df gives a human-readable report:
~$ df Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda2 8230432 3894324 3918020 50% / tmpfs 253952 8 253944 1% /lib/init/rw udev 10240 44 10196 1% /dev tmpfs 253952 0 253952 0% /dev/shm
However, on System-V-like systems, the output is completely different:
$ df /net/appl/clin (svr1:/dsk/2/clin/pa1.1-hpux10HP-UXB.10.20): 1301728 blocks -1 i-nodes /net/appl/tool-share (svr2:/dsk/4/dsk3/tool/share): 51100992 blocks 4340921 i-nodes /net/appl/netscape (svr2:/dsk/4/dsk3/netscape/pa1.1-hpux10HP-UXB.10.20): 51100992 blocks 4340921 i-nodes /net/appl/gcc-3.3 (svr2:/dsk/4/dsk3/gcc-3.3/pa1.1-hpux10HP-UXB.10.20): 51100992 blocks 4340921 i-nodes /net/appl/gcc-3.2 (svr2:/dsk/4/dsk3/gcc-3.2/pa1.1-hpux10HP-UXB.10.20): 51100992 blocks 4340921 i-nodes /net/appl/tool (svr2:/dsk/4/dsk3/tool/pa1.1-hpux10HP-UXB.10.20): 51100992 blocks 4340921 i-nodes /net/home/wooledg (/home/wooledg ): 658340 blocks 87407 i-nodes /net/home (auto.home ): 0 blocks 0 i-nodes /net/hosts (-hosts ): 0 blocks 0 i-nodes /net/appl (auto.appl ): 0 blocks 0 i-nodes /net/vol (auto.vol ): 0 blocks 0 i-nodes /nfs (-hosts ): 0 blocks 0 i-nodes /home (/dev/vg00/lvol5 ): 658340 blocks 87407 i-nodes /opt (/dev/vg00/lvol6 ): 623196 blocks 83075 i-nodes /tmp (/dev/vg00/lvol4 ): 86636 blocks 11404 i-nodes /usr/local (/dev/vg00/lvol9 ): 328290 blocks 41392 i-nodes /usr (/dev/vg00/lvol7 ): 601750 blocks 80228 i-nodes /var (/dev/vg00/lvol8 ): 110696 blocks 14447 i-nodes /stand (/dev/vg00/lvol1 ): 110554 blocks 13420 i-nodes / (/dev/vg00/lvol3 ): 190990 blocks 25456 i-nodes
So, your first obstacle will be recognizing that you may need to use a different command depending on which OS you're on (e.g. bdf on HP-UX); and that there may be some OSes where it's simply not possible to do this with a shell script at all.
For the rest of this article, we'll assume that you've got a system with a BSD-like df command.
The next problem is that the output format of df is not consistent across platforms. Some plaforms use 6 columns of output. Some use 7. Some platforms (like Linux) use 1-kilobyte blocks by default when reporting the actual space used or available; others, like OpenBSD or IRIX, use 512-byte blocks by default, and need a -k switch to use kilobytes.
Worse, often a line of output will be split into multiple lines on the screen. For example (Linux):
Filesystem 1K-blocks Used Available Use% Mounted on ... svr2:/dsk/4/dsk3/tool/i686Linux2.4.27-4-686 35194552 7856256 25550496 24% /net/appl/tool
If the device name is sufficiently long (very common with network-mounted file systems), df may split the output onto two lines in an attempt to preserve the columns for human readability. Or it may not... see, for example, OpenBSD 4.3:
~$ df Filesystem 512-blocks Used Avail Capacity Mounted on /dev/wd0a 253278 166702 73914 69% / /dev/wd0d 8121774 6904178 811508 89% /usr /dev/wd0e 8121774 6077068 1638618 79% /var /dev/wd0f 507230 12 481858 0% /tmp /dev/wd0g 8121774 5653600 2062086 73% /home /dev/wd0h 125253320 116469168 2521486 98% /export ~$ sudo mount 192.168.2.5:/var/cache/apt/archives /mnt ~$ df Filesystem 512-blocks Used Avail Capacity Mounted on /dev/wd0a 253278 166702 73914 69% / /dev/wd0d 8121774 6904178 811508 89% /usr /dev/wd0e 8121774 6077806 1637880 79% /var /dev/wd0f 507230 12 481858 0% /tmp /dev/wd0g 8121774 5653600 2062086 73% /home /dev/wd0h 125253320 116469168 2521486 98% /export 192.168.2.5:/var/cache/apt/archives 1960616 1638464 222560 88% /mnt
Most versions of df give you a -P switch which is intended to standardize the output... sort of. Older versions of OpenBSD still split lines of output even when -P is supplied, but Linux will generally force the output for each file system onto a single line.
Therefore, if you want to write something robust, you can't assume the output for a given file system will be on a single line. We'll get back to that later.
You can't assume the columns line up vertically, either:
~$ df -P Filesystem 1024-blocks Used Available Capacity Mounted on /dev/hda1 180639 93143 77859 55% / tmpfs 318572 4 318568 1% /dev/shm /dev/hda5 90297 4131 81349 5% /tmp /dev/hda2 5763648 699476 4771388 13% /usr /dev/hda3 1829190 334184 1397412 20% /var /dev/sdc1 2147341696 349228656 1798113040 17% /data3 /dev/sde1 2147341696 2147312400 29296 100% /data4 /dev/sdf1 1264642176 1264614164 28012 100% /data5 /dev/sdd1 1267823104 1009684668 258138436 80% /hfo /dev/sda1 2147341696 2147311888 29808 100% /data1 /dev/sdg1 1953520032 624438272 1329081760 32% /mnt /dev/sdb1 1267823104 657866300 609956804 52% /data2 imadev:/home/wooledg 3686400 3336736 329184 92% /net/home/wooledg svr2:/dsk/4/dsk3/tool/i686Linux2.4.27-4-686 35194552 7856256 25550496 24% /net/appl/tool svr2:/dsk/4/dsk3/tool/share 35194552 7856256 25550496 24% /net/appl/tool-share
So, what can you actually do?
Use the -P switch. Even if it doesn't make everything 100% consistent, it generally doesn't hurt. According to the source code of df.c in Linux coreutils, the -P switch does ensure that the output will be on a single line (but that's only for Linux).
Set your locale to C. You don't need non-English column headers complicating the picture.
Consider using "stat --file-system --format=", if it's available. If portability is not an issue in your case, check the man page of the "stat" command. On many systems you'll be able to print the blocksize, total number of blocks on the disk, and the number of free blocks; all in a user-specified format.
Explicitly select a file system. Don't use df -P | grep /dev/hda2 if you want the results for a specific file system. Give df a directory name or a device name as an argument so you only get that file system's output in the first place.
~$ df -P / Filesystem 1024-blocks Used Available Capacity Mounted on /dev/sda2 8230432 3894360 3917984 50% /
Count words of output without respecting newlines. This is the workaround for lines being split unpredictably. For example, using a Bash array df_arr:
~$ read -d '' -ra df_arr < <(LC_ALL=C df -P /); echo "${df_arr[11]}" 50%
As you can see, we simply slurped the entire output into a single array and then took the 12th word (array indices count from 0). We don't care whether the output got split or not, because that doesn't change the number of words.
Removing the % sign, comparing the number to a specified threshold, scheduling an automatic way to run the script, etc. are left as exercises for you.
First discard the header information and read the data into the array.
{ read -r; read -rd '' -a disk_usage; } < <(LC_ALL=C df -Pk "$dir"; printf \\0); echo "${disk_usage[5]}" 39%
The GNU version of df allows you to filter the df output more efficiently. The output flag --output=avail will display only the available disk space for the designated device. E.g
df --output=source,target,avail,pcent /
This will display the source, target, available space and percentage of the root partition.