How To Find the Largest Files in your Unix system

I see that my Finding Large Files and Directories post is quite popular, yet there are a few more ways to simplify your search for the largest disk space consumers in your Unix system.

Make find command show file sizes

If you remember, by default the find command reports only the path of each matching file, starting from the directory you asked it to search.

Now, if your task is to identify the largest files, it's great to get a list of all the files bigger than some figure you specify, but it's even better to include the exact size of each file right in the output of the find command.

Here's how you do it: the -printf option lets you specify which information about each file you'd like to see. Check out the find command man page for all the possibilities, but in today's example I'm using two parameters: %s means the size of a file in bytes and %p means the filename including its path (%f would give you just the filename without the directories).

Let's say I want to get a list of all the files under the /usr directory which are larger than 15MB each, and show the exact size of each file. Here's how it can be done:

ubuntu$ find /usr -size +15M -printf "%s - %p\n"
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
35487120 - /usr/lib/vmware/bin/vmware-hostd
16351166 - /usr/lib/vmware/bin/vmplayer
38353296 - /usr/lib/vmware/hostd/libtypes.so
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso
23494656 - /usr/lib/vmware/isoimages/windows.iso
47070920 - /usr/lib/libgcj.so.81.0.0
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
17733780 - /usr/share/icons/crystalsvg/icon-theme.cache
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
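If you'd like to compare %p and %f without scanning real system directories, here's a minimal sketch that builds a scratch directory first. The directory and file names (demo_dir, big.bin, small.bin) are made up for illustration, and -printf assumes GNU find, which is what Ubuntu ships.

```shell
# Build a scratch directory with two files of known sizes (hypothetical names).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
head -c 2048 /dev/zero > "$demo_dir/sub/big.bin"   # 2048 bytes
head -c 100 /dev/zero > "$demo_dir/small.bin"      # 100 bytes

# %s = size in bytes, %p = path as find reports it, %f = name without directories.
# -size +1k keeps only big.bin; -type f skips the directories themselves.
with_path=$(find "$demo_dir" -type f -size +1k -printf "%s - %p\n")
with_name=$(find "$demo_dir" -type f -size +1k -printf "%s - %f\n")

echo "$with_path"
echo "$with_name"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

The first echo prints the size followed by the full path under the scratch directory, the second prints the size followed by just big.bin.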

Just to help you refresh your mind, here's the explanation of all the parameters in the command line:

  • /usr is the directory where we'd like to find the files of interest
  • -size +15M narrows our interest to only the files larger than 15MB
  • -printf "%s - %p\n" is the magic which shows the nice list of files along with their sizes.
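One detail of -size worth knowing: GNU find rounds each file size up to whole units before comparing, so +15M really means "more than 15 whole megabyte units after rounding up". Here's a small sketch of that behaviour using 1k units so the test files stay tiny. The names size_dir, exactly_1k and just_over are made up for this example.

```shell
# Two scratch files on either side of the 1k boundary (hypothetical names).
size_dir=$(mktemp -d)
head -c 1024 /dev/zero > "$size_dir/exactly_1k"   # 1024 bytes = exactly one 1k unit
head -c 1025 /dev/zero > "$size_dir/just_over"    # 1025 bytes rounds up to two 1k units

# +1k means "more than one 1k unit" after rounding up, so only just_over matches.
over_1k=$(find "$size_dir" -type f -size +1k -printf "%f\n")

# The c suffix counts plain bytes, so no rounding is involved:
# both files are larger than 1000 bytes and both match.
over_1000_bytes=$(find "$size_dir" -type f -size +1000c -printf "%f\n" | sort)

echo "$over_1k"
echo "$over_1000_bytes"

rm -rf "$size_dir"
```

If you need an exact byte threshold, the c suffix is probably the safer choice.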

Sort the list of files by filesize

The next really useful thing we can do is sort this list, so that we get a nicely ordered view of how big each file is. It's easily done by piping the output of the find command to the sort command, with the -n option telling sort to compare the leading sizes as numbers:

ubuntu$ find /usr -size +15M -printf "%s - %p\n" | sort -n
16351166 - /usr/lib/vmware/bin/vmplayer
17733780 - /usr/share/icons/crystalsvg/icon-theme.cache
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
23494656 - /usr/lib/vmware/isoimages/windows.iso
35487120 - /usr/lib/vmware/bin/vmware-hostd
38353296 - /usr/lib/vmware/hostd/libtypes.so
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso

As you can see, the smallest files (just above 15MB) are at the top of the list, and the largest ones are at the bottom.
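The -n option matters more than it may seem. Here's a short sketch, using three made-up lines in the same "size - path" format, of what happens with and without it. The /tmp paths are only sample text; no files are read.

```shell
# Sample lines in the "size - path" format (hypothetical paths, nothing is read from disk).
sample=$(printf '%s\n' \
    "92143616 - /tmp/a.iso" \
    "16351166 - /tmp/b.bin" \
    "340199772 - /tmp/c.tar.gz")

# Numeric sort: smallest size first, largest last.
numeric=$(printf '%s\n' "$sample" | sort -n)

# Plain text sort: compares character by character, so "340199772" lands before "92143616".
text=$(printf '%s\n' "$sample" | LC_ALL=C sort)

# Numeric sort reversed: largest size first.
largest_first=$(printf '%s\n' "$sample" | sort -nr)

echo "$numeric"
echo "$text"
echo "$largest_first"
```

Without -n the 340MB file ends up in the middle of the list, which is exactly the kind of mistake that's easy to miss in a long listing.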

Limit the number of files returned by find

The last trick I'll show you today is going to make your task even easier: why look through pages of find command output if you're only after the largest files? After all, your list can be much longer than the one shown above. To solve this little problem we'll pipe the sorted output to yet another Unix command, tail.

The tail command shows only the last lines of its standard input or of any text file you point it to. By default, it shows the last 10 lines, which can be enough for your purposes.

Here's how you can get a list of the 10 largest files under /usr:

ubuntu$ find /usr -size +15M -printf "%s - %p\n" | sort -n | tail
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
23494656 - /usr/lib/vmware/isoimages/windows.iso
35487120 - /usr/lib/vmware/bin/vmware-hostd
38353296 - /usr/lib/vmware/hostd/libtypes.so
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso
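If 10 lines isn't the number you want, tail takes a -n option, and head does the same job when the list is sorted largest first. A minimal sketch with twelve made-up lines (the /tmp/fileN paths are only sample text):

```shell
# Twelve sample lines in the "size - path" format (hypothetical paths).
listing=$(for n in 5 12 3 9 1 7 11 2 10 4 8 6; do
    printf '%s\n' "${n}000 - /tmp/file$n"
done)

# Three largest entries, biggest at the bottom.
top3=$(printf '%s\n' "$listing" | sort -n | tail -n 3)

# Same three entries, biggest at the top.
top3_desc=$(printf '%s\n' "$listing" | sort -nr | head -n 3)

# tail without -n keeps the last 10 of the 12 lines.
default_count=$(printf '%s\n' "$listing" | sort -n | tail | wc -l)

echo "$top3"
echo "$top3_desc"
echo "$default_count"
```

Which of the two forms to use is mostly a matter of taste: sort -n with tail keeps the biggest file right above your prompt, sort -nr with head puts it first.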

Show the largest 10 files in your Unix system

Now that you know all the most useful tricks, you can easily identify and show the list of the 10 largest files in your whole system. Bear in mind that you should probably run this command with root privileges: files in your system belong to various users, and a single standard user account will most likely not have sufficient privileges to even list some of them.

If you're trying to locate your largest files in Ubuntu, use the sudo command (assuming you have the sudo privileges to become root):

ubuntu$ sudo find / -size +15M -printf "%s - %p\n" | sort -n | tail

Alternatively, just become root by doing something like this (you obviously need to know the root password to do that):

$ su - root

and then run the find command itself. Here's how the output looks on my Ubuntu desktop:

ubuntu$ find / -size +15M -printf "%s - %p\n" | sort -n | tail
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
45356784 - /var/cache/apt/archives/linux-source-2.6.22_2.6.22-14.52_all.deb
45424028 - /var/cache/apt/archives/kde-icons-oxygen_4%3a4.0.2-0ubuntu1~gutsy1~ppa1_all.deb
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /export/dist/vmware/server2b2/vmware-server-distrib/lib/hostd/docroot/client/VMware-viclient.exe
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /export/dist/vmware/server2b2/vmware-server-distrib/lib/isoimages/linux.iso
92143616 - /usr/lib/vmware/isoimages/linux.iso
340199772 - /export/dist/vmware/server2b2/VMware-server-e.x.p-63231.x86_64.tar.gz
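If you run this kind of search often, it may be worth wrapping the pipeline in a small shell function. The sketch below is only one possible way to do it: largest_files is a made-up name, and the extra options are my additions rather than part of the commands above. It also leaves out the -size filter, since tail already keeps only the biggest entries.

```shell
# largest_files DIR [COUNT]: hypothetical helper, not a standard command.
# Prints the COUNT largest regular files under DIR (default 10), biggest last.
largest_files() {
    dir=$1
    count=${2:-10}
    # -xdev stays on the filesystem DIR lives on (so /proc and other mounts are skipped);
    # -type f ignores directories, devices and other non-regular files;
    # 2>/dev/null hides "Permission denied" messages when not running as root.
    find "$dir" -xdev -type f -printf "%s - %p\n" 2>/dev/null | sort -n | tail -n "$count"
}
```

After defining it in a root shell, largest_files / should give a listing similar to the one above, and largest_files /var 5 would limit it to the five largest files under /var.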

That's it for today, hope this helps! Please bookmark this post if you liked it, and leave comments if there are any questions!

  • ThomasU

    The quotes in the command lines prevent us from copy-pasting them effectively. After copy pasting, one should replace the quotes with "

  • Gleb Reys

    That's a valid point, Thomas. Thanks for the comment! I'll see if this can be fixed.

  • Gleb Reys

    Thomas, this is fixed now. Thanks for letting me know!

  • scott

    The terminal on my jailbroken iPhone didn't have a man find, so this worked great! thanks

  • http://inpantuflas.com.ar Ezequiel

    Very useful. Thanks!

  • Matt

    After several search queries I finally found this. This has helped me so much. Thank you!