Monday, February 15, 2010

Linux fdupes: Get Rid Of (Delete) Duplicate Files In A Directory

How do I find duplicate files in a given set of directories and delete them using a shell script or command-line options?

How do I get rid of duplicate files stored in the ~/foo and /u2/foo directories?

You need to use a tool called fdupes. It searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a nice tool to get rid of duplicate files.
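
To make the hashing step above concrete, here is a minimal sketch that groups files by MD5 checksum using standard tools. It is not how fdupes itself is implemented: it skips the file-size pre-filter and the final byte-by-byte check, and it assumes GNU coreutils (md5sum, and uniq with the -w and --all-repeated options). The function name list_dupes is made up for this example.

```shell
#!/bin/sh
# list_dupes DIR
# Hypothetical helper (not part of fdupes): print files under DIR that share
# an MD5 checksum, one "checksum  path" line each, with groups separated by
# a blank line. Assumes GNU md5sum and GNU uniq.
list_dupes() {
    find "$1" -type f -exec md5sum {} + |
        sort |
        uniq -w32 --all-repeated=separate   # compare only the 32-char hash
}

# Example usage:
#   list_dupes /etc
```

Because it trusts the checksum alone, treat the result as a list of candidates and let fdupes (or cmp) confirm before deleting anything.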


Install fdupes
Type the following command under Debian / Ubuntu Linux:

# apt-get install fdupes

Under Red Hat / RHEL / Fedora / CentOS Linux, turn on the rpmforge repo and then type the following yum command:


# yum install fdupes


How Do I Use fdupes?
To find duplicate files in the /etc/ directory, enter:


# fdupes /etc


Sample outputs:
/etc/vimrc
/etc/virc

How Do I Delete Unwanted Files?
You can force fdupes to prompt you for files to preserve, deleting all others (use this with care, otherwise you may lose data):


# fdupes -d /etc


Sample outputs:
[1] /etc/vimrc
[2] /etc/virc

Set 1 of 1, preserve files [1 - 2, all]: 1

   [+] /etc/vimrc
   [-] /etc/virc
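
Since deleting is risky, it can help to preview what would go before running fdupes -d. The sketch below reads plain fdupes output on standard input and prints every path except the first one in each set, which is what you would lose if you kept file [1] every time. It assumes the default output layout shown earlier (one path per line, sets separated by a blank line); the function name would_delete is made up for this example.

```shell
#!/bin/sh
# would_delete
# Hypothetical helper (not part of fdupes): read fdupes-style output on stdin
# and print all paths except the first of each duplicate set. Deletes nothing.
would_delete() {
    awk 'BEGIN { first = 1 }
         /^$/  { first = 1; next }   # a blank line starts a new set
         first { first = 0; next }   # keep (skip) the first path of the set
               { print }             # every other path is a delete candidate
    '
}

# Example usage:
#   fdupes /etc | would_delete
```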

How Do I Recursively Search A Directory?
For every directory given, fdupes can follow the subdirectories encountered within it when you pass the -r option, enter:


# fdupes -r /dir1


How Do I Find Dupes In Two Directories?
Type the command as follows:


# fdupes /dir1 /dir2


OR

# fdupes -r /etc /data/etc /nas95/etc


How Do I See Size Of Duplicate Files?
Type the following command with the -S option:


# fdupes -S /etc


Sample outputs:
1533 bytes each:
/etc/vimrc
/etc/virc
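
The -S output can also be totalled to estimate how much space the extra copies take. This is a rough sketch that assumes the layout shown above (a "N bytes each:" header followed by the paths in that set) and counts every file after the first in each set as reclaimable. The function name wasted_bytes is made up for this example.

```shell
#!/bin/sh
# wasted_bytes
# Hypothetical helper (not part of fdupes): read "fdupes -S" style output on
# stdin and print the total size in bytes of all redundant copies.
wasted_bytes() {
    awk '/ bytes? each:$/ { size = $1; n = 0; next }   # header of a new set
         /^$/             { next }                      # separator line
                          { if (n > 0) total += size; n++ }
         END              { print total + 0 }
    '
}

# Example usage:
#   fdupes -S /etc | wasted_bytes
```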

Further readings:
  • man page fdupes

4 comments:

  1. JAK, Good post, but is it like Linux/Unix Command: "uniq"?

  2. No. Uniq does this on lines not on files.

    Check the examples section of http://www.opengroup.org/onlinepubs/000095399/utilities/uniq.html
