
[Perl] Duplicate File Finder

Status
Not open for further replies.

sQuo

Here's a duplicate file finder Perl script that I've modified a few times. It walks the directory given as the first argument (or the current directory if none is given), groups files of the same size, and compares the MD5 hashes within each group to find duplicates. This is very useful if you have copies of music or other files in the same directory tree under different names: the script will detect those copies, and you can delete them manually to free up some space on your hard drive.
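
To show the size-then-hash idea on its own, here is a minimal sketch that runs on made-up in-memory "files" instead of a real directory. The %content hash and its file names are assumptions for illustration only; the full script below reads real files through File::Find.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical in-memory "files": name => content (illustration only).
my %content = (
    'a.mp3'     => 'same song',
    'copy.mp3'  => 'same song',
    'b.mp3'     => 'other one',   # same length as the two above, different bytes
    'notes.txt' => 'short',
);

# Step 1: bucket names by size; a file with a unique size cannot have a duplicate.
my %by_size;
push @{ $by_size{ length($content{$_}) } }, $_ for sort keys %content;

# Step 2: hash only the buckets that hold more than one entry.
my @groups;
for my $size (sort { $b <=> $a } keys %by_size) {
    my @names = @{ $by_size{$size} };
    next unless @names > 1;
    my %by_md5;
    push @{ $by_md5{ md5_hex($content{$_}) } }, $_ for @names;
    push @groups, grep { @$_ > 1 } values %by_md5;
}

print "Duplicates: @$_\n" for @groups;
```

Only the three 9-byte entries get hashed here; notes.txt is skipped because nothing else shares its size, which is what keeps the approach cheap on large directories.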

I didn't make it remove files automatically, so that you can decide for yourself whether or not to delete them.

Code:
#!/usr/bin/perl -w

use strict;
use File::Find;
use Digest::MD5;

my %files;
my $wasted = 0;
find(\&check_file, $ARGV[0] || ".");

local $" = "\n";
foreach my $size (sort {$b <=> $a} keys %files) {
 next unless @{$files{$size}} > 1;
 my %md5;
 foreach my $file (@{$files{$size}}) {
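   # Three-argument open with a lexical handle, so odd file names are not parsed as modes
   open(my $fh, '<', $file) or next;
   binmode($fh);
   push @{$md5{Digest::MD5->new->addfile($fh)->hexdigest}}, $file;
   close($fh);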
 }
 foreach my $hash (keys %md5) {
   next unless @{$md5{$hash}} > 1;
   print "\n";
   print "\n";
   print "($size bytes) Duplicate Files:\n";
   print "@{$md5{$hash}}\n";
   print "\n";
   $wasted += $size * (@{$md5{$hash}} - 1);
 }
}

1 while $wasted =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
print "\n";
print "######################################################\n";
print "                                                    \n";
print "  You have $wasted bytes total in duplicate files   \n";
print "                                                   \n";
print "######################################################\n";
print "\n";

sub check_file {
 -f && push @{$files{(stat(_))[7]}}, $File::Find::name;
}
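
The `1 while ... s///` line near the end of the script is what inserts the thousands separators into $wasted. A small sketch of the same substitution wrapped in a helper, to make it easier to try on its own (the name commify is my own choice here, not something the script defines):

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the same substitution loop used in the script.
sub commify {
    my ($n) = @_;
    # Each pass inserts one comma, working from the right towards the left;
    # the loop stops once the leading run of digits has three digits or fewer.
    1 while $n =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $n;
}

print commify(1234567), "\n";   # prints 1,234,567
print commify(999), "\n";       # prints 999
```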
 