@zegomesjf
Forked from urubatan/remove_dups.rb
Created June 18, 2012 16:18

Revisions

  1. @urubatan urubatan created this gist May 24, 2012.
    33 changes: 33 additions & 0 deletions remove_dups.rb
    @@ -0,0 +1,33 @@
    require 'digest/sha1'
    require 'fileutils'
    directories = [
    "SOURCE DIR 1",
    "SOURCE DIR 2"
    ]
    files = {}
    # Group every file path under the SHA1 digest of its content
    directories.each do |dir_name|
      puts "Scanning Directory: #{dir_name}"
      # "**/*" also matches files that have no extension
      Dir.glob("#{dir_name}/**/*") do |file_name|
        next unless File.file?(file_name)
        print "."
        # Digest::SHA1.file streams the file instead of loading it whole
        dig = Digest::SHA1.file(file_name).hexdigest
        (files[dig] ||= []) << file_name
      end
      puts ""
    end
    total_files = files.values.inject(0) { |acum, paths| acum + paths.size }
    with_copies = files.select { |_digest, paths| paths.length > 1 }
    puts "#{files.size} different files"
    puts "#{with_copies.size} files with copies"
    puts "#{total_files - files.size} duplicates"
    FileUtils.mkdir_p "CopiesTrash"
    with_copies.each do |_digest, paths|
      orig = paths.pop # keep the last path found as the surviving original
      puts "moving #{paths.length} copy(ies) of #{orig} to CopiesTrash"
      FileUtils.mv paths, "CopiesTrash", :force => true
      puts ""
    end
    puts "Your directories have been cleaned of duplicate files; all the trash is in the CopiesTrash folder"
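
The grouping step in the script above can also be pulled out into methods so it can be checked before anything is moved. The following is a minimal sketch, not part of the gist: the method names `group_by_digest` and `duplicate_groups` are hypothetical, and it assumes files are compared by SHA1 digest of their content only, as the script does.

```ruby
require 'digest/sha1'

# Hypothetical helper (not in the gist): maps each SHA1 digest to the list
# of file paths under the given directories whose content has that digest.
def group_by_digest(directories)
  groups = Hash.new { |hash, key| hash[key] = [] }
  directories.each do |dir_name|
    Dir.glob("#{dir_name}/**/*") do |path|
      next unless File.file?(path)
      groups[Digest::SHA1.file(path).hexdigest] << path
    end
  end
  groups
end

# Hypothetical helper: keeps only the groups that hold more than one path,
# i.e. content that exists in at least two places.
def duplicate_groups(groups)
  groups.select { |_digest, paths| paths.size > 1 }
end
```

Printing the result of `duplicate_groups(group_by_digest(directories))` gives a dry run: it lists which paths would be treated as copies without touching the file system.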