@shivabhusal
Last active September 3, 2025 04:19
Find duplicate video files while backing up your mobile videos to drives
# This script scans media files in its own directory and a subdirectory named "lot3".
# It compares the duration of files in "lot3" against files in the parent directory.
# If a file in "lot3" matches a parent file by name (case-insensitive) and their durations
# differ by no more than DURATION_TOLERANCE seconds, the "lot3" file is renamed to
# "delete_<original_name>" to mark it as a duplicate.
#
# Dependencies:
# - Uses macOS `mdls` command to read media duration metadata.
# - Requires Ruby standard library 'shellwords'.
#
# Constants:
# - FOLDER: Directory where the script resides.
# - LOT3: Subdirectory "lot3" inside FOLDER.
# - DURATION_TOLERANCE: Allowed difference in duration (seconds) to consider files as duplicates.
#
# Functions:
# - media_duration(path): Returns the duration (in seconds) of a media file using `mdls`.
#
# Workflow:
# 1. Collects all non-hidden, non-directory files in FOLDER, storing their durations.
# 2. Iterates over files in LOT3, skipping hidden files and directories.
# 3. For each LOT3 file, checks for a matching parent file by name and compares durations.
# 4. If durations are within tolerance, renames the LOT3 file to "delete_<original_name>".
# 5. Handles errors and skips renaming if the target name already exists.
#!/usr/bin/env ruby
# Target folder = the folder where this script resides
FOLDER = __dir__
LOT3 = File.join(FOLDER, "lot3")
# Allowed difference in duration (seconds)
DURATION_TOLERANCE = 0.5
require "shellwords"
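# Returns the duration in seconds as a Float, or nil when mdls reports none.
# The parsing expects output of the form "kMDItemDurationSeconds = <value>",
# where <value> is "(null)" if the attribute is missing.
def media_duration(path)
  output = `mdls -name kMDItemDurationSeconds #{Shellwords.escape(path)} 2>/dev/null`.strip
  return nil if output.empty? || !output.include?("=")
  value = output.split("=").last.strip
  value == "(null)" ? nil : value.to_f
end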
# Collect parent files with their duration
parent_files = {}
Dir.children(FOLDER).each do |fname|
  next if fname.start_with?(".") # skip hidden files
  path = File.join(FOLDER, fname)
  # The directory check already excludes lot3 itself. A prefix test against LOT3
  # would also skip ordinary parent files whose names begin with "lot3".
  next if File.directory?(path)
  duration = media_duration(path)
  next unless duration # skip if mdls reports no duration
  parent_files[fname.downcase] = duration
end
# Check files in lot3
Dir.children(LOT3).each do |fname|
  next if fname.start_with?(".") # skip hidden files
  path = File.join(LOT3, fname)
  next if File.directory?(path)
  parent_duration = parent_files[fname.downcase]
  next unless parent_duration
  lot3_duration = media_duration(path)
  next unless lot3_duration
  diff = (lot3_duration - parent_duration).abs
  if diff <= DURATION_TOLERANCE
    new_name = "delete_" + fname
    new_path = File.join(LOT3, new_name)
    if !File.exist?(new_path)
      begin
        File.rename(path, new_path)
        puts "Renamed duplicate: #{fname} -> #{new_name} (duration diff: #{diff.round(3)}s)"
      rescue => e
        warn "Failed to rename #{path}: #{e.message}"
      end
    else
      warn "Skipped #{fname}, target already exists: #{new_name}"
    end
  end
end
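
The parsing and comparison steps of the script above can be tried without macOS or real media files. The following is a minimal sketch, not part of the original gist: the helper names `parse_mdls_duration` and `duplicate_duration?` are hypothetical, and the sample strings assume the `kMDItemDurationSeconds = <value>` output shape that the script's own parsing implies.

```ruby
# Hypothetical helpers for illustration; they mirror the logic of
# media_duration and the tolerance check, but take plain values as input.

# Parses one line of mdls-style output and returns seconds as a Float,
# or nil when the value is missing, "(null)", or not numeric.
def parse_mdls_duration(output)
  line = output.to_s.strip
  return nil unless line.include?("=")
  value = line.split("=", 2).last.strip
  return nil if value == "(null)"
  Float(value, exception: false)
end

# True when two durations differ by no more than the tolerance (seconds).
def duplicate_duration?(a, b, tolerance: 0.5)
  (a - b).abs <= tolerance
end
```

Unlike `to_f`, `Float(value, exception: false)` returns nil for non-numeric text instead of 0.0, so an unreadable value is skipped rather than compared as a zero-length file. Whether that stricter behavior is wanted is a design choice, not something the original script specifies.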
# This script moves files from the "lot3" subdirectory into the current directory,
# renaming each file to include its creation timestamp (macOS birthtime).
# It skips files starting with "delete_" (assumed duplicates) and hidden files.
# If a file with the new name already exists, a counter is appended to avoid overwriting.
#
# Steps:
# - Iterates over each file in "lot3" (excluding directories, duplicates, and hidden files).
# - Extracts the file's creation time and formats it as "YYYYMMDD_HHMMSS".
# - Renames the file to "originalname_timestamp.ext".
# - If a naming conflict occurs, appends an incrementing counter.
# - Moves the file to the current directory and prints the operation.
#
# Dependencies:
# - Requires Ruby's "fileutils" standard library.
#
# Usage:
# ruby movefiles.rb
#!/usr/bin/env ruby
require "fileutils"
FOLDER = __dir__
LOT3 = File.join(FOLDER, "lot3")
Dir.children(LOT3).each do |fname|
  next if fname.start_with?("delete_") # skip duplicates
  next if fname.start_with?(".") # skip hidden/AppleDouble files
  path = File.join(LOT3, fname)
  next if File.directory?(path)
  # Get file creation datetime (macOS: birthtime); this is not the same as File::Stat#ctime
  birthtime = File.stat(path).birthtime
  timestamp = birthtime.strftime("%Y%m%d_%H%M%S")
  # Split name and extension
  base = File.basename(fname, ".*")
  ext = File.extname(fname)
  new_name = "#{base}_#{timestamp}#{ext}"
  new_path = File.join(FOLDER, new_name)
  # Ensure we don't overwrite existing files
  counter = 1
  while File.exist?(new_path)
    new_name = "#{base}_#{timestamp}_#{counter}#{ext}"
    new_path = File.join(FOLDER, new_name)
    counter += 1
  end
  # Move the file
  FileUtils.mv(path, new_path)
  puts "Moved: #{fname} -> #{new_name}"
end
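
The naming scheme used by the move script (timestamp suffix, then an incrementing counter on conflict) can be separated from the filesystem so it is easy to check in isolation. This is a sketch under stated assumptions, not the author's code: `timestamped_name` and `unique_name` are hypothetical names, and the existence check is passed in as a block instead of calling `File.exist?` directly.

```ruby
# Hypothetical helpers for illustration; the original script inlines this logic.

# Builds "originalname_YYYYMMDD_HHMMSS.ext" from a file name and a Time.
def timestamped_name(fname, time)
  base = File.basename(fname, ".*")
  ext = File.extname(fname)
  "#{base}_#{time.strftime('%Y%m%d_%H%M%S')}#{ext}"
end

# Returns `name` if the block reports it as free; otherwise appends
# _1, _2, ... before the extension until the block reports a free name.
def unique_name(name)
  return name unless yield(name)
  base = File.basename(name, ".*")
  ext = File.extname(name)
  counter = 1
  counter += 1 while yield("#{base}_#{counter}#{ext}")
  "#{base}_#{counter}#{ext}"
end
```

In the script's setting the block would be something like `{ |n| File.exist?(File.join(FOLDER, n)) }`. Passing an in-memory list instead makes the conflict handling testable without touching any files.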