Last active
September 3, 2025 04:19
Find duplicate video files while backing up your mobile videos to drives
#!/usr/bin/env ruby
# This script scans media files in its own directory and in a subdirectory named "lot3".
# It compares the duration of each file in "lot3" against the same-named file in the parent directory.
# If a file in "lot3" matches a parent file by name (case-insensitive) and their durations
# differ by no more than DURATION_TOLERANCE seconds, the "lot3" file is renamed to
# "delete_<original_name>" to mark it as a duplicate.
#
# Dependencies:
# - macOS `mdls` command, used to read media duration metadata.
# - Ruby standard library "shellwords".
#
# Constants:
# - FOLDER: Directory where the script resides.
# - LOT3: Subdirectory "lot3" inside FOLDER.
# - DURATION_TOLERANCE: Allowed difference in duration (seconds) to consider files as duplicates.
#
# Functions:
# - media_duration(path): Returns the duration (in seconds) of a media file using `mdls`,
#   or nil if no duration is available.
#
# Workflow:
# 1. Collects all non-hidden, non-directory files in FOLDER that have a readable duration.
# 2. Iterates over files in LOT3, skipping hidden files and directories.
# 3. For each LOT3 file, looks up a parent file with the same name and compares durations.
# 4. If the durations are within tolerance, renames the LOT3 file to "delete_<original_name>".
# 5. Reports rename errors and skips renaming if the target name already exists.
require "shellwords"

# Target folder = the folder where this script resides
FOLDER = __dir__
LOT3 = File.join(FOLDER, "lot3")

# Allowed difference in duration (seconds)
DURATION_TOLERANCE = 0.5

# Returns the duration of a media file in seconds, read via macOS `mdls`,
# or nil when no duration metadata is available.
def media_duration(path)
  output = `mdls -name kMDItemDurationSeconds #{Shellwords.escape(path)} 2>/dev/null`.strip
  return nil if output.empty? || !output.include?("=")

  # Expected output shape: "kMDItemDurationSeconds = 12.345"
  value = output.split("=").last.strip
  value == "(null)" ? nil : value.to_f
end

# Collect parent files with their duration, keyed by lowercased file name
parent_files = {}
Dir.children(FOLDER).each do |fname|
  next if fname.start_with?(".") # skip hidden files

  path = File.join(FOLDER, fname)
  # Skipping directories also excludes the "lot3" subdirectory itself
  next if File.directory?(path)

  duration = media_duration(path)
  next unless duration # skip if mdls reports no duration

  parent_files[fname.downcase] = duration
end

# Check files in lot3
Dir.children(LOT3).each do |fname|
  next if fname.start_with?(".") # skip hidden files

  path = File.join(LOT3, fname)
  next if File.directory?(path)

  parent_duration = parent_files[fname.downcase]
  next unless parent_duration

  lot3_duration = media_duration(path)
  next unless lot3_duration

  diff = (lot3_duration - parent_duration).abs
  next unless diff <= DURATION_TOLERANCE

  new_name = "delete_" + fname
  new_path = File.join(LOT3, new_name)
  if File.exist?(new_path)
    warn "Skipped #{fname}, target already exists: #{new_name}"
    next
  end

  begin
    File.rename(path, new_path)
    puts "Renamed duplicate: #{fname} -> #{new_name} (duration diff: #{diff.round(3)}s)"
  rescue => e
    warn "Failed to rename #{path}: #{e.message}"
  end
end
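
The duplicate check above depends on macOS `mdls`, so it is hard to try out elsewhere. As a minimal sketch, the parsing and the tolerance comparison can be pulled into pure functions that run on any platform. The names `parse_mdls_duration` and `durations_match?` are hypothetical and not part of the original script; the input strings assume the `key = value` output shape that the script itself parses.

```ruby
# Hypothetical helper (not in the original script): extracts the duration
# from a line shaped like "kMDItemDurationSeconds = 12.345".
# Returns nil for empty output, a missing "=", "(null)", or a non-numeric value.
def parse_mdls_duration(output)
  line = output.to_s.strip
  return nil unless line.include?("=")

  value = line.split("=", 2).last.strip
  return nil if value == "(null)"

  Float(value, exception: false)
end

# Hypothetical helper: true when two durations differ by at most `tolerance` seconds.
def durations_match?(a, b, tolerance = 0.5)
  (a - b).abs <= tolerance
end

parent = parse_mdls_duration("kMDItemDurationSeconds = 10.0")
copy   = parse_mdls_duration("kMDItemDurationSeconds = 10.4")
puts durations_match?(parent, copy) if parent && copy
```

Unlike `to_f`, `Float(value, exception: false)` returns nil for unexpected text instead of silently yielding 0.0, which may be preferable when a wrong match leads to a file being marked for deletion.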
#!/usr/bin/env ruby
# This script moves files from the "lot3" subdirectory into the script's own directory,
# renaming each file to include its creation timestamp (macOS birthtime).
# It skips files starting with "delete_" (assumed duplicates) and hidden files.
# If a file with the new name already exists, a counter is appended to avoid overwriting.
#
# Steps:
# - Iterates over each file in "lot3" (excluding directories, duplicates, and hidden files).
# - Extracts the file's creation time and formats it as "YYYYMMDD_HHMMSS".
# - Renames the file to "originalname_timestamp.ext".
# - If a naming conflict occurs, appends an incrementing counter.
# - Moves the file to the script's directory and prints the operation.
#
# Dependencies:
# - Requires Ruby's "fileutils" standard library.
#
# Usage:
#   ruby movefiles.rb
require "fileutils"

FOLDER = __dir__
LOT3 = File.join(FOLDER, "lot3")

Dir.children(LOT3).each do |fname|
  next if fname.start_with?("delete_") # skip files marked as duplicates
  next if fname.start_with?(".")       # skip hidden/AppleDouble files

  path = File.join(LOT3, fname)
  next if File.directory?(path)

  # Get file creation datetime (macOS: birthtime)
  birthtime = File.stat(path).birthtime
  timestamp = birthtime.strftime("%Y%m%d_%H%M%S")

  # Split name and extension
  base = File.basename(fname, ".*")
  ext = File.extname(fname)
  new_name = "#{base}_#{timestamp}#{ext}"
  new_path = File.join(FOLDER, new_name)

  # Ensure we don't overwrite existing files
  counter = 1
  while File.exist?(new_path)
    new_name = "#{base}_#{timestamp}_#{counter}#{ext}"
    new_path = File.join(FOLDER, new_name)
    counter += 1
  end

  # Move the file
  FileUtils.mv(path, new_path)
  puts "Moved: #{fname} -> #{new_name}"
end
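
The naming scheme used above (timestamp suffix, then an incrementing counter on conflict) can be sketched as two small functions that take the time as an argument, so they can be tried without relying on macOS birthtime. The names `timestamped_name` and `unique_destination` are hypothetical and do not appear in the original script.

```ruby
# Hypothetical helper: builds "base_YYYYMMDD_HHMMSS[_counter].ext" from a file name and a Time.
def timestamped_name(fname, time, counter = nil)
  base = File.basename(fname, ".*")
  ext = File.extname(fname)
  suffix = counter ? "_#{counter}" : ""
  "#{base}_#{time.strftime('%Y%m%d_%H%M%S')}#{suffix}#{ext}"
end

# Hypothetical helper: returns the first path in `dir` that does not exist yet,
# trying the plain timestamped name first and then counters 1, 2, 3, ...
def unique_destination(dir, fname, time)
  candidate = File.join(dir, timestamped_name(fname, time))
  counter = 1
  while File.exist?(candidate)
    candidate = File.join(dir, timestamped_name(fname, time, counter))
    counter += 1
  end
  candidate
end
```

A caller would pass `File.stat(path).birthtime` as the time and hand the result to `FileUtils.mv`. The existence check and the move are separate steps, so this sketch assumes nothing else writes to the directory in between.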