"""
A script for deleting large numbers of duplicates from the output .txt file of
the open source Duplicate Files Finder application (https://sourceforge.net/projects/doubles/).
Given a list of duplicate files, the one with the shortest path (by character count)
is kept and all the rest are deleted. If several share the shortest length, the one
that sorts first alphabetically is kept.
I used this to reduce a heavily duplicated picture archive from 121 GB to 57 GB. There
wasn't really a best way to decide which to delete so the "least path" logic above was
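The "least path" rule described above can be sketched in a few lines. This is only an illustration, not the script's actual code: the function names `choose_keeper` and `split_group` are hypothetical, the sample paths are made up, and "alphabetically" is assumed to mean Python's default (case-sensitive, code-point) string ordering. Parsing the Duplicate Files Finder output and the actual deletion are deliberately left out.

```python
def choose_keeper(paths):
    """Return the path to keep: shortest by character count,
    with ties broken by default string ordering (an assumption)."""
    return min(paths, key=lambda p: (len(p), p))


def split_group(paths):
    """Split one group of duplicate paths into (keeper, paths_to_delete)."""
    keeper = choose_keeper(paths)
    to_delete = [p for p in paths if p != keeper]
    return keeper, to_delete


if __name__ == "__main__":
    # Made-up example group; nothing is deleted, the decision is only printed.
    group = ["/pics/2010/holiday/img1.jpg", "/pics/img2.jpg", "/pics/img1.jpg"]
    keeper, to_delete = split_group(group)
    print("keep:", keeper)
    for path in to_delete:
        print("delete:", path)
```

Sorting on the tuple `(len(p), p)` keeps both criteria in a single comparison, so the tie-break only matters when two paths have the same length.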
const fs = require('fs');

// Parse the HAR dump; the entries live under the top-level `log` object.
const file = JSON.parse(fs.readFileSync('./dump.har', 'utf8')).log;
const targetMimeType = 'image/jpeg';
let count = 1;
for (const entry of file.entries) {
  if (entry.response.content.mimeType === targetMimeType) {
    // ensure output directory exists before running!
    // Buffer.from replaces the deprecated `new Buffer(...)`; the body is assumed to be base64-encoded.
    // The files are JPEGs, so they are saved as .jpg rather than .png.
    fs.writeFileSync(`output/${count}.jpg`, Buffer.from(entry.response.content.text, 'base64'));
    count++;
  }
}