Last active
August 29, 2015 14:04
Improved by setting a different S3 DEFAULT_HOST and by reading the bucket name and keys from environment variables, which can be set on the command line.
# This is a quick script for doing a mass rename of all files in an
# Amazon S3 bucket.
# The header originally described unescaping filenames which had been
# escaped in error; this version renames keys matching RE (defined below)
# to "scrape-<id>.tar.gz".
#############################
# Configuration:
bucketname = ENV.fetch('S3_BUCKET_NAME')
access_key = ENV.fetch('S3_ACCESS_KEY_ID')
secret_key = ENV.fetch('S3_SECRET_ACCESS_KEY')
#############################
require 'rubygems'
require 'aws/s3'
AWS::S3::DEFAULT_HOST.replace 's3-eu-west-1.amazonaws.com' # You may have to change this.
include AWS::S3
require 'cgi'

# Pattern for the keys to rename. The single capture group is the ID that is
# reused in the new name. It is specific to the original data set, so replace
# it with your own.
RE = /\d{4}\-\d{1,2}\-\d{1,2}_\d{10}_\d+_google_[a-z\.]+_(\d+)\.tar\.gz/
def new_name(old_name)
  search_id = RE.match(old_name).captures.first
  "scrape-#{search_id}.tar.gz"
end
Base.establish_connection!(access_key_id: access_key,
                           secret_access_key: secret_key)

b = Bucket.find(bucketname)
marker = ''
# Walk the bucket one page at a time, using the last key of each page as the
# marker for the next request.
while b.size > 0
  puts "\n\n--------------------new page----------------------"
  puts "\n From marker #{marker}"
  puts "\n\n--------------------------------------------------"
  b.each do |s3o|
    next unless RE =~ s3o.key
    begin
      S3Object.copy(s3o.key, new_name(s3o.key), bucketname)
      puts "Copied #{s3o.key} to #{new_name(s3o.key)}"
      # Uncomment this if you're feeling confident and want to delete the key
      # s3o.delete
    rescue => e
      puts "\n\n @@@@@@@@@@@@ EXCEPTION on key #{s3o.key} \n\n"
      puts e.message
      puts '@@@@@@@@@@@@@@'
      next
    end
  end
  marker = b.objects.last.key
  b = Bucket.find(bucketname, marker: marker)
end
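
Before running the copy against a real bucket, it may help to preview what the pattern would do. The following is a minimal sketch, not part of the original script: it reuses the same regular expression (with the redundant escapes dropped) and the same "scrape-<id>.tar.gz" naming, but works on a plain array of key strings and never contacts S3. The names KEY_PATTERN, planned_name, rename_plan and collisions are hypothetical, and the sample keys are made up. The collision check is there because two old keys with the same trailing ID would be copied to the same new key.

```ruby
# Same pattern as RE in the script; the capture group is the ID to keep.
KEY_PATTERN = /\d{4}-\d{1,2}-\d{1,2}_\d{10}_\d+_google_[a-z.]+_(\d+)\.tar\.gz/

# Hypothetical helper: returns the new key, or nil when the key does not match.
def planned_name(old_key)
  match = KEY_PATTERN.match(old_key)
  match && "scrape-#{match[1]}.tar.gz"
end

# Hypothetical helper: builds [old_key, new_key] pairs without touching S3.
def rename_plan(keys)
  keys.map { |key| [key, planned_name(key)] }
      .reject { |_, new_key| new_key.nil? }
end

# Hypothetical helper: new keys that more than one old key would map to.
def collisions(plan)
  plan.group_by { |_, new_key| new_key }
      .select { |_, pairs| pairs.size > 1 }
      .keys
end

# Made-up sample keys, for illustration only.
sample_keys = [
  '2015-8-29_1440856800_3_google_co.uk_42.tar.gz',
  'notes.txt',
  '2015-08-30_1440943200_7_google_com_42.tar.gz',
  '2015-08-30_1440943200_8_google_com_99.tar.gz'
]

plan = rename_plan(sample_keys)
plan.each { |old_key, new_key| puts "#{old_key} -> #{new_key}" }
puts "Collisions: #{collisions(plan).inspect}"
```

If the collision list is not empty, the trailing ID alone is probably not enough to build a unique new name for this data set.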