Last active
August 29, 2015 14:07
Iterate through matched patterns in a document, perform a function on each match, and write the result to file
require 'digest/md5'

# Normalise an email address and return its MD5 hex digest.
# Named hash_email so it does not override Object#hash.
def hash_email(email)
  Digest::MD5.hexdigest(email.to_s.downcase.strip)
end

desc "Replace email addresses in remarks with md5 hashed strings"
task :hash do
  email_regex = /([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4}))/i
  FileList.new('source/_data/remarks/*.yml').each do |path|
    contents = File.read(path, encoding: 'utf-8')
    # gsub yields each whole match to the block and substitutes the result.
    replaced = contents.gsub(email_regex) { |m| hash_email(m) }
    File.write(path, replaced)
  end
end
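The helper lowercases and strips the address before hashing, so differently typed versions of the same address end up with one digest. A minimal sketch of that behaviour follows; the `hash_email` name is a hypothetical stand-in for the helper in the task, and the addresses are made up for illustration.

```ruby
require 'digest/md5'

# Hypothetical stand-in for the task's helper: normalise, then hash.
def hash_email(email)
  Digest::MD5.hexdigest(email.to_s.downcase.strip)
end

a = hash_email('John.Doe@Example.com ')
b = hash_email('john.doe@example.com')

puts a == b    # case and surrounding whitespace do not change the digest
puts a.length  # an MD5 hex digest is 32 characters long
```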
- author: John Doe
  email: [email protected]
  date: '2014-09-30 10:44:46 +0100'
  url: example.com
  content: |
    When are you going to San Francisco? I think there is a chance that I will be around there too!
- author: Mary Smith
  email: [email protected]
  date: '2014-09-30 10:57:27 +0100'
  url: example.com
  content: |
    I'll be there in August.
- author: David Smith
  email: [email protected]
  date: '2014-10-01 00:47:05 +0100'
  content: |
    I'll be around. Fancy meeting up?
Actually that scan is dumb, just
contents.gsub!(@email_regex) {|m| hash m }
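A minimal sketch of the block form of `gsub!` suggested here, run against an in-memory string rather than a file. The sample text, the address in it, and the `hash_email` helper name are assumptions made for illustration; the regex is the one from the task.

```ruby
require 'digest/md5'

email_regex = /([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4}))/i

# Hypothetical helper name, chosen to avoid shadowing Object#hash.
def hash_email(email)
  Digest::MD5.hexdigest(email.downcase.strip)
end

# Made-up sample data; dup gives a string that is safe to mutate.
contents = "- author: Jane Roe\n  email: jane.roe@example.com\n".dup

# With a block, gsub! yields each whole match (not the capture groups)
# and replaces it in place with the block's return value.
contents.gsub!(email_regex) { |m| hash_email(m) }

puts contents
```

Because the block receives the full match, no separate `scan` pass is needed to collect the addresses first.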
With your current regex:
You could lose the `map` by removing all the capture groups from the regex.
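To illustrate the point about capture groups, here is a rough sketch comparing `String#scan` with and without them. The sample text is made up, and rewriting the groups as non-capturing `(?:...)` is one assumed way of "removing" them, not necessarily what the commenter had in mind.

```ruby
text = "Write to jane.roe@example.com or sam@example.org."

with_groups    = /([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4}))/i
without_groups = /[_a-z0-9-]+(?:\.[_a-z0-9-]+)*@[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,4}/i

# With capture groups, scan returns one array of captures per match,
# so a map is needed to pull out the full address (the first group).
grouped = text.scan(with_groups).map { |captures| captures.first }

# Without capture groups, scan returns the matched strings directly.
plain = text.scan(without_groups)

p grouped
p plain
```

Both calls yield the same two addresses; the second simply skips the intermediate arrays.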