Created January 27, 2014 10:51
A simple Python script for copying static web resources to an S3 bucket, gzipping JS and CSS along the way. Let me know if it's useful (and not already implemented by something else); I may make it into a proper repo.
""" | |
=========== | |
Description | |
=========== | |
Simple script to copy and gzip static web files to an AWS S3 bucket. S3 is great for cheap hosting of static web content, but by default it does not gzip CSS and JavaScript, which results in much larger data transfer and longer load times for many applications | |
When using this script CSS and JavaScript files are gzipped in transition, and appropriate headers set as per the technique described here: http://www.jamiebegin.com/serving-compressed-gzipped-static-files-from-amazon-s3-or-cloudfront/ | |
* Files overwrite old versions | |
* Orphaned files are not deleted | |
* S3 will not negotiate with clients and will always serve the gzipped version, so user agents must be able to understand the Content-Encoding:gzip header (all modern web browsers can) | |
============= | |
Prerequisites | |
============= | |
Python >= v2.7 | |
boto | |
install with pip: | |
pip install boto | |
or with apt-get | |
apt-get install python-boto | |
===== | |
Usage | |
===== | |
From the command line | |
python deploy_to_s3.py --directory source-dir --bucket bucket-name | |
The standard boto environment variables AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID are used for authentication - see boto for details | |
For help: | |
python deploy_to_s3.py --help | |
""" | |
#!/usr/bin/python | |
__author__ = '[email protected]' | |
import os, argparse, tempfile, gzip
from boto.s3.connection import S3Connection
from boto.s3.key import Key


def add_file(source_file, s3_key):
    """Write a file to an S3 key, gzipping JS and CSS on the way"""
    if source_file.endswith(".js") or source_file.endswith(".css"):
        print("gzipping %s to %s" % (source_file, s3_key.key))
        gzip_to_key(source_file, s3_key)
    else:
        print("uploading %s to %s" % (source_file, s3_key.key))
        s3_key.set_contents_from_filename(source_file)


def gzip_to_key(source_file, key):
    """Gzip source_file to a temp file and upload it with gzip headers set"""
    tmp_file = tempfile.NamedTemporaryFile(mode="wb", suffix=".gz", delete=False)
    tmp_file.close()  # close the handle so gzip.open can reopen the file by name (required on Windows)
    with open(source_file, 'rb') as f_in:
        with gzip.open(tmp_file.name, 'wb') as gz_out:
            gz_out.writelines(f_in)
    key.set_metadata('Content-Type', 'application/x-javascript' if source_file.endswith(".js") else 'text/css')
    key.set_metadata('Content-Encoding', 'gzip')
    key.set_contents_from_filename(tmp_file.name)
    os.unlink(tmp_file.name)  # clean up the temp file
def dir_to_bucket(src_directory, bucket):
    """Recursively copy files from source directory to boto bucket"""
    for root, sub_folders, files in os.walk(src_directory):
        for file_name in files:
            abs_path = os.path.join(root, file_name)
            rel_path = os.path.relpath(abs_path, src_directory)
            # get S3 key for this file
            k = Key(bucket)
            k.key = rel_path
            add_file(abs_path, k)


def main():
    # get arguments
    arg_parser = argparse.ArgumentParser(description='Deploy static web resources to an S3 bucket, gzipping JavaScript and CSS files in the process')
    arg_parser.add_argument('-d', '--directory', help='The source directory containing your static website files', required=True)
    arg_parser.add_argument('-b', '--bucket', help='The name of the bucket you wish to copy files to; the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are used for your credentials', required=True)
    args = arg_parser.parse_args()
    # connect to S3
    conn = S3Connection()
    target_bucket = conn.get_bucket(args.bucket, validate=False)
    dir_to_bucket(args.directory, target_bucket)


if __name__ == '__main__':
    main()
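To sanity-check a deployment, you can inspect the metadata S3 stored against an uploaded key. A minimal sketch using boto, assuming the same AWS_* environment variables as the script; the bucket and key names below are placeholders:

from boto.s3.connection import S3Connection

conn = S3Connection()
bucket = conn.get_bucket('my-bucket', validate=False)  # placeholder bucket name
key = bucket.get_key('css/site.css')  # HEAD request; populates metadata on the Key
print(key.content_type)      # expect 'text/css' for a stylesheet
print(key.content_encoding)  # expect 'gzip' if gzip_to_key handled it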
mafux777 commented:

Hey Rob, I had a problem when my bucket had a dot, like so: fancyname.io
When I used a different bucket without the dot, it worked. Error messages:

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1274, in connect
    server_hostname=server_hostname)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 352, in wrap_socket
    _context=self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 579, in __init__
    self.do_handshake()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 816, in do_handshake
    match_hostname(self.getpeercert(), self.server_hostname)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 271, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname 'fenestro.io.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'

Is this easily fixed? If so, how?

robert-b-clarke replied:

Hi @mafux777
I think this is the same issue as this: http://stackoverflow.com/questions/27652318/cant-connect-to-s3-buckets-with-periods-in-their-name-when-using-boto-on-herok
I'm not sure if it's fixed in newer versions of boto, but the workaround described in that Stack Overflow answer should work.
Thanks
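For reference, a sketch of the workaround described in that Stack Overflow answer: pass OrdinaryCallingFormat to S3Connection so boto uses path-style requests and the dotted bucket name never ends up in the SSL hostname. Untested here; 'fancyname.io' is just the example bucket from above:

from boto.s3.connection import S3Connection, OrdinaryCallingFormat

# path-style addressing: requests go to s3.amazonaws.com/fancyname.io/...
# so the wildcard certificate *.s3.amazonaws.com matches the hostname
conn = S3Connection(calling_format=OrdinaryCallingFormat())
bucket = conn.get_bucket('fancyname.io', validate=False)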