@shakyaabiral
Last active September 11, 2023 13:27
Python Script to dump or load data using redis
"""
Install the following requirements into your virtual environment:
    pip install click redis

Usage:
    To load data into redis:
        python redis_dump.py load [filepath]
    To dump data from redis:
        python redis_dump.py dump [filepath] --search '*txt'
"""
import json
import logging

import click
import redis


@click.command()
@click.argument('action')
@click.argument('filepath')
@click.option('--search', help="Key search pattern, e.g. `*txt`")
def main(action, filepath, search):
    # decode_responses=True makes redis-py return str instead of bytes,
    # which json.dump needs for both keys and values on Python 3.
    r = redis.StrictRedis(host='127.0.0.1', port=6379, db=0,
                          decode_responses=True)  # update your redis settings
    cache_timeout = None  # expiry in seconds for loaded keys; None means no expiry
    if action == 'dump':
        out = {}
        # scan_iter(None) iterates over every key when --search is omitted
        for key in r.scan_iter(search):
            # r.get() only handles string values
            out[key] = r.get(key)
        if len(out) > 0:
            try:
                with open(filepath, 'w') as outfile:
                    json.dump(out, outfile)
                print('Dump Successful')
            except Exception as e:
                print(e)
        else:
            print("Keys not found")
    elif action == 'load':
        try:
            with open(filepath) as f:
                data = json.load(f)
            for key in data:
                r.set(key, data.get(key), ex=cache_timeout)
            print('Data loaded into redis successfully')
        except Exception as e:
            print(e)


if __name__ == '__main__':
    log_fmt = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
    logging.basicConfig(level=logging.INFO, format=log_fmt)
    main()
@vveliev-tc

Thanks for sharing.
If load speed matters for the task, I think there are faster load approaches available:

  • The fastest way would be to use redis-cli (I was getting about 150k ops/s on a local Docker setup).
  • Python with a redis pipeline on the same machine was giving me about 5k ops/s.

Here is the code snippet:

    import logging

    def process_file(self, file_path):
        """Add "first.last" user names from file_path to the 'users' set in batches."""
        chunksize = 10000
        pipe = self.redis.pipeline()
        line_index = 0
        with open(file_path, "r") as process_f:
            for line in process_f:
                line_index += 1
                parts = line.split()
                if len(parts) != 2:
                    # Skip lines that are not exactly "first last"
                    logging.warning("Skipping malformed line %s", line_index)
                    continue
                first_name, last_name = parts
                user = first_name + '.' + last_name
                pipe.sadd('users', user.lower())
                if line_index % chunksize == 0:
                    # Send a full batch to the server in one round trip
                    pipe.execute()
                    logging.info("Processing index %s", line_index)
        # Send whatever is left after the last full batch
        pipe.execute()
        logging.info("Number of records %s", self.redis.scard('users'))
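
For the redis-cli route mentioned in the first bullet, one possible approach is redis-cli's `--pipe` mass-insertion mode, which reads raw Redis protocol (RESP) from stdin. The sketch below is only an illustration under that assumption: the helper names `resp_command` and `json_to_resp` are hypothetical, and the input is assumed to be the flat key/value JSON file written by the gist's `dump` action.

```python
def resp_command(*args):
    """Encode one Redis command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Bulk string header carries the length in bytes, not characters
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def json_to_resp(data):
    """Yield one RESP-encoded SET command per key/value pair (hypothetical helper)."""
    for key, value in data.items():
        yield resp_command("SET", key, value)
```

A small driver could `json.load` the dump file, write each chunk from `json_to_resp` to `sys.stdout.buffer`, and be piped into the CLI, for example `python gen_resp.py dump.json | redis-cli --pipe`, where `gen_resp.py` is a hypothetical script name.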

@IliaFeldgun

r.get(key) won't work with value types other than string.
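
One way around this might be to ask Redis for the key's type first and pick the matching read command. This is a minimal sketch, not part of the gist: `dump_value` is a hypothetical helper, and `client` is assumed to be a redis-py client created with `decode_responses=True` so that values come back as `str`.

```python
def dump_value(client, key):
    """Return (type_name, value) for key, reading it with the command for its type."""
    kind = client.type(key)
    if isinstance(kind, bytes):
        # Without decode_responses the type name arrives as bytes
        kind = kind.decode()
    if kind == "string":
        return kind, client.get(key)
    if kind == "list":
        return kind, client.lrange(key, 0, -1)
    if kind == "set":
        # Sets are unordered; sort so the dump is stable and JSON-serializable
        return kind, sorted(client.smembers(key))
    if kind == "hash":
        return kind, client.hgetall(key)
    if kind == "zset":
        return kind, client.zrange(key, 0, -1, withscores=True)
    raise ValueError("unsupported type %r for key %r" % (kind, key))
```

Storing the type name next to the value lets a matching loader choose between `set`, `rpush`, `sadd`, `hset`, and `zadd` when restoring.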
