iyvinjose / data_loading_utils.py
Last active March 27, 2025 14:12
Read large files line by line without loading the entire file into memory. Supports multi-gigabyte files.
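The idea in the description can be sketched on its own as follows. This is a minimal illustration, not the gist's implementation: the helper name `iter_lines_in_chunks` is hypothetical, and it assumes a text file opened with Python's default newline handling. It reads a fixed number of characters at a time and carries any incomplete trailing line over to the next read, so memory use is bounded by the chunk size plus the longest line.

```python
def iter_lines_in_chunks(file_name, chunk_size=5000):
    """Yield the lines of a text file while reading at most chunk_size characters per read.

    Hypothetical helper for illustration only; it is not part of the gist.
    """
    leftover = ""  # incomplete line carried over from the previous chunk
    with open(file_name, "r") as file_obj:
        while True:
            chunk = file_obj.read(chunk_size)
            if not chunk:
                break  # end of file reached
            pieces = (leftover + chunk).split("\n")
            # The last piece has no newline after it yet, so it may be incomplete.
            leftover = pieces.pop()
            for line in pieces:
                yield line
    if leftover:
        # Final line of a file that does not end with a newline.
        yield leftover
```

Because it is a generator, a caller can loop over `iter_lines_in_chunks(path)` and process each line as it arrives, without ever holding the whole file in memory.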
def read_lines_from_file_as_data_chunks(file_name, chunk_size, callback, return_whole_chunk=False):
    """
    Read a file line by line, regardless of its size.
    :param file_name: absolute path of the file to read
    :param chunk_size: size of data to be read at a time
    :param callback: callback method, prototype: def callback(data, eof, file_name)
    :return:
    """
    def read_in_chunks(file_obj, chunk_size=5000):