I have MRI data from a group of patients with brain damage (due to stroke, head injury, etc.). For each patient I have a binary MRI image (a 3D array), where a value of 1 at a voxel (3D pixel) indicates damage there and 0 indicates none. Each image is ~870k voxels.
The goal: return a list of patients with damage at a given voxel.
Here's the stupid/inelegant way to do this:
    voxel_id = 12345  # Index of the voxel I'm interested in.
    patient_matches = []
    for patient in patient_list:
        patient_img_file = '/path/to/images/{}.MRI'.format(patient)  # Path to this patient's image.
        img = load_image(patient_img_file)  # Load the image as a NumPy ndarray.
        img_flat = img.ravel()  # Flatten to 1D.
        vox_val = img_flat[voxel_id]  # Index into the flattened array.
        if vox_val == 1:
            patient_matches.append(patient)
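If everything fits in memory, one obvious in-memory alternative to the per-patient loop is to stack all the flattened images into a single (n_patients, n_voxels) array once, after which any voxel lookup is a single column index. A toy-scale sketch (random stand-in data instead of real `load_image` output):

```python
import numpy as np

# Toy stand-ins for the real data: 4 patients, 10 voxels each.
patient_list = ['p01', 'p02', 'p03', 'p04']
images = {p: np.random.default_rng(i).integers(0, 2, size=10)
          for i, p in enumerate(patient_list)}

# Stack the flattened images into one (n_patients, n_voxels) array.
stack = np.vstack([images[p] for p in patient_list])

voxel_id = 3
# Boolean mask over patients, then map back to patient IDs.
mask = stack[:, voxel_id] == 1
patient_matches = [p for p, hit in zip(patient_list, mask) if hit]
```

At full scale (100 patients x 870k voxels) the stack is ~87 MB as `uint8`, so this is plausible, but it does mean loading every image up front.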
- Load and flatten each image, so I have a 1x870k vector for each patient.
- Read these vectors into a DB table (`damage`), with 1 row per patient and 1 column per voxel.
- Query: `select patient_id from damage where vox12345 = 1`
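At toy scale, the wide-table idea looks like the sketch below (sqlite3 standing in for Postgres, 5 voxel columns instead of ~870k, made-up patient rows). Note that this schema wouldn't work as-is at full scale: Postgres caps a table at 1600 columns, so the 870k voxels would need to be split up or stored another way (e.g. as a `(patient_id, voxel_id)` pairs table).

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()

# 5 voxel columns stand in for the ~870k real ones.
cur.execute('CREATE TABLE damage (patient_id TEXT, vox0 INTEGER, vox1 INTEGER, '
            'vox2 INTEGER, vox3 INTEGER, vox4 INTEGER)')
rows = [('p01', 0, 1, 0, 0, 1),
        ('p02', 1, 1, 0, 1, 0),
        ('p03', 0, 0, 0, 0, 1)]
cur.executemany('INSERT INTO damage VALUES (?, ?, ?, ?, ?, ?)', rows)

# The query from the post, at toy scale: who has damage at voxel 1?
cur.execute('SELECT patient_id FROM damage WHERE vox1 = 1')
matches = [r[0] for r in cur.fetchall()]
```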
A quick test on a VM on a slow laptop: I'm seeing ~100 MB of process memory usage, with 1000 accesses taking 45 ms.
A rough guess at the Postgres database size would be (100 patients * 870k voxels * 5% damage) * (2 * 4 bytes/int + row header) * 2 (for the index), or roughly the same.