Last active
October 29, 2019 18:15
create_DAGAN_database
#!/usr/bin/env python3
# coding: utf8
import sys
from pathlib import Path

import matplotlib.image as mpimg
import numpy as np

"""
Assumes a file tree such as:
imgs/
  |class1/
    |images of type class1.png
  |class2/
    |...
  |...
The images are normalized into a numpy array of shape (n_classes, n_samples, h, w, c).
"""


def create_db(images_dir_path):
    images_dir = Path(images_dir_path)
    if not images_dir.is_dir():
        print('Error: not a directory:', images_dir_path)
        sys.exit(1)
    dataset = []
    for d in images_dir.glob('*'):
        # skip plain files (readme, licences, ...)
        if not d.is_dir():
            continue
        classData = []
        for f in d.glob('*.png'):
            img = mpimg.imread(str(f))
            # np.float was removed in NumPy 1.24, so use an explicit dtype
            is_integer = np.issubdtype(img.dtype, np.integer)
            img = img.astype(np.float64)
            # matplotlib already returns PNGs as floats in [0, 1];
            # only integer-valued images need rescaling
            if is_integer:
                img /= 255.0
            # assumes 3-channel (RGB) images
            img = np.reshape(img, (img.shape[0], img.shape[1], 3))
            classData.append(img)
        dataset.append(classData)
    # order the classes by number of samples
    dataset.sort(key=len)
    return np.array(dataset)


if __name__ == '__main__':
    train_dir = '/path/to/imgs'
    data = create_db(train_dir)
    print("dataset shape:", data.shape)
    np.save('my_database.npy', data)
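To check the saved file, it can be read back with `np.load`. The snippet below is only a sketch: the zero-filled `data` array and its sizes are stand-ins for what `create_db` would return, and a temporary directory is used so nothing is written next to your real files.

```python
import tempfile
from pathlib import Path

import numpy as np

# Stand-in for the array returned by create_db (hypothetical sizes).
data = np.zeros((2, 5, 4, 4, 3), dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / 'my_database.npy'
    np.save(db_path, data)
    # np.load reads the whole array back into memory.
    loaded = np.load(db_path)

print("loaded shape:", loaded.shape, "dtype:", loaded.dtype)
```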
Hi,
Talking about shapes implies working with arrays, not lists.
The shape of an array is the number of its elements followed by the dimensions of each element (see the doc).
So np.array(classDataSet).shape would give us a 4-dim shape (num_samples, h, w, c).
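A quick way to convince yourself is to try it on dummy data. This is a minimal sketch with made-up sizes (the names `class_data`, `num_samples` and so on are only for illustration): a list of images becomes 4-dim, and a list of such lists becomes 5-dim.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
h, w, c = 4, 4, 3
num_samples, num_classes = 5, 2

# One class: a plain Python list of (h, w, c) images.
class_data = [np.zeros((h, w, c)) for _ in range(num_samples)]
class_shape = np.array(class_data).shape    # (num_samples, h, w, c)

# The whole dataset: a list of such per-class lists.
dataset = [list(class_data) for _ in range(num_classes)]
dataset_shape = np.array(dataset).shape     # (num_classes, num_samples, h, w, c)

print(class_shape, dataset_shape)
```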
Hi,
Reading this a year later (and seeing that people are still scratching their heads over DAGAN experiments), I feel like continuing my previous comment:
...
Following the same logic, data.shape will give you (num_classes, num_samples, h, w, c).
[I also updated the script to make things easier for you :)]
This supposes that all your images are PNGs of the same shape (and, for a regular 5-dim array, that every class holds the same number of images).
Use np.reshape(img, (img.shape[0], img.shape[1], 1)) if your images are all black and white.
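For the black and white case, here is a small sketch of what that reshape does. The 28x28 size and the variable names are assumptions for the example, not something taken from the script.

```python
import numpy as np

# Hypothetical grayscale image, stored as a 2-D (h, w) array.
gray = np.zeros((28, 28), dtype=np.float32)

# Add a trailing channel axis so the image becomes (h, w, 1).
gray_hwc = np.reshape(gray, (gray.shape[0], gray.shape[1], 1))

# A list of such images then stacks into (num_samples, h, w, 1).
stacked = np.array([gray_hwc for _ in range(3)])

print(gray_hwc.shape, stacked.shape)
```

The reshape only adds an axis; the pixel values are left untouched.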
Hi gurujam,
Thanks for the script.
I am also preparing my dataset as a 5-dim array. Please correct me if I am missing something: you are basically appending each image, which is 3-dim (h, w, c), to 'classDataSet', and then appending that to 'dataset', which makes it 4-dim (num_example, h, w, c). Isn't it supposed to be 5-dim?