Last active January 14, 2023 10:14
This GitHub gist converts DICOM metadata into a CSV file. The DICOM dataset is available on Kaggle: https://www.kaggle.com/datasets/dmisky/dlwptvolumetricdicomlung
# Import the required packages
# If you don't have pydicom, uncomment the next line of code
# !pip install pydicom
import glob
import os

import pandas as pd
import pydicom


def dicom2csv(extract=(),
              move_on=((0x7FE0, 0x0008), (0x7FE0, 0x0009), (0x7FE0, 0x0010)),
              folder_path="",
              csv_file_name="metadata.csv",
              return_dataframe=False):
    '''
    Extract specific DICOM metadata from multiple DICOM files collected in a
    specified folder.

    ::Params
    - extract (list):
        The keywords of the DICOM attributes you want to extract.
    - move_on (list):
        The tags of the DICOM attributes to skip (move on from).
        By default, it contains the tags of pixel data:
        (0x7FE0,0x0008), (0x7FE0,0x0009), (0x7FE0,0x0010)
        To skip additional attributes, pass their unique tags in addition
        to the pixel data tags.
    - folder_path (string):
        Path of the folder that contains the DICOM files.
    - csv_file_name (string):
        The name of the CSV file.
    - return_dataframe (bool):
        If True, returns a pandas dataframe of the extracted data for
        direct use.

    ::Returns
    - A CSV file containing the DICOM metadata specified by the extract
      parameter (one row per DICOM file).
    - A pandas dataframe when return_dataframe is set to True.

    ::Example
        dicom2csv(extract = ['StudyDate'],
                  folder_path = 'content/dicomfolder',
                  csv_file_name = "Study_Dates.csv",
                  return_dataframe = True)
    '''
    extract = list(extract)
    # Collect one row of metadata per DICOM file
    rows = []
    # List the DICOM files that we want to extract data from
    dicom_files = glob.glob(os.path.join(folder_path, '*.dcm'))
    # Iterate over each DICOM file in the folder and read it using dcmread()
    for file_path in dicom_files:
        # Read the DICOM file from the specified path
        dcm = pydicom.dcmread(file_path)
        # Start with an empty value for every requested keyword, so a missing
        # attribute leaves a blank cell instead of misaligning the columns
        row = {keyword: None for keyword in extract}
        # Iterate over the DICOM attributes (including nested sequences)
        for elem in dcm.iterall():
            # Ensure that the attribute is not skipped (e.g. pixel data), is
            # one of the required attributes, and has not been recorded yet
            if ((elem.tag not in move_on) and (elem.keyword in extract)
                    and row[elem.keyword] is None):
                # Keep the first value found for the current attribute
                row[elem.keyword] = elem.value
        rows.append(row)
    # Create a pandas dataframe for better use of the data
    df = pd.DataFrame(data=rows, columns=extract)
    # Create the CSV file with the specified name
    df.to_csv(csv_file_name, index=False)
    # Return the extracted dataframe for direct use
    if return_dataframe:
        return df
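
The `move_on` parameter takes DICOM tags written as (group, element) pairs, for example (0x7FE0, 0x0010) for pixel data. The sketch below is only an illustration of how such a pair relates to the conventional `(GGGG,EEEE)` notation and to a single 32-bit integer (group in the high 16 bits, element in the low 16 bits). `tag_to_int`, `format_tag`, and `PIXEL_DATA_TAGS` are hypothetical names introduced here; they are not part of pydicom or of the gist.

```python
# Sketch: how a (group, element) DICOM tag pair maps to other notations.
# tag_to_int and format_tag are hypothetical helpers, not pydicom functions.

# The default pixel data tags used by move_on in the gist
PIXEL_DATA_TAGS = [(0x7FE0, 0x0008), (0x7FE0, 0x0009), (0x7FE0, 0x0010)]


def tag_to_int(group, element):
    """Pack a 16-bit group and a 16-bit element into one 32-bit integer."""
    return (group << 16) | element


def format_tag(group, element):
    """Render a tag in the conventional (GGGG,EEEE) hexadecimal form."""
    return f"({group:04X},{element:04X})"


if __name__ == "__main__":
    # Print each default tag in both notations
    for group, element in PIXEL_DATA_TAGS:
        print(format_tag(group, element), hex(tag_to_int(group, element)))
```

Writing a tag this way can make it easier to look up the tag of an attribute you want to skip and add it alongside the pixel data tags.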