Skip to content

Instantly share code, notes, and snippets.

@Frontear
Created September 3, 2022 07:20
Show Gist options
  • Save Frontear/b33266bede1cdf2dbfb74326864c17a0 to your computer and use it in GitHub Desktop.
Save Frontear/b33266bede1cdf2dbfb74326864c17a0 to your computer and use it in GitHub Desktop.
An attempt to try and rewrite the metadata for media downloaded via Google Takeout
import re
import os
import sys
import json
from pathlib import Path
#from tinytag import TinyTag
#import ffmpeg
#from PIL import Image
#from PIL.ExifTags import TAGS
import filedate
PHOTOS = [
".JPEG",
".JPG",
".PNG",
".HEIC",
".GIF"
]
VIDEOS = [
".MOV",
".MP4"
]
if __name__ == "__main__":
takeout_folder = sys.argv[1]
FILE_META = {}
for root, _, files in os.walk(takeout_folder):
for media in files:
if media.endswith(".json"):
with open(Path(root, media)) as f:
meta = json.load(f)
FILE_META[meta["title"]] = meta["photoTakenTime"]["formatted"]
for root, _, files in os.walk(takeout_folder):
for media in files:
if m := re.match(r".+\..+(\(\d\))\.json", media):
splt = media.split(".")
os.rename(Path(root, media), Path(root, splt[0] + m.group(1) + "." + splt[1][:-3] + "." + splt[2]))
for root, _, files in os.walk(takeout_folder):
for media in files:
ext = os.path.splitext(media)[1].upper()
if ext in PHOTOS or ext in VIDEOS:
try:
filedate.File(Path(root, media)).set(created=FILE_META[media])
except KeyError as e1:
try:
with open(Path(root, f"{media}.json")) as f:
filedate.File(Path(root, media)).set(created=json.load(f)["photoTakenTime"]["formatted"])
except FileNotFoundError as e2:
print(f"cant process: {media}", e2)
"""
if not media.endswith(".json"):
ext = os.path.splitext(media)[1].upper()
media_path, meta_path = Path(root, media), Path(root, f"{media}.json")
if ext in VIDEOS or ext in PHOTOS:
with open(meta_path, "r") as f:
meta = json.load(f)
filedate.File(media_path).set(created = meta["photoTakenTime"]["formatted"])
"""
"""
if ext in PHOTOS:
img = Image.open(path)
print(img.getexif())
exit()
elif ext in VIDEOS:
#vid = ffmpeg.probe(path)
" "[0]
else:
print(f"Unhandled file: {path}")
"""
ffmpeg-python==0.2.0
filedate==2.0
future==0.18.2
Pillow==9.2.0
python-dateutil==2.8.2
six==1.16.0
tinytag==1.8.1
@Frontear
Copy link
Author

Frontear commented Sep 10, 2022

Little update:

  1. Some of the json metadata files being trimmed is indeed due to file naming constraints, though the real reason is still a mystery. Being said, majority of the files name does exist within the json, so it usually becomes a matter of finding the closest one. Alternatively, the json should 'in theory' have the proper file name inside of the json metadata file under the 'title' value, though i have observed sometimes this is also mangled for no particular reason.
  2. Some jpg files are named jpeg and vice versa is their accompanying metadata files, which is incredibly infuriating and makes parsing a pain.
  3. Live photos are uploaded as HEIC and MP4 pairs, where the MP4 serves to be the video, and the HEIC is apples photo format (i think? im not 100% lol). This explains the lack of a metadata for the other pair, since google exports for only one of them, I believe the HEIC alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment