Last active
November 6, 2019 02:43
biancadanforth/c4790230c5a2702c8a64f62cbf39dc6a
Check if two JSON objects are the same by first ordering them
import json, os

# Put filenames here; this script assumes these files are in the same dir as the script
FILENAME_1 = "2.json"
FILENAME_2 = "3.json"


def ordered(obj):
    # Recursively sort dicts (as lists of (key, value) pairs) and lists,
    # so that two structures differing only in ordering compare equal.
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj


def main():
    files = [FILENAME_1, FILENAME_2]
    ordered_files = []
    for filename in files:
        path = os.path.join(os.path.dirname(__file__), filename)
        with open(path) as f:
            file_parsed = json.load(f)
        file_ordered = ordered(file_parsed)
        ordered_files.append(file_ordered)
        # Write the ordered structure next to the input as <name>_prettier.json.
        # Note: ordered() has already turned every dict into a list of pairs,
        # so sort_keys has no effect on this output.
        new_path = os.path.join(os.path.dirname(__file__), f"{os.path.splitext(filename)[0]}_prettier.json")
        with open(new_path, "w+") as new_file:
            json.dump(file_ordered, new_file, indent=4, sort_keys=True)
    # Prints True if the two files hold the same data regardless of ordering
    print(ordered_files[0] == ordered_files[1])


if __name__ == '__main__':
    main()
Also, thanks to @mythmon for giving this a look over! My Python skills are quite basic. The latest revision (6) incorporates his feedback.
This is a helper script I made while reviewing @danielhertenstein's FathomFox PR to parallelize the Vectorizer. I wanted to know whether the resulting `vectors.json` files from the serial (non-parallelized) Vectorizer and the parallelized Vectorizer were identical for the same samples and the same ruleset. Since the parallelized Vectorizer can finish pages in a different order, I needed to sort each JSON object before making a comparison. Thankfully, the two outputs were the same.

Edit: Credit for the `ordered` function goes to this Stack Overflow post.
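To illustrate why the sorting step matters, here is a minimal sketch that runs `ordered` on two in-memory objects instead of files. The sample data (`serial_output`, `parallel_output`, and the `pages`, `url`, `features` keys) is made up for the example and is not the real `vectors.json` format.

```python
import json


def ordered(obj):
    """Recursively sort dicts and lists so element order no longer matters."""
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    return obj


# Hypothetical stand-ins for two outputs: same pages, listed in a different order.
serial_output = json.loads(
    '{"pages": [{"url": "a", "features": [1, 0]}, {"url": "b", "features": [0, 1]}]}'
)
parallel_output = json.loads(
    '{"pages": [{"features": [0, 1], "url": "b"}, {"url": "a", "features": [1, 0]}]}'
)

# A plain comparison is order-sensitive for lists, so it reports a difference.
plain_equal = serial_output == parallel_output

# After ordering, only the contents are compared.
ordered_equal = ordered(serial_output) == ordered(parallel_output)

print(plain_equal, ordered_equal)
```

Two caveats with this approach: it treats every list as unordered, so `[1, 0]` and `[0, 1]` also compare equal after ordering, and `sorted` raises a `TypeError` in Python 3 if a list mixes values that cannot be compared with each other, such as numbers and strings.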