Last active
May 8, 2023 03:06
-
-
Save jelc53/6298cc8df392e6da120f0100f7f378f1 to your computer and use it in GitHub Desktop.
runsheet for coqgym's astactic evaluation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# checkout main git branch | |
git pull | |
git checkout master | |
# download and scp coqgym's astactic best model into model directory | |
# https://drive.google.com/drive/folders/1AzLaEpoGS3BPMUz9Bl63MHAFRqlF4CtH | |
scp model.pth [username]@scq2.stanford.edu:/scr2/[user_dir]/CoqGym-GNN/ASTactic/runs/astactic/checkpoints/model.pth | |
# evaluation runsheet (5 left to go!) | |
"weak-up-to", -- done | |
"buchberger", -- done | |
"jordan-curve-theorem", -- done | |
"dblib", -- done | |
"disel", -- done | |
"zchinese", -- done | |
"zfc", -- done | |
"dep-map", -- done | |
"chinese", -- done | |
"UnifySL", -- done | |
"hoare-tut", -- done | |
"huffman", -- done | |
"PolTac", -- done | |
"angles", -- done | |
"coq-procrastination", -- done | |
"coq-library-undecidability", -- done | |
"tree-automata", | |
"coquelicot", -- done | |
"fermat4", -- done (skipped gcd2_relp_odd) | |
"demos", | |
"coqoban", -- done | |
"goedel", -- done | |
"verdi-raft", | |
"verdi", | |
"zorns-lemma", -- done | |
"coqrel", -- done | |
"fundamental-arithmetics" | |
# version 1: run script for single proof library evaluation | |
python evaluate.py ours astactic --path runs/astactic/checkpoints/model.pth --filter [proof_library] >> log | |
# version 2: multiprocess run script for evaluation of one or many proof libraries | |
# open multiprocess_test_astactic.py and edit list of proof libraries | |
python multiprocess_test_astactic.py astactic runs/astactic/checkpoints/model.pth | |
# optional: you may want to add code to evaluate.py and agent.py | |
# to skip proof libraries or specific proofs within those libraries | |
python evaluate.py ... --skip "lalcii acii [...]" | |
# current status of evaluation is shared on our shared gdrive in evaluation/test_astactic.tgz | |
# https://drive.google.com/file/d/1aD8BR87Q2iUaMmFZWoAfmATx-Tm3uSX8/view?usp=share_link |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment