Last active June 12, 2021 21:43
Get a rough word count of the entire US Code of Law
# Replace the 'parallel' steps with xargs if you don't have GNU parallel, or just pipe to 'sh'.
# Use 'wget' on Linux, 'curl -O' on Mac.
for i in $(seq -f "%02.f" 1 51); do echo "curl -O http://uscode.house.gov/download/pls/Title_$i.ZIP"; done | parallel
# Unzip every archive. Add --dry-run to 'parallel' to preview the commands without running them.
find *ZIP | parallel unzip {}
# wc prints the line, word, and byte counts, in that order.
cat *txt | wc
## Results:
# cat *txt | wc
# 6546729 45920853 338309328
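The first comment in the script suggests xargs as a substitute for GNU parallel. A minimal sketch of that variant follows, assuming an xargs that supports `-P` and `-0` and a find that supports `-maxdepth` and `-print0` (the GNU and BSD versions do; they are not part of POSIX). The function names `list_title_urls` and `fetch_and_count` and the choice of 4 parallel jobs are illustrative and not from the original gist; the URL pattern is copied from the script.

```shell
#!/bin/sh
# Sketch of the same pipeline using xargs instead of GNU parallel.
# Function names and the job count (4) are hypothetical choices.

list_title_urls() {
    # Print one download URL per title, zero-padded from 01 to 51.
    i=1
    while [ "$i" -le 51 ]; do
        printf 'http://uscode.house.gov/download/pls/Title_%02d.ZIP\n' "$i"
        i=$((i + 1))
    done
}

fetch_and_count() {
    # Download up to 4 files at a time, one URL per curl invocation.
    list_title_urls | xargs -n 1 -P 4 curl -O
    # Unzip each archive in the current directory, overwriting without prompting.
    find . -maxdepth 1 -name '*.ZIP' -print0 | xargs -0 -n 1 unzip -o
    # Report line, word, and byte counts across all extracted text files.
    cat ./*.txt | wc
}
```

Sourcing the file only defines the functions. Running `list_title_urls | head -n 3` previews the URLs without touching the network, and `fetch_and_count` performs the download, extraction, and count.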
Install GNU Parallel in 30 seconds: wget pi.dk/3 -qO - | sh -x