Last active
February 28, 2018 22:25
Extracting SIRET changes from the daily SIRENE files
# Extract SIRET changes from the daily SIRENE update files.
# Generates a CSV file with SIRET_OLD, SIRET_NEW and DATEMAJ.
# unzip   = decompress the zip archive
# csvcut  = extract the columns of interest and normalize the CSV
# grep -E = keep only the rows of interest (header row + relevant event codes)
# csvsql  = self-join to link each old SIRET to its new SIRET
FILE=$(echo "$1" | sed 's/\.zip$//')
unzip "$1"
csvcut -c SIREN,DATEMAJ,NIC,SIRETPS,NICSIEGE,VMAJ,EVE -d ';' -e iso8859-1 sirc* | \
grep -E '(^SIREN|,(CTE|CTS|MTDE|MTAE|MTDS|MTAS|STE|STS|SU)$)' | \
csvsql --query 'select o.siren||o.nic as SIRET_OLD, n.siren||n.nic as SIRET_NEW, o.datemaj from stdin o join stdin n on (o.siren=n.siren and o.datemaj=n.datemaj and o.nic<n.nic);' > "$FILE-histo.csv"
rm sirc*
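
The pairing done by the csvsql self-join can be illustrated on a few made-up rows. The sketch below uses awk instead of csvsql, purely so it runs without csvkit; it is a stand-in for illustration, not what the script above does. The function name `pair_sirets` and the sample SIREN, NIC and date values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads CSV rows "SIREN,DATEMAJ,NIC,..." on stdin
# (header row first) and prints SIRET_OLD,SIRET_NEW,DATEMAJ for every pair
# of rows that share SIREN and DATEMAJ and whose NICs are in ascending order.
pair_sirets() {
  awk -F',' '
    NR == 1 { print "SIRET_OLD,SIRET_NEW,DATEMAJ"; next }  # replace the header
    {
      key = $1 FS $2                  # group rows by SIREN + DATEMAJ
      n = ++count[key]
      nic[key, n] = $3                # keep NIC as text to preserve leading zeros
      if (n == 1) keys[++nkeys] = key # remember first-seen order of groups
    }
    END {
      for (k = 1; k <= nkeys; k++) {
        key = keys[k]
        split(key, part, FS)          # part[1] = SIREN, part[2] = DATEMAJ
        for (i = 1; i <= count[key]; i++)
          for (j = 1; j <= count[key]; j++)
            if (nic[key, i] + 0 < nic[key, j] + 0)   # same test as o.nic < n.nic
              print part[1] nic[key, i] "," part[1] nic[key, j] "," part[2]
      }
    }'
}

# Made-up sample: one SIREN with two NICs on the same day, one with a single NIC.
sample='SIREN,DATEMAJ,NIC,EVE
123456789,2018-02-27,00012,MTDE
123456789,2018-02-27,00020,MTAE
987654321,2018-02-27,00015,SU'

result=$(printf '%s\n' "$sample" | pair_sirets)
printf '%s\n' "$result"
```

On this sample the sketch prints the header followed by a single row, `12345678900012,12345678900020,2018-02-27`. The SIREN that appears only once has no second NIC to pair with, so it produces no output row.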