Created August 26, 2012 17:22
Converts stream of tabular records to stream of JSON records.
{
    if (NR == 1) {
        # First record: remember the column headers as the JSON keys.
        split($0, tags);
        # Default enclosing character is a double quote unless EC was set with -v.
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{";
        for (i = 1; i <= NF; ++i) {
            # Anything containing a character other than a digit or dot is quoted.
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
            else
                # Every record, including the last one, ends with a comma.
                jrec = jrec "},";
        }
        print jrec;
    }
}
Great script.
One small bug: is it possible to remove the trailing comma on the last record?
The last comma problem is corrected by adding two lines to the script, seen here: minkymorgan / tab2json.awk
The column headers must be the first record of the stream (or file) and are used as the field keys. By default, in the output, all keys are enclosed in quotes, as are non-numeric values. The field separator of the input stream should be specified via the -F argument.
Example: Convert plain CSV into JSON
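The original command for this example did not survive on the page, so the following is only a sketch of what such an invocation might look like. It assumes the script is saved as tab2json.awk (a file name borrowed from the fork mentioned in the comments) and uses a made-up three-column CSV file; the script is written to a temporary directory so that the snippet runs on its own.

```sh
#!/bin/sh
# Sketch only: file names and sample data are illustrative assumptions.
workdir=$(mktemp -d)

# Save the script from this gist to a file (tab2json.awk is an assumed name).
cat > "$workdir/tab2json.awk" <<'AWK'
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
            else
                jrec = jrec "},";
        }
        print jrec;
    }
}
AWK

# Hypothetical sample input: header row first, then the data rows.
printf 'id,name,score\n1,alice,9.5\n2,bob,7\n' > "$workdir/sample.csv"

# -F ',' tells awk that the input fields are comma-separated.
out=$(awk -F ',' -f "$workdir/tab2json.awk" "$workdir/sample.csv")
printf '%s\n' "$out"
# Prints:
# {"id":1, "name":"alice", "score":9.5},
# {"id":2, "name":"bob", "score":7},

rm -rf "$workdir"
```

Note that "plain" matters here: the script splits on every comma, so CSV files with quoted fields that contain commas would need a different approach.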
Example 2:
MySQL uses the tab character as its default field separator:
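The command that accompanied this example is missing from the page. As a rough sketch, tab-separated input can be handled by passing '\t' as the field separator. The database name (mydb), table (users), and script file name (tab2json.awk) below are hypothetical, and the MySQL output is simulated with printf so that the snippet runs without a database server.

```sh
#!/bin/sh
# Sketch only: names and sample data are illustrative assumptions.
workdir=$(mktemp -d)

# Save the script from this gist to a file (tab2json.awk is an assumed name).
cat > "$workdir/tab2json.awk" <<'AWK'
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
            else
                jrec = jrec "},";
        }
        print jrec;
    }
}
AWK

# With a real server, the pipeline might look like this (hypothetical names):
#   mysql -B -e 'SELECT id, city FROM users' mydb | awk -F '\t' -f tab2json.awk
# Here the tab-separated batch output, header line included, is simulated.
out=$(printf 'id\tcity\n7\tNew York\n' | awk -F '\t' -f "$workdir/tab2json.awk")
printf '%s\n' "$out"
# Prints:
# {"id":7, "city":"New York"},

rm -rf "$workdir"
```

Because the separator is a tab, values that contain spaces (such as "New York") stay in a single field.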
The Enclosing Character
By default, double quotes are used to enclose string values, but this can be changed to any other delimiter via the EC variable. For example, to transform the CSV file using single quotes instead of double quotes:
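The original command is not preserved here. One plausible way to set EC is awk's standard -v option, which assigns a variable before the program starts. The sketch below assumes the script is saved as tab2json.awk and reuses a made-up CSV sample; both names are illustrative.

```sh
#!/bin/sh
# Sketch only: file names and sample data are illustrative assumptions.
workdir=$(mktemp -d)

# Save the script from this gist to a file (tab2json.awk is an assumed name).
cat > "$workdir/tab2json.awk" <<'AWK'
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
            else
                jrec = jrec "},";
        }
        print jrec;
    }
}
AWK

# Hypothetical sample input: header row first, then the data rows.
printf 'id,name,score\n1,alice,9.5\n2,bob,7\n' > "$workdir/sample.csv"

# -v EC="'" sets the enclosing character to a single quote before awk runs,
# so the script's default (a double quote) is never applied.
out=$(awk -F ',' -v EC="'" -f "$workdir/tab2json.awk" "$workdir/sample.csv")
printf '%s\n' "$out"
# Prints:
# {'id':1, 'name':'alice', 'score':9.5},
# {'id':2, 'name':'bob', 'score':7},

rm -rf "$workdir"
```

Single-quoted output is not valid JSON, so this setting is mainly useful when the records feed a consumer that expects that quoting style.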