sloanlance/jq_jsonl_conversion.md

Last active November 11, 2025 23:06


jq: JSONL ↔︎ JSON conversion


Prerequisites

jq — https://jqlang.org/ — "like sed for JSON data"

There are several options available for installing jq. I prefer to use Homebrew: `brew install jq`

JSONL → JSON
```
jq -s '.' input.jsonl > output.json
```
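To see what `-s` (`--slurp`) does here, the following is a minimal sketch on a throwaway two-record file. The file names `sample.jsonl` and `sample.json` and the record contents are made up for illustration, and `-c` is added in the last step only so the result prints on one line.

```shell
# Hypothetical sample data: two JSON records, one per line.
printf '%s\n' '{"id":1,"name":"a"}' '{"id":2,"name":"b"}' > sample.jsonl

# -s (--slurp) reads every record in the input and wraps them in one array.
jq -s '.' sample.jsonl > sample.json

# Print the array compactly to confirm both records are inside it.
jq -c '.' sample.json
# → [{"id":1,"name":"a"},{"id":2,"name":"b"}]
```

Because slurping collects all records before producing output, the whole input has to fit in memory.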
JSON → JSONL
```
jq -c '.[]' input.json > output.jsonl
```
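The reverse direction can be checked the same way. This sketch assumes the top-level value of the input is an array; the file names `sample_array.json` and `sample_out.jsonl` and the records are hypothetical.

```shell
# Hypothetical sample data: one JSON array holding two records.
printf '%s\n' '[{"id":1,"name":"a"},{"id":2,"name":"b"}]' > sample_array.json

# .[] emits each array element separately; -c prints each one on its own line.
jq -c '.[]' sample_array.json > sample_out.jsonl

cat sample_out.jsonl
# → {"id":1,"name":"a"}
# → {"id":2,"name":"b"}

# Round trip: slurping the lines back should rebuild the original array.
jq -c -s '.' sample_out.jsonl
# → [{"id":1,"name":"a"},{"id":2,"name":"b"}]
```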

Note: This document is now included in Cookbook · jqlang/jq Wiki.

frfernandezdev commented Dec 22, 2023

🫶

delano commented Mar 22, 2024

Legend

sloanlance commented Mar 22, 2024 via email

💚 _On Thu, 21 Dec 2023 at 19:01, @frfernandezdev wrote:_

…

🫶

sloanlance commented Mar 22, 2024 via email

Thank you! I can't take much credit, though. `jq` does all the work. 😉

_On Fri, 22 Mar 2024 at 13:47, @delano wrote:_

…

Legend

EdGaere commented Apr 1, 2025

It's unbelievable that jq can solve this in one line. I was about to embark on writing yet-another-python-script.py to convert a gzipped nested JSON file to JSONL, but thankfully I came across this post.

Suppose you have a JSON like this:

{
   "meta" : { }

 , "data" : [
    { "idx" : 1
      , "input" : "ABC"
     , target : "123"
     , some_other_field : "zzz" 
    },

   { "idx" : 2
      , "input" : "DEF"
     , target : "456"
     , some_other_field : "zzz" 
    }, 
   ...

 ]
}

You can use the following one-line command to extract the `data` array, keep only the `input` and `target` fields, and generate JSONL:

`gunzip -c somefile.json.gz | jq .data | jq -c '.[] | {input, target}'`

Output

{"input" : "ABC", "target" : "123"}
{"input" : "DEF", "target" : "456"}
...

sloanlance commented May 20, 2025

@EdGaere, thanks for the example! I thought I could improve (i.e., shorten) the command you wrote. In your example, you called jq twice in the pipeline, but it can be done with one call instead…

`gzcat somefile.json.gz | jq -c '.data[] | {input, target}'`

That is, the two filters `.data` and `.[]` combine into the single filter `.data[]`.
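A minimal sketch for checking that the two-call and one-call forms agree. To stay self-contained it skips the gzip step and reads a small uncompressed stand-in file; the file name `nested.json` and the shell variables `two_calls` and `one_call` are hypothetical.

```shell
# Hypothetical stand-in for the decompressed file from the example above.
printf '%s\n' '{"meta":{},"data":[{"idx":1,"input":"ABC","target":"123"},{"idx":2,"input":"DEF","target":"456"}]}' > nested.json

# Two jq processes: the first extracts the array, the second iterates over it.
two_calls=$(jq '.data' nested.json | jq -c '.[] | {input, target}')

# One jq process: .data[] extracts the array and iterates in a single filter.
one_call=$(jq -c '.data[] | {input, target}' nested.json)

# Both forms should print the same two records.
[ "$two_calls" = "$one_call" ] && echo "same output"
printf '%s\n' "$one_call"
```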

Cleaning the test data

I wanted to test this with your data. `jq` didn't like your hand-edited data's unquoted keys, like `some_other_field`, so I cleaned up the data first…

{
  "meta": {},
  "data": [
    {
      "idx": 1,
      "input": "ABC",
      "target": "123",
      "some_other_field": "zzz"
    },
    {
      "idx": 2,
      "input": "DEF",
      "target": "456",
      "some_other_field": "zzz"
    }
  ]
}

(Maybe jq has some option to ignore errors like unquoted keys.)
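One possible workaround, offered as a sketch rather than a parser option: jq's own object-construction syntax accepts bare identifier keys, so a file with unquoted keys can sometimes be evaluated as a jq program with `-n` and `-f`. This only works when the file happens to be valid jq syntax, so it would still fail on a placeholder like the `...` in the original data. The file name `loose.txt` and its contents are hypothetical.

```shell
# Hypothetical file with one unquoted key, similar to the hand-edited data.
printf '%s\n' '{ "idx": 1, "input": "ABC", target: "123" }' > loose.txt

# As JSON input this is invalid, so jq exits with a non-zero status.
if jq empty loose.txt 2>/dev/null; then
  echo "valid JSON"
else
  echo "not valid JSON"
fi

# As a jq program it is a valid object construction, so -n (no input)
# with -f (read the filter from a file) evaluates it and prints real JSON.
jq -n -c -f loose.txt
# → {"idx":1,"input":"ABC","target":"123"}
```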

Using the test data

Running the shortened command I gave above gives the output…

{"input":"ABC","target":"123"}
{"input":"DEF","target":"456"}

Notice that the `-c` option makes jq print each record compactly, on a single line with no extra whitespace. That is more compact than the hand-edited output in your example.
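The difference matters for JSONL because each record has to occupy exactly one line. A small sketch comparing jq's default output with `-c` on a single record; the variable names `record`, `pretty`, and `compact` are hypothetical.

```shell
# Hypothetical single record used only to compare the two output styles.
record='{"input":"ABC","target":"123"}'

# Default output: pretty-printed across several lines with indentation.
pretty=$(printf '%s\n' "$record" | jq '.')

# With -c (--compact-output): the whole value on one line.
compact=$(printf '%s\n' "$record" | jq -c '.')

printf '%s\n' "$pretty"
printf '%s\n' "$compact"
```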
