jq Command for JSON Processing with Practical Examples

Published October 10, 2024

JSON has become the default structured data format for most APIs and web services today. Sometimes you just want to call a web service that returns JSON from the command line and print out a few specific things. Bash does not have a built-in JSON parser, so we use a utility called jq to parse the JSON text. jq is a lightweight command-line JSON processor.

Using curl and jq to get images from the Mars Rover

In this tutorial, you will learn how to read JSON input with jq and combine it with other utilities like curl to fetch and parse remote JSON objects. You will learn how to retrieve images from the Mars Perseverance Rover using the free API from NASA, and lists of AdOps publishers from their sellers.json feeds.

Where is jq?

jq is available on most Linux and macOS systems. To check whether it is installed, run jq with no arguments on the command line.

OUTPUT:

$ jq
jq - commandline JSON processor [version 1.6]

Usage:  jq [options] <jq filter> [file...]
        jq [options] --args <jq filter> [strings...]
        jq [options] --jsonargs <jq filter> [JSON_TEXTS...]

jq is a tool for processing JSON inputs, applying the given filter to
its JSON text inputs and producing the filter's results as JSON on
standard output.

The simplest filter is ., which copies jq's input to its output
unmodified (except for formatting, but note that IEEE754 is used
for number representation internally, with all that that implies).

For more advanced filters see the jq(1) manpage ("man jq")
and/or https://stedolan.github.io/jq
...

If you get a jq not found error, you can install it.

Debian based systems

sudo apt install jq

Red Hat based systems

sudo yum install jq
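
In a script, it can help to check for jq before using it. The snippet below is a minimal sketch of such a check; the variable name jq_status is only illustrative. On macOS with Homebrew, the usual install command is brew install jq, and newer Red Hat based systems use dnf in place of yum.

```shell
# Check whether jq is on the PATH before relying on it in a script.
if command -v jq >/dev/null 2>&1; then
    jq_status="installed"
    jq --version
else
    jq_status="missing"
    echo "jq not found; install it with your package manager" >&2
fi
```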

jq filters

A jq program is a filter: it takes an input and produces an output. Some filters produce a single result and others produce multiple results. jq filters run on streams of JSON data.

By combining filters, we can perform various operations and transformations to our JSON data and get the information we want.
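
Here is a small sketch of both ideas, using made-up inline input. The first command feeds jq a stream of two JSON objects, and the filter runs once per object. The second combines two filters with the pipe operator inside the jq program.

```shell
# A filter runs once for every JSON value in the input stream.
stream_out=$(printf '{"n": 1}\n{"n": 2}\n' | jq '.n')
echo "$stream_out"
# prints 1 and 2 on separate lines

# Filters are combined with | inside the jq program:
# take the planets array, then compute its length.
count=$(echo '{"planets": ["mercury", "venus", "earth", "mars"]}' | jq '.planets | length')
echo "$count"
# prints 4
```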

Inline JSON string

We will use this JSON string as an input:

{"first_name": "Arul", "last_name": "John"}

We will send this JSON string as input to the jq program and have it print the whole object, with the keys first_name and last_name and their values "Arul" and "John". This is what we see.

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.'
{
  "first_name": "Arul",
  "last_name": "John"
}

External JSON file

If we have an external JSON file, we can specify the JSON file as an argument. For example, if our JSON file is called astronaut.json and contains this:

{
    "first_name": "Neil",
    "last_name": "Armstrong",
    "university": "Purdue University",
    "time_in_space": "8 days, 14 hours, 12 minutes and 30 seconds"
}

This should print out the keys and their values:

jq '.' astronaut.json

OUTPUT:

{
  "first_name": "Neil",
  "last_name": "Armstrong",
  "university": "Purdue University",
  "time_in_space": "8 days, 14 hours, 12 minutes and 30 seconds"
}
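
If you only want the key names and not the values, jq has a keys filter. The sketch below recreates astronaut.json in a temporary directory (the path is just for this example) and lists its keys. Note that keys returns the names sorted, not in file order.

```shell
# Recreate the sample file in a temporary directory for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/astronaut.json" <<'EOF'
{
    "first_name": "Neil",
    "last_name": "Armstrong",
    "university": "Purdue University",
    "time_in_space": "8 days, 14 hours, 12 minutes and 30 seconds"
}
EOF

# keys returns the key names as a sorted array; -c prints it on one line.
key_names=$(jq -c 'keys' "$tmpdir/astronaut.json")
echo "$key_names"
# prints ["first_name","last_name","time_in_space","university"]

# Clean up the temporary directory.
rm -r "$tmpdir"
```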

Getting key values using the .field operator

To get a key value, we use '.field' where field is the key or property. There are several options for printing values of properties. You can print values at root level, values in an array, values in a nested structure, and so on.

To get a root property value

For example, to get the first name in this JSON string, we do this:

echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name'

OUTPUT:

"Arul"

To remove the double quotes

To remove the annoying double quotes, you can pipe the output through tr, which deletes every double quote it sees:

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name' | tr -d '"'
Arul

To remove the double quotes, you can also use jq -r instead of chaining it with tr:

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '.first_name'
Arul
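
The two approaches are not always equivalent. The sketch below uses a made-up value that contains double quotes of its own: tr deletes those too and leaves the JSON escape backslashes behind, while jq -r decodes the string properly.

```shell
# A value that itself contains double quotes (escaped in JSON).
json='{"quote": "He said \"hi\""}'

# tr removes every double quote, including the ones inside the value.
with_tr=$(printf '%s\n' "$json" | jq '.quote' | tr -d '"')
echo "$with_tr"
# prints He said \hi\

# -r prints the decoded string, so the inner quotes survive.
with_r=$(printf '%s\n' "$json" | jq -r '.quote')
echo "$with_r"
# prints He said "hi"
```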

To print multiple values, just separate each field with a comma:

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name,.last_name' | tr -d '"'
Arul
John

By default, jq prints each value on a separate line.

If we want to print multiple values on the same line, we use -r with string interpolation, like this:

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '"\(.first_name) \(.last_name)"'
Arul John

Another way to print multiple values on the same line is by putting the values in an array and using @tsv. In this case, the values are separated by a tab character, so you get tab-separated output.

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '[.first_name, .last_name] | @tsv'
Arul    John
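
jq also has a @csv format that works the same way. This is a small sketch with the same input; note that @csv wraps string values in double quotes, even with -r.

```shell
# Put the values in an array and format the array as one CSV row.
csv_line=$(echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '[.first_name, .last_name] | @csv')
echo "$csv_line"
# prints "Arul","John"
```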

To print multiple property values as an array, enclose the filter in square brackets like this:

$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '[.first_name, .last_name]' | tr -d '"'
[
  Arul,
  John
]

If the JSON string contains a nested structure, you can chain the properties:

echo '{"animal": {"dog": "Husky", "cat": "Tigre"}}' | jq '.animal.dog'

OUTPUT:

$ echo '{"animal": {"dog": "Husky", "cat": "Tigre"}}' | jq '.animal.dog'
"Husky"

If a key has spaces or special characters, we wrap the property name in quotes. If we want the value of "@odata.count":

$ echo '{"@odata.context":"$metadata#Products","@odata.count":347,"@odata.nextLink": null}' | jq '."@odata.count"'
347
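
The same key can also be written with bracket syntax, which some people find easier to read. The sketch below shows that form, plus the // operator as one possible way to supply a fallback when a value is null. The fallback text "none" is just an example.

```shell
json='{"@odata.context":"$metadata#Products","@odata.count":347,"@odata.nextLink": null}'

# Bracket syntax is equivalent to ."@odata.count".
count=$(printf '%s\n' "$json" | jq '.["@odata.count"]')
echo "$count"
# prints 347

# // returns the right-hand side when the left-hand side is null or false.
next_link=$(printf '%s\n' "$json" | jq -r '.["@odata.nextLink"] // "none"')
echo "$next_link"
# prints none
```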

If the JSON string contains an array, we iterate using .[].

echo '["mercury", "venus", "earth", "mars"]' | jq ".[]"

OUTPUT:

"mercury"
"venus"
"earth"
"mars"

If you want to print just a specific value from the JSON array, specify its index. Array indexes start at 0, so the second value, venus, is at index 1. You would do this:

$ echo '["mercury", "venus", "earth", "mars"]' | jq ".[1]" | tr -d '"'
venus

If the JSON string contains an array and you just want to find the length of the array:

echo '["mercury", "venus", "earth", "mars"]' | jq "length"

OUTPUT:

$ echo '["mercury", "venus", "earth", "mars"]' | jq "length"
4
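
A few more index tricks on the same array, as a short sketch: negative indexes count from the end, first is shorthand for .[0], and an index past the end of the array gives null instead of an error.

```shell
planets='["mercury", "venus", "earth", "mars"]'

# Negative indexes count back from the end of the array.
last_planet=$(echo "$planets" | jq -r '.[-1]')
echo "$last_planet"
# prints mars

# first is shorthand for .[0].
first_planet=$(echo "$planets" | jq -r 'first')
echo "$first_planet"
# prints mercury

# An out-of-range index produces null, not an error.
missing=$(echo "$planets" | jq '.[10]')
echo "$missing"
# prints null
```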

Using curl to retrieve and parse a remote JSON file

Many ad platforms and publishers host a sellers.json file so that buyers in digital advertising can discover who the direct sellers and intermediaries are.

Let us use Yahoo's sellers.json file located at https://www.yahoo.com/sellers.json. Here is an excerpt from Yahoo's sellers.json file.

{
    "contact_email": "sellersjson@yahooinc.com",
    "version": 1,
    "ext": {
        "updated": "10-17-2023"
    },
    "identifiers": [
        {
            "name": "TAG-ID",
            "value": "e1a5b5b6e3255540"
        }
    ],
    "sellers": [
        {
            "seller_id": "20459933223",
            "name": "Yahoo! US O&O",
            "domain": "yahooinc.com",
            "seller_type": "PUBLISHER"
        },
        {
            "seller_id": "20764982904",
            "name": "Y! Espanol",
            "domain": "yahooinc.com",
            "seller_type": "PUBLISHER"
        },
        {
            "seller_id": "22978591428",
            "name": "Yahoo! DE O&O",
            "domain": "yahooinc.com",
            "seller_type": "PUBLISHER"
        },
        ...
    ]
}

To find the contact email address, we can read contact_email directly, since it is at the root level.

$ curl -s https://www.yahoo.com/sellers.json | jq '.contact_email' | tr -d '"'
sellersjson@yahooinc.com

Now, to find information about the sellers:

$ curl -s https://www.yahoo.com/sellers.json | jq '.sellers[]' | tr -d '"'
{
  seller_id: 20459933223,
  name: Yahoo! US O&O,
  domain: yahooinc.com,
  seller_type: PUBLISHER
}
{
  seller_id: 20764982904,
  name: Y! Espanol,
  domain: yahooinc.com,
  seller_type: PUBLISHER
}
...

Let us drill down to only the sellers' domains. That means going down the hierarchy this way:

sellers[] -> domain

We will use this curl and jq command:

$ curl -s https://www.yahoo.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
yahooinc.com
yahooinc.com
yahooinc.com
...
mcclatchy.com
news.co.uk
contentiq.com
samsung.com
warnermedia.com
samsung.com
crunchyroll.com
fubo.tv
Accuweather.com
www.digitaltrends.com
cafemedia.com

If we want both the sellers' name and domain on the same line:

$ curl -s https://www.yahoo.com/sellers.json | jq -r '.sellers[] | "\(.name) \(.domain)"'
Yahoo! US O&O yahooinc.com
Y! Espanol yahooinc.com
Yahoo! DE O&O yahooinc.com
Yahoo! AR O&O yahooinc.com
...
Crunchyroll crunchyroll.com
FuboTV fubo.tv
AccuWeather Accuweather.com
Digital Trends www.digitaltrends.com
CafeMedia cafemedia.com

Get photos from NASA's Perseverance Rover

NASA has a free API to get URLs for photos taken by the Mars Perseverance rover.

The JSON output from [the free NASA API for the Perseverance Rover](https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY) has this format:

{
   "photos":[
      {
         "id":1210215,
         "sol":1000,
         "camera":{
            "id":38,
            "name":"NAVCAM_LEFT",
            "rover_id":8,
            "full_name":"Navigation Camera - Left"
         },
         "img_src":"https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg",
         "earth_date":"2023-12-12",
         "rover":{
            "id":8,
            "name":"Perseverance",
            "landing_date":"2021-02-18",
            "launch_date":"2020-07-30",
            "status":"active",
            "max_sol":1022,
            "max_date":"2024-01-04",
            "total_photos":192971,
            "cameras":[
               {
                  "name":"EDL_RUCAM",
                  "full_name":"Rover Up-Look Camera"
               },
               {
                  "name":"EDL_DDCAM",
                  "full_name":"Descent Stage Down-Look Camera"
               },
               {
                  "name":"EDL_PUCAM1",
                  "full_name":"Parachute Up-Look Camera A"
               },
               {
                  "name":"EDL_PUCAM2",
                  "full_name":"Parachute Up-Look Camera B"
               },
               {
                  "name":"NAVCAM_LEFT",
                  "full_name":"Navigation Camera - Left"
               },
               {
                  "name":"NAVCAM_RIGHT",
                  "full_name":"Navigation Camera - Right"
               },
               {
                  "name":"MCZ_RIGHT",
                  "full_name":"Mast Camera Zoom - Right"
               },
               {
                  "name":"MCZ_LEFT",
                  "full_name":"Mast Camera Zoom - Left"
               },
               {
                  "name":"FRONT_HAZCAM_LEFT_A",
                  "full_name":"Front Hazard Avoidance Camera - Left"
               },
               {
                  "name":"FRONT_HAZCAM_RIGHT_A",
                  "full_name":"Front Hazard Avoidance Camera - Right"
               },
               {
                  "name":"REAR_HAZCAM_LEFT",
                  "full_name":"Rear Hazard Avoidance Camera - Left"
               },
               {
                  "name":"REAR_HAZCAM_RIGHT",
                  "full_name":"Rear Hazard Avoidance Camera - Right"
               },
               {
                  "name":"EDL_RDCAM",
                  "full_name":"Rover Down-Look Camera"
               },
               {
                  "name":"SKYCAM",
                  "full_name":"MEDA Skycam"
               },
               {
                  "name":"SHERLOC_WATSON",
                  "full_name":"SHERLOC WATSON Camera"
               },
               {
                  "name":"SUPERCAM_RMI",
                  "full_name":"SuperCam Remote Micro Imager"
               },
               {
                  "name":"LCAM",
                  "full_name":"Lander Vision System Camera"
               }
            ]
         }
      }
      ,
    ...
   ]
}

We can see that the photos key holds an array. To save space, only the first element of photos is shown above. Look carefully at the JSON object; there is a lot more information in it that you can use. The image URL is in the img_src key, under this hierarchy:

photos[] -> img_src

The first value of img_src is this:

https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg

Let us see if it is a valid image using the <img> tag.

[Image: NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg]

Yes, it is a beautiful photo of the Mars surface!

To capture the URLs from the JSON, we can use curl and jq:

$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq -r '.photos[] | .img_src'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716641_612ECM_N0474404NCAM03000_10_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_07_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_10_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717806_175ECM_N0474404NCAM12000_04_195J01_1200.jpg
...
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755699148_284ECM_N0474374SCAM02000_0010I6J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755709186_192ECM_N0474374SCAM06000_0010I6J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755710735_218ECM_N0474374SCAM06000_0050I6J01_1200.jpg

We just got the URLs of the images the API returned for sol 1000. You can copy the image URLs and paste them into a browser to view them.
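
The response also tells us which camera took each photo, so we can filter on it. The sketch below uses a tiny hand-made sample in the same shape as the API response. The ids and the example.com URLs are placeholders, not real NASA data. It keeps only the photos whose camera name is NAVCAM_LEFT.

```shell
# Hypothetical sample shaped like the API response; URLs are placeholders.
sample='{"photos": [
  {"id": 1, "camera": {"name": "NAVCAM_LEFT"}, "img_src": "https://example.com/a.jpg"},
  {"id": 2, "camera": {"name": "MCZ_RIGHT"}, "img_src": "https://example.com/b.jpg"},
  {"id": 3, "camera": {"name": "NAVCAM_LEFT"}, "img_src": "https://example.com/c.jpg"}
]}'

# Iterate over the photos, keep the NAVCAM_LEFT ones, print their URLs.
navcam_urls=$(printf '%s\n' "$sample" | jq -r '.photos[] | select(.camera.name == "NAVCAM_LEFT") | .img_src')
echo "$navcam_urls"
# prints https://example.com/a.jpg and https://example.com/c.jpg on separate lines
```

The same filter should work on the real response if you pipe the curl output into jq in place of the sample.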

Accessing a specific image in the Mars Rover JSON data

If you want the third image in the Mars JSON data, use its index. Array indexes start at 0, so the index is always one less than the human count, and the third image is at index 2. So we ask for .photos[2]

$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq -r '.photos[2] | .img_src'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_07_195J01_1200.jpg

Accessing images in a range

If you want the 5th, 6th and 7th images in the Mars JSON data, you have to slice the array. The slice .photos[4:7] starts at index 4 and stops just before index 7, so it returns the elements at indexes 4, 5 and 6.

$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq -r '.photos[4:7] | .[] | .img_src'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717806_175ECM_N0474404NCAM12000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717720_847ECM_N0474404NCAM12000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716641_612ECM_N0474404NCAM03000_01_195J01_1200.jpg
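
Slices are easier to see on a small array. Here is a short sketch using the planets array from earlier. Either end of the slice can be left out, and negative numbers count from the end.

```shell
planets='["mercury", "venus", "earth", "mars"]'

# From index 1 up to, but not including, index 3.
middle=$(echo "$planets" | jq -c '.[1:3]')
echo "$middle"
# prints ["venus","earth"]

# Leaving out the start means "from the beginning".
first_two=$(echo "$planets" | jq -c '.[:2]')
echo "$first_two"
# prints ["mercury","venus"]

# A negative start counts back from the end.
last_two=$(echo "$planets" | jq -c '.[-2:]')
echo "$last_two"
# prints ["earth","mars"]
```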

More sellers.json

Let us circle back to sellers.json. If you want to check which websites are using MediaVine ads, you can run curl and jq on MediaVine's sellers.json file.

$ curl -s https://www.mediavine.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
mediavine.com
extremecouponingmom.ca
sociomix.com
gloryofthesnow.com
easysewingforbeginners.com
wild-bird-watching.com
livforcake.com
ihearteating.com
theflavorbender.com
ProRec.com
suncatcherstudio.com
happyhappynester.com
thetravelpockets.com
dinkel-und-beeren.de
frugalreality.com
aliceandlois.com

Similarly, if you want to find which websites use Ezoic, you can do this:

$ curl -s https://www.ezoic.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
learning-styles-online.com
aconsciousrethink.com
hymnlyrics.org
timelines.ws
stresslesscountry.com
itechtics.com
shopsleuth.com
quartzimodo.com
stickyball.net
free-sample-letter.com
puertovallarta.net
epainassist.com
poemsforfree.com

Here's an exercise. Try printing the names and domains of all the publishers of NitroPay. This is their sellers.json file:

https://nitropay.com/sellers.json

Answer
$ curl -s https://nitropay.com/sellers.json | jq '.sellers[] | "\(.name) \(.domain)"' | tr -d '"'
GG Software LP ggsoftware.io
Alayton Norgard tanks.gg
Hirin Volodymyr gameplay.tips
Rasmus Kromann-Larsen poe.ninja
Chucklefish starbounder.org
EDHREC edhrec.com
Carl Yangsheng teamfortress.tv
42Bytes warframe.market
Dan Leveille dododex.com
Andrew Tsai pcgamingwiki.com
VG Resource vg-resource.com
  

Conclusion

This article will be updated with more fun stuff. Feel free to contact me if you have any questions or suggestions.

Last Updated: October 10, 2024.     This post was originally written on January 06, 2024.