JSON has become the default structured data format for most APIs and web services. Sometimes you just want to call a web service that returns JSON from the command line and print out a few specific values. Bash does not have a built-in JSON library, so we use a utility called jq to parse the JSON text. jq is a lightweight command-line JSON processor.
In this tutorial, you will learn how to read JSON input with jq and combine it with other utilities like curl to read and parse remote JSON objects. You will learn how to retrieve images from the Mars Perseverance Rover using the free API from NASA, and lists of AdOps publishers using their JSON feeds.
Table of Contents
- Where is jq?
- jq filters
- Inline JSON string
- External JSON file
- Getting key values using the .field operator
- To get a root property value
- To remove the double quotes
- Print multiple property values
- Print multiple property values on the same line
- Print multiple property values as an array
- Print property values from a nested structure
- Print property values where the key has spaces or special characters
- Print values from a JSON array
- Using curl to retrieve and parse a remote JSON file
- Get photos from NASA's Perseverance Rover
- Accessing a specific image in the Mars Rover JSON data
- Accessing images in a range
- More sellers.json
- Conclusion
Where is jq?
jq is available on most Linux distributions and on macOS. You can check whether it is installed by running jq on the command line.
OUTPUT:
$ jq
jq - commandline JSON processor [version 1.6]
Usage: jq [options] <jq filter> [file...]
jq [options] --args <jq filter> [strings...]
jq [options] --jsonargs <jq filter> [JSON_TEXTS...]
jq is a tool for processing JSON inputs, applying the given filter to
its JSON text inputs and producing the filter's results as JSON on
standard output.
The simplest filter is ., which copies jq's input to its output
unmodified (except for formatting, but note that IEEE754 is used
for number representation internally, with all that that implies).
For more advanced filters see the jq(1) manpage ("man jq")
and/or https://stedolan.github.io/jq
...
If you get a "command not found" error for jq, you can install it.
Debian based systems
sudo apt install jq
Red Hat based systems
sudo yum install jq
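A script can make the same check before it relies on jq. The following is a minimal sketch, assuming a POSIX shell; the helper name have_command is made up for this example.

```shell
# Hypothetical helper: succeeds if the named command is on the PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command jq; then
  jq --version    # prints the installed version, for example jq-1.6
else
  echo "jq is not installed; install it with your package manager" >&2
fi
```

Checking up front gives a clearer message than letting a later pipeline fail halfway through.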
jq filters
A jq program is a filter: it takes an input and produces an output. Some filters produce a single result and others produce multiple results. jq filters run on streams of JSON data.
By combining filters, we can perform various operations and transformations to our JSON data and get the information we want.
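As a small illustration of combining filters, the sketch below joins two filters with the pipe operator. The sample JSON is made up for this example.

```shell
# Sample input, made up for this sketch.
json='{"planets": ["mercury", "venus", "earth", "mars"]}'

# .planets selects the array; | feeds it to length, which counts its elements.
planet_count=$(echo "$json" | jq '.planets | length')
echo "$planet_count"    # 4
```

The | inside the quotes belongs to jq, not to the shell: it passes the output of one jq filter to the next.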
Inline JSON string
We will use this JSON string as an input:
{"first_name": "Arul", "last_name": "John"}
We will send this JSON string as input to the jq program and have it print the whole object, that is, the keys first_name and last_name with their values "Arul" and "John". This is what we see.
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.'
{
"first_name": "Arul",
"last_name": "John"
}
External JSON file
If we have an external JSON file, we can specify the JSON file as an argument. For example, if our JSON file is called astronaut.json and contains this:
{
"first_name": "Neil",
"last_name": "Armstrong",
"university": "Purdue University",
"time_in_space": "8 days, 14 hours, 12 minutes and 30 seconds"
}
This should print out the keys and their values:
jq '.' astronaut.json
OUTPUT:
{
"first_name": "Neil",
"last_name": "Armstrong",
"university": "Purdue University",
"time_in_space": "8 days, 14 hours, 12 minutes and 30 seconds"
}
Getting key values using the .field operator
To get a key value, we use '.field' where field is the key or property. There are several options for printing values of properties. You can print values at root level, values in an array, values in a nested structure, and so on.
To get a root property value
For example, to get the first name in this JSON string, we do this:
echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name'
OUTPUT:
"Arul"
To remove the double quotes
To remove the annoying double quotes, you can use a tr command:
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name' | tr -d '"'
Arul
To remove the double quotes, you can also use jq -r (raw output) instead of chaining it with tr:
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '.first_name'
Arul
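Raw output is especially handy when the value should end up in a shell variable. This is a minimal sketch using the same JSON string; the variable names are only illustrative.

```shell
json='{"first_name": "Arul", "last_name": "John"}'

# -r strips the JSON quotes, so the variable holds Arul rather than "Arul".
first_name=$(echo "$json" | jq -r '.first_name')

greeting="Hello, $first_name"
echo "$greeting"    # Hello, Arul
```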
Print multiple property values
To print multiple values, just separate each field with a comma:
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '.first_name,.last_name' | tr -d '"'
Arul
John
By default, it prints each value on a separate line.
Print multiple property values on the same line
We noticed that when we print multiple values, they print on separate lines.
If we want to print multiple values on the same line, we use -r and string interpolation, like this:
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '"\(.first_name) \(.last_name)"'
Arul John
Another way to print multiple values on the same line is to put the values in an array and use @tsv. In this case, the values are separated by a tab character, so you get tab-separated output.
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq -r '[.first_name, .last_name] | @tsv'
Arul John
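If comma-separated output suits you better, jq also has an @csv format that works the same way on an array. A short sketch with the same input:

```shell
json='{"first_name": "Arul", "last_name": "John"}'

# @csv joins the array elements with commas and wraps string values in double quotes.
csv_line=$(echo "$json" | jq -r '[.first_name, .last_name] | @csv')
echo "$csv_line"    # "Arul","John"
```

The quotes in this output are part of the CSV format, so values containing commas stay unambiguous.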
Print multiple property values as an array
To print multiple property values as an array, enclose the filter with square brackets like this:
$ echo '{"first_name": "Arul", "last_name": "John"}' | jq '[.first_name, .last_name]' | tr -d '"'
[
Arul,
John
]
Print property values from a nested structure
If the JSON string contains a nested structure, you can chain the properties:
echo '{"animal": {"dog": "Husky", "cat": "Tigre"}}' | jq '.animal.dog'
OUTPUT:
$ echo '{"animal": {"dog": "Husky", "cat": "Tigre"}}' | jq '.animal.dog'
"Husky"
Print property values where the key has spaces or special characters
If a key has spaces or special characters, we wrap the property name in quotes. If we want the value of "@odata.count":
$ echo '{"@odata.context":"$metadata#Products","@odata.count":347,"@odata.nextLink": null}' | jq '."@odata.count"'
347
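The same key can also be reached with bracket syntax, which some people find easier to read when the key contains dots. A minimal sketch with the same input:

```shell
json='{"@odata.context":"$metadata#Products","@odata.count":347,"@odata.nextLink": null}'

# .["key"] is equivalent to ."key" for keys with special characters.
count=$(echo "$json" | jq '.["@odata.count"]')
echo "$count"    # 347
```

The JSON is kept in single quotes so that the shell does not try to expand $metadata.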
Print values from a JSON array
If the JSON string contains an array, we iterate over its elements using .[].
echo '["mercury", "venus", "earth", "mars"]' | jq ".[]"
OUTPUT:
"mercury"
"venus"
"earth"
"mars"
If you want to print just a specific value from the JSON array, use its index. Indexes start at 0, so to print the second value, venus, the index is 1. You would do this:
echo '["mercury", "venus", "earth", "mars"]' | jq ".[1]" | tr -d '"'
venus
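Negative indexes count from the end of the array, which helps when you want the last element without knowing the length. A short sketch on the same array:

```shell
planets='["mercury", "venus", "earth", "mars"]'

# -1 is the last element, -2 the one before it.
last_planet=$(echo "$planets" | jq -r '.[-1]')
second_last=$(echo "$planets" | jq -r '.[-2]')
echo "$last_planet $second_last"    # mars earth
```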
If the JSON string contains an array and you just want to find the length of the array:
echo '["mercury", "venus", "earth", "mars"]' | jq "length"
OUTPUT:
$ echo '["mercury", "venus", "earth", "mars"]' | jq "length"
4
Using curl to retrieve and parse a remote JSON file
Some websites publish a sellers.json file so that buyers in digital advertising can discover who the direct sellers or intermediaries are.
Let us use Yahoo's sellers.json file, located at https://www.yahoo.com/sellers.json. This is a chunk of that file.
{
"contact_email": "sellersjson@yahooinc.com",
"version": 1,
"ext": {
"updated": "10-17-2023"
},
"identifiers": [
{
"name": "TAG-ID",
"value": "e1a5b5b6e3255540"
}
],
"sellers": [
{
"seller_id": "20459933223",
"name": "Yahoo! US O&O",
"domain": "yahooinc.com",
"seller_type": "PUBLISHER"
},
{
"seller_id": "20764982904",
"name": "Y! Espanol",
"domain": "yahooinc.com",
"seller_type": "PUBLISHER"
},
{
"seller_id": "22978591428",
"name": "Yahoo! DE O&O",
"domain": "yahooinc.com",
"seller_type": "PUBLISHER"
},
...
]
}
If we want to find the contact email address, we do this. It is pretty straightforward since contact_email is at the root level.
$ curl -s https://www.yahoo.com/sellers.json | jq '.contact_email' | tr -d '"'
sellersjson@yahooinc.com
Now, to find information about the sellers:
$ curl -s https://www.yahoo.com/sellers.json | jq '.sellers[]' | tr -d '"'
{
seller_id: 20459933223,
name: Yahoo! US O&O,
domain: yahooinc.com,
seller_type: PUBLISHER
}
{
seller_id: 20764982904,
name: Y! Espanol,
domain: yahooinc.com,
seller_type: PUBLISHER
}
...
Let us drill this down to only the sellers' domains. That means, we will go down the hierarchy this way:
sellers[] -> domain
We will use this curl and jq command:
$ curl -s https://www.yahoo.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
yahooinc.com
yahooinc.com
yahooinc.com
...
mcclatchy.com
news.co.uk
contentiq.com
samsung.com
warnermedia.com
samsung.com
crunchyroll.com
fubo.tv
Accuweather.com
www.digitaltrends.com
cafemedia.com
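The list above repeats some domains. One way to collapse the duplicates inside jq is to collect the values into an array and pass it through unique. The sketch below uses a small inline sample in the same shape as the sellers array instead of the live file.

```shell
# Small sample in the same shape as sellers.json, trimmed for this sketch.
sellers_json='{"sellers":[
  {"name":"Yahoo! US O&O","domain":"yahooinc.com"},
  {"name":"Y! Espanol","domain":"yahooinc.com"},
  {"name":"FuboTV","domain":"fubo.tv"}
]}'

# [ ... ] collects the domains into an array; unique sorts it and drops duplicates;
# .[] prints one domain per line again.
unique_domains=$(echo "$sellers_json" | jq -r '[.sellers[].domain] | unique | .[]')
echo "$unique_domains"
```

Piping the original output through sort -u would give a similar result; doing it in jq keeps everything in one filter.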
If we want both the sellers' name and domain on the same line:
$ curl -s https://www.yahoo.com/sellers.json | jq -r '.sellers[] | "\(.name) \(.domain)"'
Yahoo! US O&O yahooinc.com
Y! Espanol yahooinc.com
Yahoo! DE O&O yahooinc.com
Yahoo! AR O&O yahooinc.com
...
Crunchyroll crunchyroll.com
FuboTV fubo.tv
AccuWeather Accuweather.com
Digital Trends www.digitaltrends.com
CafeMedia cafemedia.com
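Because seller names contain spaces, a space-separated line is hard to split back into name and domain. A possible alternative is tab-separated output, which tools like cut can split by default. This sketch uses a small inline sample rather than the live file.

```shell
# Small sample in the same shape as sellers.json, trimmed for this sketch.
sellers_json='{"sellers":[
  {"name":"Yahoo! US O&O","domain":"yahooinc.com"},
  {"name":"FuboTV","domain":"fubo.tv"}
]}'

# One tab-separated row per seller: name, then domain.
rows=$(echo "$sellers_json" | jq -r '.sellers[] | [.name, .domain] | @tsv')

# cut splits on tabs by default, so the spaces inside the name are harmless.
first_domain=$(echo "$rows" | head -n 1 | cut -f2)
echo "$first_domain"    # yahooinc.com
```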
Get photos from NASA's Perseverance Rover
NASA has a free API to get URLs for photos taken by the Mars Perseverance rover.
The JSON output from [the free NASA API for the Perseverance Rover](https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY) has this format:
{
"photos":[
{
"id":1210215,
"sol":1000,
"camera":{
"id":38,
"name":"NAVCAM_LEFT",
"rover_id":8,
"full_name":"Navigation Camera - Left"
},
"img_src":"https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg",
"earth_date":"2023-12-12",
"rover":{
"id":8,
"name":"Perseverance",
"landing_date":"2021-02-18",
"launch_date":"2020-07-30",
"status":"active",
"max_sol":1022,
"max_date":"2024-01-04",
"total_photos":192971,
"cameras":[
{
"name":"EDL_RUCAM",
"full_name":"Rover Up-Look Camera"
},
{
"name":"EDL_DDCAM",
"full_name":"Descent Stage Down-Look Camera"
},
{
"name":"EDL_PUCAM1",
"full_name":"Parachute Up-Look Camera A"
},
{
"name":"EDL_PUCAM2",
"full_name":"Parachute Up-Look Camera B"
},
{
"name":"NAVCAM_LEFT",
"full_name":"Navigation Camera - Left"
},
{
"name":"NAVCAM_RIGHT",
"full_name":"Navigation Camera - Right"
},
{
"name":"MCZ_RIGHT",
"full_name":"Mast Camera Zoom - Right"
},
{
"name":"MCZ_LEFT",
"full_name":"Mast Camera Zoom - Left"
},
{
"name":"FRONT_HAZCAM_LEFT_A",
"full_name":"Front Hazard Avoidance Camera - Left"
},
{
"name":"FRONT_HAZCAM_RIGHT_A",
"full_name":"Front Hazard Avoidance Camera - Right"
},
{
"name":"REAR_HAZCAM_LEFT",
"full_name":"Rear Hazard Avoidance Camera - Left"
},
{
"name":"REAR_HAZCAM_RIGHT",
"full_name":"Rear Hazard Avoidance Camera - Right"
},
{
"name":"EDL_RDCAM",
"full_name":"Rover Down-Look Camera"
},
{
"name":"SKYCAM",
"full_name":"MEDA Skycam"
},
{
"name":"SHERLOC_WATSON",
"full_name":"SHERLOC WATSON Camera"
},
{
"name":"SUPERCAM_RMI",
"full_name":"SuperCam Remote Micro Imager"
},
{
"name":"LCAM",
"full_name":"Lander Vision System Camera"
}
]
}
}
,
...
]
}
We can see that the key photos is an array. For the sake of space, we have included only the first element of photos. Look carefully at the JSON object: there is a lot more information in it that you can use. We see that there is an image URL in the key img_src under this hierarchy:
photos[] -> img_src
The first value of img_src is this:
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg
Let us see if it is a valid image using the <img> tag.
Yes, it is a beautiful photo of the Mars surface!
To capture the URLs from the JSON, we can use curl and jq:
$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq -r '.photos[] | .img_src'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717269_050ECM_N0474404NCAM02000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716641_612ECM_N0474404NCAM03000_10_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_07_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_10_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717806_175ECM_N0474404NCAM12000_04_195J01_1200.jpg
...
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755699148_284ECM_N0474374SCAM02000_0010I6J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755709186_192ECM_N0474374SCAM06000_0010I6J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/scam/LRE_1000_0755710735_218ECM_N0474374SCAM06000_0050I6J01_1200.jpg
We just got the URLs of all the images the rover took on sol 1000 (the sol=1000 parameter in the request). You can copy the image URLs and paste them into a browser to view them.
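The other keys in each photo can be pulled out the same way. The sketch below counts the photos and prints each camera name next to its image URL. It runs on a trimmed-down stand-in for the API response, and the example.com URLs are placeholders, not real NASA links.

```shell
# Trimmed-down stand-in for the API response; the URLs are placeholders.
photos_json='{"photos":[
  {"id":1,"camera":{"name":"NAVCAM_LEFT"},"img_src":"https://example.com/navcam_left.jpg"},
  {"id":2,"camera":{"name":"SUPERCAM_RMI"},"img_src":"https://example.com/supercam_rmi.jpg"}
]}'

# How many photos are in the response?
photo_count=$(echo "$photos_json" | jq '.photos | length')

# Camera name (a nested key) and image URL on one line per photo.
labelled=$(echo "$photos_json" | jq -r '.photos[] | "\(.camera.name) \(.img_src)"')

echo "$photo_count"
echo "$labelled"
```

To try this against the real API, replace the echo of the sample with the curl command used above.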
Accessing a specific image in the Mars Rover JSON data
If you want the third image in the Mars JSON data, use its index. Array indexes start at 0, so the third image has index 2 (always one less than the human count). So we ask for .photos[2]:
$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq -r '.photos[2] | .img_src'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716929_050ECM_N0474404NCAM03000_07_195J01_1200.jpg
Accessing images in a range
If you want the 5th, 6th and 7th images in the Mars JSON data, you have to slice the array. The slice .photos[4:7] starts at index 4 and stops before index 7, so it returns the elements at indexes 4, 5 and 6:
$ curl -s "https://api.nasa.gov/mars-photos/api/v1/rovers/perseverance/photos?sol=1000&api_key=DEMO_KEY" | jq '.photos[4:7] | .[] | .img_src' | tr -d '"'
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717806_175ECM_N0474404NCAM12000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755717720_847ECM_N0474404NCAM12000_04_195J01_1200.jpg
https://mars.nasa.gov/mars2020-raw-images/pub/ods/surface/sol/01000/ids/edr/browse/ncam/NLF_1000_0755716641_612ECM_N0474404NCAM03000_01_195J01_1200.jpg
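The slice bounds are easier to see on a small array. In this sketch, the -c option prints each result compactly on one line.

```shell
planets='["mercury", "venus", "earth", "mars"]'

# [1:3] starts at index 1 and stops before index 3.
middle=$(echo "$planets" | jq -c '.[1:3]')
# Leaving out the start means "from the beginning".
first_two=$(echo "$planets" | jq -c '.[:2]')
# A negative start counts from the end of the array.
last_two=$(echo "$planets" | jq -c '.[-2:]')

echo "$middle"       # ["venus","earth"]
echo "$first_two"    # ["mercury","venus"]
echo "$last_two"     # ["earth","mars"]
```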
More sellers.json
Circling back to sellers.json: if you want to check which websites are using MediaVine ads, you can run curl and jq on MediaVine's sellers.json file.
$ curl -s https://www.mediavine.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
mediavine.com
extremecouponingmom.ca
sociomix.com
gloryofthesnow.com
easysewingforbeginners.com
wild-bird-watching.com
livforcake.com
ihearteating.com
theflavorbender.com
ProRec.com
suncatcherstudio.com
happyhappynester.com
thetravelpockets.com
dinkel-und-beeren.de
frugalreality.com
aliceandlois.com
Similarly, if you want to find which websites sell ads through Ezoic, you can do this:
$ curl -s https://www.ezoic.com/sellers.json | jq '.sellers[] | .domain' | tr -d '"'
learning-styles-online.com
aconsciousrethink.com
hymnlyrics.org
timelines.ws
stresslesscountry.com
itechtics.com
shopsleuth.com
quartzimodo.com
stickyball.net
free-sample-letter.com
puertovallarta.net
epainassist.com
poemsforfree.com
Here's an exercise. Try printing the names and domains of all the publishers of NitroPay. This is their sellers.json file:
https://nitropay.com/sellers.json
Answer
$ curl -s https://nitropay.com/sellers.json | jq '.sellers[] | "\(.name) \(.domain)"' | tr -d '"'
GG Software LP ggsoftware.io
Alayton Norgard tanks.gg
Hirin Volodymyr gameplay.tips
Rasmus Kromann-Larsen poe.ninja
Chucklefish starbounder.org
EDHREC edhrec.com
Carl Yangsheng teamfortress.tv
42Bytes warframe.market
Dan Leveille dododex.com
Andrew Tsai pcgamingwiki.com
VG Resource vg-resource.com
Conclusion
This article will be updated with more fun stuff. Feel free to contact me if you have any questions or suggestions.