Goal - Human Readable, Machine Parsable
Specification: https://www.json.org/
JSON (short for JavaScript Object Notation) is a format for sharing data.
JSON is derived from the JavaScript programming language.
It is available for use from many languages, including Python.
Files that store JSON usually have the .json extension.
# Sample JSON below from https://json.org/example.html
# Question: why is syntax highlighting working properly here? :)
{"widget": {
    "debug": "on",
    "window": {
        "title": "Sample Konfabulator Widget",
        "name": "main_window",
        "width": 500,
        "height": 500
    },
    "image": {
        "src": "Images/Sun.png",
        "name": "sun1",
        "hOffset": 250,
        "vOffset": 250,
        "alignment": "center"
    },
    "text": {
        "data": "Click Here",
        "size": 36,
        "style": "bold",
        "name": "text1",
        "hOffset": 250,
        "vOffset": 100,
        "alignment": "center",
        "onMouseUp": "sun1.opacity = (sun1.opacity / 100) * 90;"
    }
}}
{'widget': {'debug': 'on', 'image': {'alignment': 'center', 'hOffset': 250, 'name': 'sun1', 'src': 'Images/Sun.png', 'vOffset': 250}, 'text': {'alignment': 'center', 'data': 'Click Here', 'hOffset': 250, 'name': 'text1', 'onMouseUp': 'sun1.opacity = (sun1.opacity / 100) * 90;', 'size': 36, 'style': 'bold', 'vOffset': 100}, 'window': {'height': 500, 'name': 'main_window', 'title': 'Sample Konfabulator Widget', 'width': 500}}}
# If this were a string starting with { it would be our JSON
mydata = {
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": ["running", "sky diving", "dancing"],
    "age": 43,
    "children": [
        {
            "firstName": "Alice",
            "age": 7
        },
        {
            "firstName": "Bob",
            "age": 13
        }
    ]
}
type(mydata)
dict
mydata
{'age': 43, 'children': [{'age': 7, 'firstName': 'Alice'}, {'age': 13, 'firstName': 'Bob'}], 'firstName': 'Jane', 'hobbies': ['running', 'sky diving', 'dancing'], 'lastName': 'Doe'}
The process of encoding JSON is usually called serialization. This term refers to the transformation of data into a series of bytes (hence serial) to be stored or transmitted across a network. You may also hear the term marshaling, but that’s a whole other discussion. Naturally, deserialization is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.
All we're talking about here is reading and writing. Think of it like this: encoding is for writing data to disk, while decoding is for reading data into memory. (Source: https://realpython.com/python-json/)
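The two directions described above can be tried entirely in memory. This is a minimal sketch; the `record` dictionary is a hypothetical sample made up for the illustration.

```python
import json

# Hypothetical sample record, used only for this illustration
record = {"name": "Jane", "age": 43, "hobbies": ["running", "dancing"]}

# Serialize (encode): Python object -> JSON text
encoded = json.dumps(record)

# Deserialize (decode): JSON text -> Python object
decoded = json.loads(encoded)

print(type(encoded).__name__)   # str
print(decoded == record)        # True
```

For simple data like this the round trip gives back an equal object.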
import json
with open("data_file.json", "w") as write_file:
    json.dump(mydata, write_file)
# use json string in our program
json_string = json.dumps(mydata)
json_string
'{"firstName": "Jane", "lastName": "Doe", "hobbies": ["running", "sky diving", "dancing"], "age": 43, "children": [{"firstName": "Alice", "age": 7}, {"firstName": "Bob", "age": 13}]}'
type(json_string)
str
# Above, the example JSON and the Python object have almost the same syntax, but there are some differences
Simple Python objects are translated to JSON according to a fairly intuitive conversion.
Python              JSON
dict                object
list, tuple         array
str                 string
int, long, float    number
True                true
False               false
None                null
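A quick way to see the table in action is to serialize one value of each kind and read it back. This is a small sketch; the `sample` dictionary and its keys are hypothetical. It also shows two things the table does not spell out: a tuple comes back as a list, and a non-string dictionary key is written as a string.

```python
import json

# Hypothetical values chosen to exercise several rows of the table
sample = {
    "a_tuple": (1, 2),
    "a_float": 1.5,
    "a_true": True,
    "a_none": None,
    7: "integer key",
}

text = json.dumps(sample)
print(text)
# {"a_tuple": [1, 2], "a_float": 1.5, "a_true": true, "a_none": null, "7": "integer key"}

restored = json.loads(text)
print(restored["a_tuple"])   # [1, 2]  (the tuple came back as a list)
print("7" in restored)       # True    (the integer key became a string)
```

So a round trip through JSON is not always lossless, which is worth remembering when comparing data before and after saving.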
# The first option most people want to change is whitespace. You can use the indent keyword argument to specify the indentation size for nested structures. Check out the difference for yourself by using mydata, which we defined above, and running the following commands in a console:
json.dumps(mydata)
'{"firstName": "Jane", "lastName": "Doe", "hobbies": ["running", "sky diving", "dancing"], "age": 43, "children": [{"firstName": "Alice", "age": 7}, {"firstName": "Bob", "age": 13}]}'
# very useful for visibility!
print(json.dumps(mydata, indent=4))
{
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": [
        "running",
        "sky diving",
        "dancing"
    ],
    "age": 43,
    "children": [
        {
            "firstName": "Alice",
            "age": 7
        },
        {
            "firstName": "Bob",
            "age": 13
        }
    ]
}
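Besides indent, json.dumps takes a couple of other formatting arguments that may be handy. The sketch below uses `separators` for a compact form and `sort_keys` for a stable key order; the `point` dictionary is a hypothetical example.

```python
import json

# Hypothetical small record for illustration
point = {"y": 2, "x": 1}

# Drop the spaces after "," and ":" to get the most compact text
compact = json.dumps(point, separators=(",", ":"))
print(compact)   # {"y":2,"x":1}

# Sort keys alphabetically so the output order is predictable
ordered = json.dumps(point, sort_keys=True)
print(ordered)   # {"x": 1, "y": 2}
```

Compact output suits data sent over a network, while indented or sorted output is easier for people to read and compare.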
with open("data_file.json", "w") as write_file:
    json.dump(mydata, write_file, indent=4)
with open("data_file.json", "r") as read_file:
    data = json.load(read_file)
data
{'age': 43, 'children': [{'age': 7, 'firstName': 'Alice'}, {'age': 13, 'firstName': 'Bob'}], 'firstName': 'Jane', 'hobbies': ['running', 'sky diving', 'dancing'], 'lastName': 'Doe'}
type(data)
dict
Keep in mind that this method can return any of the allowed data types from the conversion table. This only matters if you're loading data you haven't seen before. In most cases, the root object will be a dict or a list.
If you have received JSON data from another program, or otherwise have a string of JSON-formatted data in Python, you can deserialize it with loads(), which loads from a string:
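To make the point about return types concrete, here is a small sketch that decodes several valid JSON documents and prints the type of each result. The `documents` list is made up for the illustration.

```python
import json

# Each string below is a valid JSON document with a different root type
documents = ['{"a": 1}', '[1, 2, 3]', '"hello"', '42', '3.14', 'true', 'null']

root_types = [type(json.loads(doc)).__name__ for doc in documents]
print(root_types)
# ['dict', 'list', 'str', 'int', 'float', 'bool', 'NoneType']
```

When the data comes from an unknown source, an isinstance check on the result before indexing into it can save a confusing error later.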
json_string = """
{
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
"""
data = json.loads(json_string)
data
{'researcher': {'name': 'Ford Prefect', 'relatives': [{'name': 'Zaphod Beeblebrox', 'species': 'Betelgeusian'}], 'species': 'Betelgeusian'}}
type(data)
dict
data['researcher']['relatives'][0]['name']
'Zaphod Beeblebrox'
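Strings from other programs are not always valid JSON. A common slip is pasting a printed Python dict, which uses single quotes. The sketch below shows one way to handle that with json.JSONDecodeError; the helper name `parse_or_none` and the sample strings are hypothetical.

```python
import json

# Hypothetical malformed input: JSON requires double quotes around strings
bad_json = "{'name': 'Ford Prefect'}"

def parse_or_none(text):
    """Return the decoded object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # The exception carries the position of the problem
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None

print(parse_or_none(bad_json))                     # None (after the error message)
print(parse_or_none('{"name": "Ford Prefect"}'))   # {'name': 'Ford Prefect'}
```

Returning None is just one choice. In a real program it may be better to let the exception propagate so the caller knows the data was bad.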
import json
import requests
## Let's get some data from https://jsonplaceholder.typicode.com/
response = requests.get("https://jsonplaceholder.typicode.com/todos")
todos = json.loads(response.text)
You can also open https://jsonplaceholder.typicode.com/todos in a regular browser.
type(todos)
list
len(todos)
200
todos[:10]
[{'completed': False, 'id': 1, 'title': 'delectus aut autem', 'userId': 1}, {'completed': False, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'userId': 1}, {'completed': False, 'id': 3, 'title': 'fugiat veniam minus', 'userId': 1}, {'completed': True, 'id': 4, 'title': 'et porro tempora', 'userId': 1}, {'completed': False, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'userId': 1}, {'completed': False, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'userId': 1}, {'completed': False, 'id': 7, 'title': 'illo expedita consequatur quia in', 'userId': 1}, {'completed': True, 'id': 8, 'title': 'quo adipisci enim quam ut ab', 'userId': 1}, {'completed': False, 'id': 9, 'title': 'molestiae perspiciatis ipsa', 'userId': 1}, {'completed': True, 'id': 10, 'title': 'illo est ratione doloremque quia maiores aut', 'userId': 1}]
myl = [('Valdis', 40), ('Alice',35), ('Bob', 23),('Carol',70)]
# A lambda is an anonymous function
def myfun(el):
    return el[1]
# the def above is equivalent to: myfun = lambda el: el[1]
sorted(myl, key = lambda el: el[1], reverse=True)
[('Carol', 70), ('Valdis', 40), ('Alice', 35), ('Bob', 23)]
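The same sort can also be written without a lambda by using operator.itemgetter from the standard library. This is only an alternative spelling of the cell above, reusing the same sample list.

```python
from operator import itemgetter

# Same sample data as the cell above
myl = [('Valdis', 40), ('Alice', 35), ('Bob', 23), ('Carol', 70)]

# itemgetter(1) builds a function that returns element 1 of its argument
by_age_desc = sorted(myl, key=itemgetter(1), reverse=True)
print(by_age_desc)
# [('Carol', 70), ('Valdis', 40), ('Alice', 35), ('Bob', 23)]
```

Which form to use is mostly a matter of taste; the lambda is more flexible when the key needs some computation.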
# Exercise: find the top 3 users with the most completed tasks!
# TIPS
# We need some sort of structure to store per-user results before finding the top 3.
# There are at least two good data structure choices here :)
# The simplest might actually be the best if we consider the userId values.
users = [ el['userId'] for el in todos]
len(users),users[:15]
(200, [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
uniqusers = set(users)
uniqusers
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
# dictionary comprehension but could live without one
users = { el['userId'] : 0 for el in todos}
users
{1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0}
users.keys()
dict_keys([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
users.values()
#{'completed': True,
# 'id': 8,
# 'title': 'quo adipisci enim quam ut ab',
# 'userId': 1}
# Idiomatic: a Boolean counts as 0 (False) or 1 (True), though this might not be too readable
for el in todos:
    users[el['userId']] += el['completed']
# Equivalent alternative (run one loop or the other, not both, or the counts double);
# an explicit if could be useful in more complicated cases
for el in todos:
    if el['completed']:
        users[el['userId']] += 1
# there could also be a one-liner, or a solution using collections.Counter
users.items()
dict_items([(1, 11), (2, 8), (3, 7), (4, 6), (5, 12), (6, 6), (7, 9), (8, 11), (9, 8), (10, 12)])
list(users.items())
[(1, 11), (2, 8), (3, 7), (4, 6), (5, 12), (6, 6), (7, 9), (8, 11), (9, 8), (10, 12)]
userlist=list(users.items())
type(userlist[0])
tuple
# we pass an anonymous (lambda) function as the sort key
sorted(userlist, key=lambda el: el[1], reverse=True)[:3]
[(5, 12), (10, 12), (1, 11)]
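The Counter idea mentioned earlier could look roughly like this. To keep the sketch runnable without a network call it uses `sample_todos`, a hypothetical miniature stand-in for the real todos list with only the two fields that matter here.

```python
from collections import Counter

# Hypothetical miniature stand-in for the todos list fetched above
sample_todos = [
    {"userId": 1, "completed": True},
    {"userId": 1, "completed": False},
    {"userId": 2, "completed": True},
    {"userId": 2, "completed": True},
    {"userId": 3, "completed": True},
    {"userId": 3, "completed": True},
    {"userId": 3, "completed": True},
    {"userId": 4, "completed": False},
]

# Count one entry per completed task, keyed by userId
completed_counts = Counter(t["userId"] for t in sample_todos if t["completed"])

# most_common(n) returns the n highest (key, count) pairs
print(completed_counts.most_common(3))
# [(3, 3), (2, 2), (1, 1)]
```

Note that a user with no completed tasks (userId 4 here) never appears in the Counter at all, unlike the dictionary approach where every user starts at 0.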
# let's try a simple way: a list indexed by userId (index 0 stays unused)
mylist = [0] * 11
for el in todos:
    if el['completed']:
        mylist[el['userId']] += 1
mylist
[0, 11, 8, 7, 6, 12, 6, 9, 11, 8, 12]
mylist.index(max(mylist))
5
# kind of hard to get more than the single top value this way, we need to get tricky
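One possible trick is to pair every count with its index using enumerate and then sort the pairs. The sketch below copies the counts from the output above so it runs on its own; it is one way to do it, not the only one.

```python
# Completed-task counts indexed by userId, copied from the output above
# (index 0 is unused because userId values start at 1)
mylist = [0, 11, 8, 7, 6, 12, 6, 9, 11, 8, 12]

# Pair each count with its index (the userId), then sort by count
pairs = list(enumerate(mylist))          # [(0, 0), (1, 11), (2, 8), ...]
top3 = sorted(pairs, key=lambda p: p[1], reverse=True)[:3]
print(top3)
# [(5, 12), (10, 12), (1, 11)]
```

This gives the same top 3 as the dictionary solution, because sorted keeps tied entries in their original order.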