[practice] Web & APIs
JSON from APIs
# theory
real network
Lesson 26 introduced pyfetch. This lesson uses it for actual work: hitting a public REST API, getting JSON, turning it into a DataFrame.
Every example below makes a real network request. The test API is https://jsonplaceholder.typicode.com, a free CORS-friendly stand-in for production APIs. Same shape (users, posts, comments, todos), no auth required.
the shape you'll keep writing
from pyodide.http import pyfetch
import pandas as pd
response = await pyfetch("https://jsonplaceholder.typicode.com/users")
if response.status != 200:
    raise RuntimeError(f"API returned {response.status}")
data = await response.json()
df = pd.DataFrame(data)
Four steps. pyfetch → check status → .json() → pd.DataFrame. That sequence covers most everyday API work.
picking columns
A typical API returns way more fields than you need. Strip down right away:
df = pd.DataFrame(data)[["id", "name", "email"]]
If the response is nested (a list of dicts where one value is itself a dict), use pd.json_normalize to flatten:
from pandas import json_normalize
# Example shape: {"users": [{"id": 1, "address": {"city": "Boston", "zipcode": "..."}}, ...]}
df = json_normalize(data["users"])
# Yields columns: id, address.city, address.zipcode, ...
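To see the flattening without a network call, here is a minimal sketch on a hand-written payload. The names, cities and zip codes are made-up sample values, not real API output; only the shape follows the comment above.

```python
import pandas as pd

# Hypothetical sample payload: same shape as the nested response described above.
data = {
    "users": [
        {"id": 1, "name": "Ann", "address": {"city": "Boston", "zipcode": "02101"}},
        {"id": 2, "name": "Raj", "address": {"city": "Austin", "zipcode": "73301"}},
    ]
}

# Nested dict keys become dotted column names.
df = pd.json_normalize(data["users"])

# The sep argument changes the joiner, handy if dots in column names get in the way.
df_underscore = pd.json_normalize(data["users"], sep="_")

# Flattened columns filter like any other column.
boston_rows = df[df["address.city"] == "Boston"]
```

Once flattened, address.city behaves like any top-level field, so the column-picking step from earlier works on it too.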
counting by foreign key
Many APIs expose two endpoints that join on an id. JSONPlaceholder gives you /posts (each post has a userId field) and /users (each user has an id and a name). Fetch both, merge, aggregate.
posts = pd.DataFrame(await (await pyfetch(".../posts")).json())
users = pd.DataFrame(await (await pyfetch(".../users")).json())
joined = posts.merge(users, left_on="userId", right_on="id")
posts_per_user = joined.groupby("name").size()
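One thing to watch in that merge: both tables have an id column, so pandas renames them with suffixes. The sketch below runs the same join on a few made-up rows (shaped like /posts and /users, not real API output) and sets the suffixes explicitly. The suffix names are an illustrative choice.

```python
import pandas as pd

# Made-up rows shaped like the two endpoints; values are illustrative only.
posts = pd.DataFrame([
    {"id": 101, "userId": 1, "title": "first"},
    {"id": 102, "userId": 1, "title": "second"},
    {"id": 103, "userId": 2, "title": "third"},
])
users = pd.DataFrame([
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Raj"},
])

# Both frames have "id", so name the suffixes instead of accepting id_x / id_y.
joined = posts.merge(
    users, left_on="userId", right_on="id", suffixes=("_post", "_user")
)

# One row per post, so counting rows per name counts posts per user.
posts_per_user = joined.groupby("name").size().sort_values(ascending=False)
```

Grouping by name assumes names are unique. If they might not be, grouping by userId is the safer key.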
handling non-200
Always read response.status before .json(). A 404 or 500 page might still have a JSON body, but it won't have the shape you're expecting.
resp = await pyfetch(".../users/9999")  # doesn't exist
if resp.status == 404:
    print("user not found")
elif resp.status >= 500:
    print("server error, retry later")
else:
    user = await resp.json()
# examples [3]
GET a single user by id. Check status first, then parse JSON.
Fetch posts and users, join on userId, count posts per user. Real two-table API work in 6 lines.
Always check response.status before .json(). Missing resources shouldn't crash the script.
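The status check can be pulled into a small pure function so the branching is easy to test without a network call. This is a sketch, not part of pyfetch: classify_status and its labels are hypothetical names chosen for illustration.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse action label (hypothetical helper)."""
    if 200 <= status < 300:
        return "ok"           # safe to parse the body as the expected JSON
    if status == 404:
        return "not_found"    # resource is missing; skip it, don't crash
    if status >= 500:
        return "retry_later"  # server-side problem
    return "unexpected"       # anything else: body shape is unknown


# Quick check over a few representative codes.
labels = {code: classify_status(code) for code in (200, 404, 503, 403)}
```

In the lesson's flow you would call classify_status(resp.status) and only await resp.json() when it returns "ok". Codes like 400 or 403, which the three-way if/elif/else above would try to parse, land in "unexpected" here.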
# challenges [2]
# project
# project-challenge
thread: Sales Performance Dashboard · reward: 50 xp
# brief
The analytics API returns a nested JSON report with sales grouped by region. Parse this structure to create a flat DataFrame showing regional performance.
# task