[practice] Grouping & Combining
Multi-Column GroupBy
# theory
grouping by multiple columns
df.groupby(["region", "category"])["sales"].sum()
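This returns a Series with a hierarchical index (MultiIndex) that has one entry for each region/category combination that actually occurs in the data. Combinations that never appear are left out by default.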
different aggregations for different columns
df.groupby("category").agg({
"price": "mean",
"quantity": "sum",
"date": "count"
})
named aggregations
df.groupby("category").agg(
avg_price=("price", "mean"),
total_qty=("quantity", "sum"),
num_orders=("date", "count")
)
This gives you descriptive column names in the result.
multi-index results
# After multi-column groupby
result = df.groupby(["region", "category"])["sales"].sum()
# Reset to flat DataFrame
flat = result.reset_index()
# Or unstack for pivot-style view
pivoted = result.unstack()
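A minimal sketch of the two reshaping options above. The sales figures are invented for illustration, and `pivoted_filled` is an extra variable added here to show the `fill_value` option of `unstack`.

```python
import pandas as pd

# Hypothetical data: there is no ("West", "Books") row on purpose
df = pd.DataFrame({
    "region":   ["East", "East", "West"],
    "category": ["Books", "Toys", "Toys"],
    "sales":    [200, 250, 125],
})

result = df.groupby(["region", "category"])["sales"].sum()

# reset_index turns both index levels back into ordinary columns
flat = result.reset_index()
print(flat)

# unstack moves the inner level (category) into the columns;
# combinations missing from the data become NaN
pivoted = result.unstack()
print(pivoted)

# fill_value replaces those gaps, here with 0
pivoted_filled = result.unstack(fill_value=0)
print(pivoted_filled)
```

`reset_index()` tends to suit further filtering or merging, while `unstack()` gives a table that is easier to read side by side.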
size vs count
df.groupby("category").size() # Number of rows in each group (NaN included)
df.groupby("category").count() # Count non-NaN per column
# examples [3]
# example 01 · multi-column GroupBy
Group by two columns at once
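The original code for this example did not survive on the page, so the following is a stand-in sketch rather than the lesson's own code. The DataFrame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration
df = pd.DataFrame({
    "region":   ["East", "East", "East", "West", "West"],
    "category": ["Toys", "Toys", "Books", "Toys", "Toys"],
    "sales":    [100, 150, 200, 50, 75],
})

# Passing a list of columns groups by every observed pair of values
totals = df.groupby(["region", "category"])["sales"].sum()
print(totals)

# A single group is looked up with a tuple of keys
east_toys = totals.loc[("East", "Toys")]
print(east_toys)
```

Since no West/Books row exists in this sample, the result has three groups, not four.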
# example 02 · named aggregations
Give your aggregated columns meaningful names
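The original code for this example is missing from the page. As a hedged stand-in, the sketch below applies the named-aggregation pattern from the theory section to a small hypothetical `orders` table. One `date` is left missing to show that `"count"` skips missing values.

```python
import pandas as pd

# Hypothetical orders, invented for illustration; the last date is missing
orders = pd.DataFrame({
    "category": ["Books", "Books", "Toys", "Toys", "Toys"],
    "price":    [10.0, 20.0, 5.0, 7.0, 9.0],
    "quantity": [1, 2, 3, 1, 2],
    "date":     pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03", None]
    ),
})

# Each keyword becomes an output column: name=(source column, function)
summary = orders.groupby("category").agg(
    avg_price=("price", "mean"),
    total_qty=("quantity", "sum"),
    num_orders=("date", "count"),
)
print(summary)
```

The output columns are `avg_price`, `total_qty`, and `num_orders`, so no renaming step is needed afterwards.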
# example 03 · dictionary aggregation
Different aggregations for different columns
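The original code for this example is missing from the page, so what follows is an illustrative sketch with hypothetical data. It also shows a variation not covered in the theory section: mapping one column to a list of functions, which produces two-level column labels.

```python
import pandas as pd

# Hypothetical orders, invented for illustration
orders = pd.DataFrame({
    "category": ["Books", "Books", "Toys", "Toys", "Toys"],
    "price":    [10.0, 20.0, 5.0, 7.0, 9.0],
    "quantity": [1, 2, 3, 1, 2],
})

# The dictionary maps each column to the function applied to it;
# result columns keep the original column names
by_dict = orders.groupby("category").agg({
    "price": "mean",
    "quantity": "sum",
})
print(by_dict)

# A list of functions for one column yields (column, function) labels
multi = orders.groupby("category").agg({
    "price": ["mean", "max"],
    "quantity": "sum",
})
print(multi)
```

With the dictionary form, the result columns are still called `price` and `quantity`, which is the main reason to prefer named aggregations when the output needs self-explanatory labels.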
# challenges [2]
# challenge 01/02 · todo
Group students by both 'grade' and 'subject', count students in each group, and print the result.
# challenge 02/02 · todo
Calculate both the mean and max score by subject using named aggregations.
# project
# project-challenge
thread: Survey Insights Report · reward: 50 xp
# brief
You need to understand programming language trends across regions. Group the survey data by Country and LanguageUsed to count how many respondents use each language in each country.
# task
Language Popularity by Country
# your code