[concept] Data Cleaning
Missing Data
# theory
detecting missing values
In pandas, missing values usually appear as NaN (Not a Number) or None; missing datetimes appear as NaT. isna() detects all of these.
df.isna() # DataFrame of True/False
df.isna().sum() # Count of NaN per column
df.isna().any() # True if column has any NaN
handling missing data
Option 1: Drop rows with missing values (dropna returns a new DataFrame; df is unchanged unless you reassign)
df.dropna() # Drop any row that contains a NaN
df.dropna(subset=["name"]) # Drop a row only if 'name' is NaN
df.dropna(how="all") # Drop a row only if ALL of its values are NaN
df.dropna(thresh=3) # Keep rows with at least 3 non-NaN values
Option 2: Fill missing values
df.fillna(0) # Fill all NaN with 0
df.fillna({"age": 0, "city": "Unknown"}) # Different values per column
df["age"].fillna(df["age"].mean()) # Fill with mean
df.ffill() # Forward fill (use previous value); fillna(method="ffill") is deprecated
df.bfill() # Backward fill (use next value); fillna(method="bfill") is deprecated
checking
# Total missing values
print(df.isna().sum().sum())
# Percentage missing per column
print(df.isna().mean() * 100)
# Rows with any missing values
rows_with_na = df[df.isna().any(axis=1)]
# examples [3]
# example 01 · detecting missing values
Find where NaN values exist
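The example's code is not available on this page, so what follows is a minimal sketch of the detection calls from the theory section. The DataFrame and its column names are hypothetical sample data, not the lesson's own.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not from the lesson).
df = pd.DataFrame({
    "name": ["Ana", "Ben", None],
    "age": [28, np.nan, 35],
    "city": ["Lima", "Oslo", "Pune"],
})

mask = df.isna()               # True/False for every cell
per_column = df.isna().sum()   # number of missing values per column
has_missing = df.isna().any()  # whether each column has at least one gap

print(mask)
print(per_column)
print(has_missing)
```

Both None in "name" and NaN in "age" are flagged by isna(), so the two spellings of "missing" can be treated the same way.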
# example 02 · dropping missing values
Remove rows with NaN
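The example's code is not available on this page. As a rough sketch, the four dropna variants from the theory section can be compared on one small DataFrame; the data below is hypothetical and chosen so that each variant keeps a different set of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not from the lesson).
df = pd.DataFrame({
    "name": ["Ana", None, "Cy", None, None],
    "age": [28, 41, np.nan, np.nan, np.nan],
    "city": ["Lima", "Oslo", "Pune", "Rome", None],
})

complete_rows = df.dropna()                 # drop rows with any missing value
name_required = df.dropna(subset=["name"])  # only 'name' must be present
not_all_empty = df.dropna(how="all")        # drop rows where every value is missing
at_least_two = df.dropna(thresh=2)          # keep rows with 2 or more non-missing values

for label, result in [
    ("dropna()", complete_rows),
    ("subset=['name']", name_required),
    ("how='all'", not_all_empty),
    ("thresh=2", at_least_two),
]:
    print(label, "keeps rows", list(result.index))
```

None of these calls modify df; each returns a new DataFrame, so the original five rows remain available for comparison.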
# example 03 · filling missing values
Replace NaN with meaningful values
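The example's code is not available on this page. The sketch below shows one way the fill strategies from the theory section might be applied; the DataFrame is hypothetical, and ffill()/bfill() are used in place of the older fillna(method=...) spelling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not from the lesson).
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, np.nan],
    "city": ["Lima", None, "Pune", "Oslo"],
})

# A different fill value for each column.
per_column = df.fillna({"age": 0, "city": "Unknown"})

# Fill a numeric column with its mean (NaN is ignored when computing the mean).
mean_filled = df["age"].fillna(df["age"].mean())

# Carry the previous or the next valid value into each gap.
forward = df.ffill()
backward = df.bfill()

print(per_column)
print(mean_filled)
print(forward)
print(backward)
```

Note that bfill() leaves the last "age" value missing because no later value exists to copy from; ffill() has the same limitation for gaps at the start.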
# challenges [2]
# challenge 01/02 · todo
Count the total number of missing values in the students DataFrame and print it.
# challenge 02/02 · todo
Create a DataFrame with some NaN values, then fill all NaN with the string 'MISSING' and print it.
# project
# project-challenge
thread: Survey Insights Report · reward: 50 xp
# brief
You're a tech recruiter analyzing developer survey responses. Before building your insights report, you need to verify data quality by checking for any missing values in the dataset.
# task
Check for Missing Survey Data
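The project dataset is not shown on this page, so the following is only a sketch of what a data-quality check could look like. The `survey` DataFrame and its columns are hypothetical stand-ins; swap in the real survey data when working on the task.

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses (an assumption; the real dataset is not shown).
survey = pd.DataFrame({
    "language": ["Python", "Go", None, "Rust"],
    "years_experience": [3, np.nan, 7, np.nan],
    "remote": [True, False, True, True],
})

total_missing = int(survey.isna().sum().sum())       # all gaps in the table
pct_missing = survey.isna().mean() * 100             # percentage missing per column
incomplete_rows = survey[survey.isna().any(axis=1)]  # rows with at least one gap

print("Total missing values:", total_missing)
print(pct_missing)
print(incomplete_rows)
```

Reporting the percentage per column alongside the raw total makes it easier to judge whether a column is usable or too sparse for the report.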
# your code