
[practice] Data Cleaning

Renaming & Dropping

# theory

renaming

# Rename specific columns
df.rename(columns={"old_name": "new_name"})

# Rename multiple columns
df.rename(columns={
    "firstName": "first_name",
    "lastName": "last_name"
})

# Rename all columns at once
df.columns = ["col1", "col2", "col3"]

# Apply function to all column names
df.columns = df.columns.str.lower()
df.columns = df.columns.str.replace(" ", "_")
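
The same cleanup can also be written with rename(), which accepts a function for columns and applies it to each label. A minimal sketch, assuming a small made-up DataFrame (the data and column names are illustrative only):

```python
import pandas as pd

# Hypothetical data; the column names are made up for illustration.
df = pd.DataFrame({"First Name": ["Ada"], "Last Name": ["Lovelace"], "AGE": [36]})

# rename() calls the function once per column label and returns a new DataFrame.
cleaned = df.rename(columns=lambda name: name.strip().lower().replace(" ", "_"))

print(list(cleaned.columns))
print(list(df.columns))  # the original labels are left untouched
```

Because rename() returns a new DataFrame, this form chains well with other methods, while assigning to df.columns changes the existing DataFrame directly.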

dropping columns

# Drop single column
df.drop(columns=["unwanted"])

# Drop multiple columns
df.drop(columns=["col1", "col2"])

# Drop by position
df.drop(df.columns[0], axis=1)  # First column

dropping rows

# Drop by index
df.drop(index=[0, 1, 2])

# Drop by condition: filter for the rows you want to KEEP
df = df[df["age"] >= 18]  # Drops everyone under 18, keeps only adults

# Drop using drop(): pass the index labels of the rows to remove
df.drop(df[df["score"] < 50].index)
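
The two approaches above can be compared side by side. A minimal sketch, assuming a made-up scores table with a default (unique) index:

```python
import pandas as pd

# Hypothetical scores table, made up for illustration.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "score": [72, 41, 50]})

# Option 1: keep the rows that pass (boolean filtering).
kept = df[df["score"] >= 50]

# Option 2: look up the index labels of the failing rows and drop them.
dropped = df.drop(df[df["score"] < 50].index)

# Both leave the rows for Ann and Cy, with their original index labels 0 and 2.
print(kept.equals(dropped))
```

Note that drop() works by index label, so if the index contains duplicate labels it removes every row sharing a dropped label. Boolean filtering does not have that side effect.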

reordering

# Specify exact order
df = df[["name", "age", "city", "score"]]

# Move specific column to front
cols = ["important_col"] + [c for c in df.columns if c != "important_col"]
df = df[cols]

in-place

By default, rename() and drop() return a new DataFrame and leave the original unchanged. Pass inplace=True to modify the original instead:

df.rename(columns={"old": "new"}, inplace=True)
df.drop(columns=["unwanted"], inplace=True)

Or just reassign: df = df.drop(...)
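
The difference is easy to miss, so here is a minimal sketch of all three cases, assuming a tiny made-up DataFrame:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"old": [1, 2], "unwanted": [3, 4]})

# Without reassignment the result is discarded and df is unchanged.
df.rename(columns={"old": "new"})
before = list(df.columns)
print(before)

# inplace=True modifies df and returns None, so do not assign the result.
result = df.rename(columns={"old": "new"}, inplace=True)
after_inplace = list(df.columns)
print(result, after_inplace)

# Reassignment reaches the same kind of end state without inplace.
df = df.drop(columns=["unwanted"])
print(list(df.columns))
```

A common mistake is writing df = df.rename(..., inplace=True), which replaces df with None.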

# examples [3]

# example 01 · renaming columns

Change column names for clarity
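
A minimal sketch of this example, assuming a small made-up DataFrame (the data and column names are illustrative, not taken from the lesson's dataset):

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "firstName": ["Ana", "Ben"],
    "lastName": ["Diaz", "Okoro"],
    "Score": [88, 92],
})

# Rename two specific columns; columns not in the mapping are left as they are.
renamed = df.rename(columns={"firstName": "first_name", "lastName": "last_name"})

# Then lowercase every column name in one step.
renamed.columns = renamed.columns.str.lower()

print(renamed.columns.tolist())
```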

# example 02 · dropping columns

Remove columns you don't need
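
A minimal sketch of this example, assuming a small made-up DataFrame (the data and column names are illustrative, not taken from the lesson's dataset):

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["Ana", "Ben"],
    "notes": ["", "n/a"],
    "temp": [0, 0],
})

# Drop two columns at once; the result is a new DataFrame.
slim = df.drop(columns=["notes", "temp"])
print(slim.columns.tolist())

# errors="ignore" skips labels that are not present instead of raising KeyError.
safe = df.drop(columns=["notes", "missing_col"], errors="ignore")
print(safe.columns.tolist())
```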

# example 03 · reordering columns

Change the order columns appear
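
A minimal sketch of this example, assuming a small made-up DataFrame (the data and column names are illustrative, not taken from the lesson's dataset):

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "score": [88, 92],
    "city": ["Lima", "Oslo"],
    "name": ["Ana", "Ben"],
    "age": [31, 27],
})

# Selecting with a list of names returns the columns in exactly that order.
ordered = df[["name", "age", "city", "score"]]
print(ordered.columns.tolist())

# Move one column to the front and keep the rest in their current order.
front = ["score"] + [c for c in ordered.columns if c != "score"]
moved = ordered[front]
print(moved.columns.tolist())
```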


# challenges [2]

# challenge 01/02 · todo
Rename the 'score' column to 'test_score' in students and print the column names.
# challenge 02/02 · todo
Drop the 'date' column from sales and print the remaining columns.

# project

# project-challenge

thread: Survey Insights Report · reward: 50 xp

# brief

Your data pipeline requires snake_case column names. Rename the survey columns from PascalCase (like YearsExperience) to snake_case (years_experience) so they match your database schema.

# task

Rename Columns to Snake Case

# your code