pyodide: loading…

[concept]Data Manipulation (WCTC)

Dates and Times

# theory

pd.to_datetime

Pandas doesn't automatically recognize dates. You gotta tell it.

df["date"] = pd.to_datetime(df["date"])

It's pretty smart about formats; it can figure out "2024-01-15", "01/15/2024", "Jan 15, 2024" and more. But if it's something weird, specify the format:

# For dates like "15-01-2024"
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")

the .dt accessor

Just like .str for strings, .dt gives you access to date parts.

df["date"].dt.year       # 2024
df["date"].dt.month      # 1-12
df["date"].dt.day        # 1-31
df["date"].dt.weekday    # 0=Monday, 6=Sunday
df["date"].dt.day_name() # "Monday", "Tuesday", etc
df["date"].dt.month_name() # "January", "February", etc

dt.weekday returns 0 for Monday, which is the part that's easy to forget.

date arithmetic

You can do math with dates. It's actually pretty intuitive.

from datetime import timedelta

# Add 7 days
df["date"] + timedelta(days=7)

# Days between two dates
df["end_date"] - df["start_date"]

# This gives you a Timedelta, get the days with .dt.days
(df["end_date"] - df["start_date"]).dt.days

Timedelta offsets

from datetime import timedelta

timedelta(days=30)
timedelta(hours=2)
timedelta(weeks=1)
timedelta(days=1, hours=12)  # 1.5 days

Or use pandas offsets for business logic:

import pandas as pd

pd.DateOffset(months=1)   # add exactly 1 month
pd.DateOffset(years=1)    # add exactly 1 year

strftime

When you need dates as strings in a specific format:

df["date"].dt.strftime("%Y-%m-%d")    # 2024-01-15
df["date"].dt.strftime("%m/%d/%Y")    # 01/15/2024
df["date"].dt.strftime("%B %d, %Y")   # January 15, 2024
df["date"].dt.strftime("%A")          # Monday

Common format codes:

  • %Y; 4-digit year
  • %m; 2-digit month
  • %d; 2-digit day
  • %B; full month name
  • %A; full day name
  • %H:%M:%S; time

filtering by date

# Filter to a specific month
jan_data = df[df["date"].dt.month == 1]

# Filter to a date range
mask = (df["date"] >= "2024-01-01") & (df["date"] < "2024-02-01")
jan_data = df[mask]

# Filter to weekdays only
weekdays = df[df["date"].dt.weekday < 5]

# examples [3]

# example 01 · extracting date components

Pull out year, month, day, weekday from dates

1
2
3
4
5
6
7
8
9
10
11
12
13
14
🐍
Loading PythonSetting up pandas & numpy...
# example 02 · date arithmetic

Calculate differences between dates and add offsets

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
🐍
Loading PythonSetting up pandas & numpy...
# example 03 · formatting for display

Convert dates to nice readable strings

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
🐍
Loading PythonSetting up pandas & numpy...

# challenges [2]

# challenge 01/02todo
Convert the sales 'date' column to datetime, extract the month, and print the unique months.
1
2
3
4
5
6
7
8
9
10
11
12
🐍
Loading PythonSetting up pandas & numpy...
# challenge 02/02todo
Add a new column 'day_name' to sales showing the day of week (Monday, Tuesday, etc.) and print the first 3 rows.
1
2
3
4
5
6
7
8
9
10
11
12
🐍
Loading PythonSetting up pandas & numpy...