
[concept]Grouping & Combining

Fixed-Width Files

# theory

fixed-width files

Unlike CSVs (comma-separated), fixed-width files use character positions. Each field occupies a set number of characters:

NAMEAGECITY
Alice 25NYC
Bob   30LA
Carol 28CHI

Counting from 0, Name occupies characters 0-5, Age occupies 6-7, and City occupies 8-11 (all ranges inclusive).
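To make the position arithmetic concrete, here is a minimal sketch that cuts one record apart with plain Python slicing, with no pandas involved. The variable names are illustrative only. Note that a Python slice excludes its end index, so the inclusive range 0-5 becomes `[0:6]`.

```python
# One record from the sample above: name (6 chars), age (2 chars), city (up to 4 chars)
line = "Alice 25NYC"

name = line[0:6].strip()   # characters 0-5, padding removed
age = int(line[6:8])       # characters 6-7
city = line[8:12].strip()  # characters 8-11 (slicing past the end of the string is safe)

print(name, age, city)
```

The same start and end numbers are what `colspecs` expects later on.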

pd.read_fwf()

import pandas as pd

# Auto-detect column boundaries (relies on whitespace gaps between columns, so it only sometimes works)
df = pd.read_fwf("data.txt")

# Specify column positions manually as (start, end) pairs
df = pd.read_fwf("data.txt",
    colspecs=[(0, 6), (6, 8), (8, 12)],
    names=["name", "age", "city"])

# Use widths instead (simpler when the fields are contiguous, with no gaps between them)
df = pd.read_fwf("data.txt",
    widths=[6, 2, 4],
    names=["name", "age", "city"])

colspecs

List of (start, end) tuples. Character positions are 0-indexed, and each pair is half-open: start is included, end is excluded.

colspecs = [
    (0, 10),    # Characters 0-9 (10 chars)
    (10, 15),   # Characters 10-14 (5 chars)
    (15, 25)    # Characters 15-24 (10 chars)
]
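When the fields sit back to back, the same `colspecs` can be derived from a list of widths by taking running totals. A small sketch, where `widths_to_colspecs` is a hypothetical helper written for this lesson and not part of pandas:

```python
from itertools import accumulate

def widths_to_colspecs(widths):
    """Convert contiguous field widths into half-open (start, end) pairs."""
    ends = list(accumulate(widths))   # running totals give each field's end position
    starts = [0] + ends[:-1]          # each field starts where the previous one ended
    return list(zip(starts, ends))

colspecs = widths_to_colspecs([10, 5, 10])
print(colspecs)  # [(0, 10), (10, 15), (15, 25)]
```

This is the relationship that makes `widths=[10, 5, 10]` and the `colspecs` list above interchangeable for contiguous fields.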

common parameters

pd.read_fwf(filepath,
    colspecs=[(0, 5), (5, 10)],  # Column positions as (start, end)
    # widths=[5, 5],             # Alternative to colspecs; passing both raises ValueError
    names=["col1", "col2"],      # Column names
    skiprows=1,                  # Skip the first line (e.g. a header row)
    na_values=["", "  "],        # Treat these strings as missing
    dtype={"col1": str}          # Force data types (str keeps leading zeros)
)
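As a rough illustration of how these parameters combine, the sketch below reads a small in-memory sample. The data, the column names, and the `UNK` missing-value marker are all made up for this example.

```python
import io
import pandas as pd

# Hypothetical sample: a 5-character zip code followed by a city of up to 6 characters
raw = (
    "ZIP  CITY\n"
    "02134Boston\n"
    "10001UNK\n"
    "73301Austin\n"
)

df = pd.read_fwf(
    io.StringIO(raw),
    colspecs=[(0, 5), (5, 11)],
    names=["zip", "city"],
    skiprows=1,            # drop the header line, since names are supplied explicitly
    na_values=["UNK"],     # assumed marker for a missing city
    dtype={"zip": str},    # keep zip codes as text so "02134" keeps its leading zero
)
print(df)
```

Without the `dtype` entry, pandas would parse the zip column as integers and the leading zero in `02134` would be lost.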

# examples [3]

# example 01 · reading fixed-width data

Parse data with specific column positions
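A minimal sketch of what such an example could look like, assuming the three-column layout from the theory section and reading from an in-memory string instead of a file:

```python
import io
import pandas as pd

# Sample records: name in positions 0-5, age in 6-7, city in 8-11
raw = (
    "Alice 25NYC\n"
    "Bob   30LA\n"
    "Carol 28CHI\n"
)

df = pd.read_fwf(
    io.StringIO(raw),
    colspecs=[(0, 6), (6, 8), (8, 12)],  # half-open (start, end) pairs
    names=["name", "age", "city"],       # supplying names means no header row is expected
)
print(df)
```

Padding spaces are stripped from each field, and the age column is parsed as integers.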

# example 02 · using widths instead of colspecs

Simpler syntax when you know column widths
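One way this might look, using the same made-up sample records as the theory section. Because the three fields are contiguous, `widths=[6, 2, 4]` describes the same layout as `colspecs=[(0, 6), (6, 8), (8, 12)]`.

```python
import io
import pandas as pd

# Sample records: a 6-character name, a 2-character age, a city of up to 4 characters
raw = (
    "Alice 25NYC\n"
    "Bob   30LA\n"
    "Carol 28CHI\n"
)

df = pd.read_fwf(
    io.StringIO(raw),
    widths=[6, 2, 4],               # field lengths, read left to right with no gaps
    names=["name", "age", "city"],
)
print(df)
```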

# example 03 · auto-detect columns

Let pandas figure out the columns
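A sketch of auto-detection on made-up data. It assumes every column is separated from its neighbours by at least one space in every row, which is the situation where inference tends to succeed; fields that touch, such as `25NYC`, would be read as a single column.

```python
import io
import pandas as pd

# Columns are separated by whitespace, and the first line is a header
raw = (
    "name   age  city\n"
    "Alice   25  NYC\n"
    "Bob     30  LA\n"
    "Carol   28  CHI\n"
)

# No colspecs or widths: pandas infers the boundaries from the whitespace gaps
df = pd.read_fwf(io.StringIO(raw))
print(df)
```

Since no `names` are given, the first line is used as the header.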


# challenges [2]

# challenge 01/02
Read this fixed-width string, where Product uses colspec (0, 10), Price uses (10, 16), and Qty uses (16, 20); each end position is exclusive. Print the result.
# challenge 02/02
Read fixed-width data using widths=[5, 3, 8] for ID, Age, and City columns.

# project

# project-challenge

thread: Survey Insights Report · reward: 50 xp

# brief

Your company's legacy HR system exports employee data in fixed-width format. Parse this sample record to integrate historical employee data with your modern survey analysis.

# task

Parse Legacy HR System Export

# your code