Pandas

Data Structures in Pandas

1. Series

A one-dimensional labeled array (like a column in Excel).

import pandas as pd

# Create Series from list
temps = [98.6, 99.1, 97.9]
patients = pd.Series(temps, index=['Alice', 'Bob', 'Charlie'])
print(patients)

Key Points:

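Fast for single-column operations, because arithmetic and comparisons apply to the whole array at once
Used when you need simple labeled data
Behaves much like a Python dictionary (values are looked up by index label), with vectorized arithmetic and filtering on top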
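
The dictionary-like behavior and the whole-array operations mentioned above can be sketched as follows. This is a minimal illustration, not a complete tour of the Series API; the 99.0 threshold is an arbitrary value chosen for the example.

```python
import pandas as pd

temps = pd.Series([98.6, 99.1, 97.9], index=['Alice', 'Bob', 'Charlie'])

# Dictionary-style access: look up a value by its index label
bob_temp = temps['Bob']

# Membership tests check the index labels, as with dictionary keys
has_alice = 'Alice' in temps

# Vectorized arithmetic: convert every reading from Fahrenheit to Celsius at once
celsius = ((temps - 32) * 5 / 9).round(1)

# Boolean filtering keeps each label attached to its value
# (99.0 is an arbitrary threshold for illustration)
above_threshold = temps[temps > 99.0]

print(celsius)
print(above_threshold)
```

Unlike a plain dictionary, the Series keeps its values in a single typed array, which is what makes the arithmetic and filtering lines possible without an explicit loop.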

2. DataFrame

A two-dimensional table (like a whole Excel sheet).

# Create DataFrame from dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Temperature': [98.6, 99.1, 97.9],
    'Sick': [False, True, False]
}

df = pd.DataFrame(data)
print(df)

Key Points:

Most commonly used pandas object
Fast for operations on whole columns; processing one row at a time is much slower
The default choice for most tabular data tasks in Python
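
As a rough sketch of what "column operations" means in practice, the example below rebuilds the same small table and derives a new column and a filtered list in single expressions. The column name Temperature_C is hypothetical, introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Temperature': [98.6, 99.1, 97.9],
    'Sick': [False, True, False],
})

# Column-wise operation: one expression computes a value for every row
# ('Temperature_C' is a hypothetical column name used for illustration)
df['Temperature_C'] = ((df['Temperature'] - 32) * 5 / 9).round(1)

# A boolean column can be used directly as a row filter
sick_names = df.loc[df['Sick'], 'Name'].tolist()

# Each column has its own data type
print(df.dtypes)
print(sick_names)
```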

Creation Methods

You can create a Series or a DataFrame from:

Python lists/dictionaries
NumPy arrays
CSV/Excel files
SQL databases

# From CSV (very common)
df = pd.read_csv('data.csv')

# From list of lists
data = [[1, 'A'], [2, 'B']]
df = pd.DataFrame(data, columns=['Number', 'Letter'])
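
The other sources listed above follow the same pattern. The sketch below covers a NumPy array and a SQL query; it assumes an in-memory SQLite database as a stand-in for a real one, and the table name readings and its columns are made up for the example.

```python
import sqlite3

import numpy as np
import pandas as pd

# From a NumPy array: column names are supplied separately
arr = np.array([[1, 2], [3, 4]])
df_np = pd.DataFrame(arr, columns=['a', 'b'])

# From a SQL database: an in-memory SQLite database stands in for a real one
# (the 'readings' table and its columns are hypothetical)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE readings (name TEXT, temperature REAL)')
conn.executemany(
    'INSERT INTO readings VALUES (?, ?)',
    [('Alice', 98.6), ('Bob', 99.1)],
)
df_sql = pd.read_sql_query('SELECT * FROM readings', conn)
conn.close()

print(df_np)
print(df_sql)
```

Excel files are read in a similar way with pd.read_excel, which needs an additional engine package such as openpyxl to be installed.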

Speed Tips

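Vectorized operations (whole-column arithmetic, comparisons, and built-in methods) are fastest
Avoid looping row-by-row, for example with iterrows()
Use .apply() only when no vectorized equivalent exists; it still calls a Python function once per element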
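
The three approaches can be compared side by side. This is an illustrative sketch on a tiny table, so it shows the pattern rather than the speed difference; the 99.0 threshold and the describe helper are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Temperature': [98.6, 99.1, 97.9]})

# Slow pattern: a Python-level loop that visits one row at a time
looped = []
for _, row in df.iterrows():
    looped.append(bool(row['Temperature'] > 99.0))

# Preferred: one vectorized comparison over the whole column
vectorized = df['Temperature'] > 99.0

# Fallback: .apply() with a custom function when no vectorized form fits
# ('describe' is a hypothetical helper for illustration)
def describe(temp):
    return 'above' if temp > 99.0 else 'at or below'

labels = df['Temperature'].apply(describe)

print(vectorized.tolist())
print(labels.tolist())
```

Both the loop and the vectorized comparison produce the same booleans; the difference is that the vectorized version does the work inside pandas instead of in the Python interpreter.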

When to Use

Series: A single variable, such as one column of measurements or a time series
DataFrame: Most real-world data with several columns (tables, spreadsheets)
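
A short sketch of both cases, assuming made-up dates and readings: a Series holds one measurement tracked over time, and it can be promoted to a DataFrame once more columns are needed.

```python
import pandas as pd

# Series: one measurement tracked over time
# (the dates and readings are invented for illustration)
dates = pd.date_range('2024-01-01', periods=3, freq='D')
daily_temps = pd.Series([98.6, 99.4, 98.9], index=dates)

# The label of the highest reading is a timestamp
hottest_day = daily_temps.idxmax()

# Promote the Series to a DataFrame when more columns are needed
df = daily_temps.to_frame(name='Temperature')
df['Sick'] = df['Temperature'] > 99.0  # arbitrary threshold for illustration

print(hottest_day)
print(df)
```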
