Pivot Tables

Pivot tables summarize and reshape data by grouping rows on one or more keys and aggregating the values in each group. In pandas, the pivot_table() function provides this functionality.

Creating a Simple Pivot Table

To build a pivot table from a DataFrame, specify which column supplies the row labels (index), which supplies the column labels (columns), and which holds the values to aggregate (values).

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                 'B': ['one', 'one', 'two', 'two', 'one', 'one'],
                 'C': np.random.randn(6),
                 'D': np.random.randn(6)})

# Create a pivot table with 'A' as index, 'B' as columns, and 'C' as values
pivot = pd.pivot_table(df, values='C', index=['A'], columns=['B'])
print(pivot)
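
Because the example above uses random numbers, its output changes on every run. The sketch below repeats the same call on fixed, made-up values (the numbers in column 'C' are illustrative assumptions, not data from this page) so that the effect of grouping is easy to check by hand: rows that share the same 'A' and 'B' labels are combined into a single cell.

```python
import pandas as pd

# Fixed, illustrative values so the result can be verified by hand.
df_fixed = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
})

# Same call as above: 'A' becomes the row labels, 'B' the column labels.
pivot_fixed = pd.pivot_table(df_fixed, values='C', index=['A'], columns=['B'])
print(pivot_fixed)

# Two rows have A='foo' and B='one' (C = 1.0 and 3.0); the cell holds their mean.
print(pivot_fixed.loc['foo', 'one'])  # 2.0
```

Each cell holds the aggregate of all source rows with that label pair, so a cell backed by a single row simply shows that row's value.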

Aggregate Functions

By default, pivot_table() aggregates with the mean. Use the aggfunc parameter to choose a different function, or pass a list of functions to apply several at once. Commonly used function names include:

Function	Description
`'mean'`	Average of values
`'sum'`	Sum of values
`'count'`	Count of values
`'min'`	Minimum value
`'max'`	Maximum value
`'std'`	Standard deviation
`'var'`	Variance

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                 'B': ['one', 'one', 'two', 'two', 'one', 'one'],
                 'C': np.random.randn(6),
                 'D': np.random.randn(6)})

# Create a pivot table with sum as the aggregation function
pivot_sum = pd.pivot_table(df, values='C', index=['A'], columns=['B'], aggfunc='sum')
print("Pivot with sum:\n", pivot_sum, sep="")  # sep="" keeps the header row aligned

# Use multiple aggregation functions
pivot_multi = pd.pivot_table(df, values='D', index=['A', 'B'], aggfunc=['mean', 'sum'])
print("\nPivot with multiple functions:\n", pivot_multi, sep="")  # sep="" keeps the header row aligned
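
To make the shape of a multi-function result easier to see, here is a minimal sketch on fixed, made-up values (the numbers are assumptions chosen for easy arithmetic). It also shows the dictionary form of aggfunc, which this page does not otherwise cover: a dict maps each value column to its own function.

```python
import pandas as pd

# Fixed, illustrative values so the results can be verified by hand.
df_agg = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
    'D': [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
})

# A list of functions produces one column per function for each value column.
multi = pd.pivot_table(df_agg, values='D', index=['A', 'B'], aggfunc=['mean', 'sum'])
print(multi)

# Columns are keyed by function name first, then by value column.
print(multi['sum']['D'].loc[('foo', 'one')])   # 2.0 + 4.0 = 6.0
print(multi['mean']['D'].loc[('foo', 'one')])  # (2.0 + 4.0) / 2 = 3.0

# A dict applies a different function to each value column.
per_column = pd.pivot_table(df_agg, values=['C', 'D'], index=['A'],
                            aggfunc={'C': 'sum', 'D': 'max'})
print(per_column)
```

The list form is convenient for comparing several statistics of the same column; the dict form suits cases where each column calls for a different summary.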

Handling Missing Values

When an index/column combination has no rows in the source data, the corresponding cell in the pivot table is NaN. Use the fill_value parameter to replace those missing values in the result.

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar'],
                 'B': ['one', 'two', 'two', 'one'],
                 'C': np.random.randn(4),
                 'D': np.random.randn(4)})

# Pivot table will have NaN where combinations don't exist
pivot = pd.pivot_table(df, values='C', index=['A'], columns=['B'])
print("Pivot with NaN:\n", pivot, sep="")  # sep="" keeps the header row aligned

# Fill missing values with 0
pivot_filled = pd.pivot_table(df, values='C', index=['A'], columns=['B'], fill_value=0)
print("\nPivot with filled values:\n", pivot_filled, sep="")  # sep="" keeps the header row aligned
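
As a quick check of which cells fill_value actually touches, the following sketch uses fixed, made-up values (an assumption for illustration) and inspects the one combination that never occurs in the data, A='bar' with B='two'.

```python
import pandas as pd

# Fixed, illustrative values; no row has A='bar' together with B='two'.
df_gap = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar'],
    'B': ['one', 'two', 'two', 'one'],
    'C': [1.5, 2.0, 4.0, 8.5],
})

with_nan = pd.pivot_table(df_gap, values='C', index=['A'], columns=['B'])
filled = pd.pivot_table(df_gap, values='C', index=['A'], columns=['B'], fill_value=0)

# The missing combination is NaN without fill_value and 0 with it.
print(pd.isna(with_nan.loc['bar', 'two']))  # True
print(filled.loc['bar', 'two'] == 0)        # True

# Cells backed by real data are unchanged by fill_value.
print(filled.loc['foo', 'two'])  # mean of 2.0 and 4.0 = 3.0
```

Note that a filled 0 is indistinguishable from a genuine aggregate of 0, so choose a fill value that cannot be confused with real results when that distinction matters.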
