Applying Functions

Pandas provides flexible ways to apply functions to your data. The most common methods are apply(), map(), and applymap().

map() Method

The map() method is used on a Series to replace each value with another value, either looked up in a dictionary (or another Series) or computed by a function. With a dictionary, any value that has no matching key becomes NaN.

import pandas as pd
import numpy as np

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
print("Original Series:")
print(s)

# Map values using a dictionary; 'rabbit' has no key, so it becomes NaN
print("\nMapped Series:")
print(s.map({'cat': 'kitten', 'dog': 'puppy'}))
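
When unmatched values should be kept rather than turned into NaN, one possible approach is to fill the gaps from the original Series. The following is a minimal sketch; the names mapping, mapped, and mapped_with_fallback are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
mapping = {'cat': 'kitten', 'dog': 'puppy'}

# Values missing from the dictionary ('rabbit') come back as NaN
mapped = s.map(mapping)

# Fall back to the original value wherever the mapping produced NaN;
# the entry that was NaN to begin with stays NaN
mapped_with_fallback = mapped.fillna(s)

print(mapped_with_fallback)
```

Note that this fallback cannot distinguish a key that is deliberately mapped to NaN from a missing key.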

You can also pass a function to map(); it is called once for each element.

import pandas as pd

s = pd.Series([1, 2, 3, 4])
print("Original Series:")
print(s)

# Map values using a function
print("\nMapped Series with a function:")
print(s.map(lambda x: x * 2))
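
If the Series contains missing values, a function that expects a string or a number may fail on NaN. As a sketch of one way to handle this, the na_action='ignore' argument of Series.map() skips NaN entries and leaves them as NaN; the Series below reuses the animal data from the first example.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])

# Without na_action='ignore', x.upper() would be called on NaN (a float)
# and raise AttributeError; with it, NaN entries are passed through untouched
upper = s.map(lambda x: x.upper(), na_action='ignore')

print(upper)
```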

apply() Method

The apply() method works on both Series and DataFrames. When used on a DataFrame, it applies a function along an axis: with the default axis=0 the function receives each column as a Series, and with axis=1 it receives each row.

Applying to Columns

Here is an example of applying a function to each column.

import pandas as pd
import numpy as np

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
print("Original DataFrame:")
print(df)

# Apply numpy's sqrt function to each column
print("\nDataFrame after applying sqrt:")
print(df.apply(np.sqrt))
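
The function passed to apply() can also reduce each column to a single value, in which case the result is a Series indexed by the column names. The sketch below uses a hypothetical helper, value_range, and made-up data to illustrate this.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 9], 'B': [10, 20, 40]})

def value_range(col):
    """Return the spread (max minus min) of one column."""
    return col.max() - col.min()

# Each column is passed to value_range as a Series (default axis=0)
ranges = df.apply(value_range)

print(ranges)
```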

Applying to Rows

You can also apply a function to each row by specifying axis=1. Each row is passed to the function as a Series indexed by the column names.

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
print("Original DataFrame:")
print(df)

# Get the sum of each row
print("\nSum of each row:")
print(df.apply(np.sum, axis=1))
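
Because each row arrives as a Series, the function can refer to individual columns by label. The following sketch combines columns with an arbitrary formula chosen only for illustration, and compares the result with the equivalent column arithmetic, which is usually faster on large DataFrames.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])

# Row-wise: the lambda is called once per row
weighted = df.apply(lambda row: row['A'] + 2 * row['B'] - row['C'], axis=1)

# Same computation expressed directly on whole columns
vectorized = df['A'] + 2 * df['B'] - df['C']

print(weighted)
print(vectorized)
```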

Applying to Series

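When used on a Series, apply() calls the function on each element. Extra positional and keyword arguments can be forwarded to the function through args and **kwargs.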

import pandas as pd

s = pd.Series([1, 2, 3, 4])
print("Original Series:")
print(s)

# Apply a function to each element
print("\nApplied Series:")
print(s.apply(lambda x: x**2 + 2))
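
Series.apply() can forward additional arguments to the function, which avoids wrapping it in a lambda. This is a small sketch; scale_and_shift and its parameters are hypothetical names introduced for the example.

```python
import pandas as pd

def scale_and_shift(x, factor, offset=0):
    """Multiply one element by factor, then add offset."""
    return x * factor + offset

s = pd.Series([1, 2, 3, 4])

# args supplies extra positional arguments; other keywords are passed through
result = s.apply(scale_and_shift, args=(10,), offset=1)

print(result)
```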

applymap() Method

The applymap() method applies a function to each individual element in a DataFrame. This is useful for element-wise transformations. Since pandas 2.1, applymap() is deprecated in favor of DataFrame.map(), which does the same thing.

import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
print("Original DataFrame:")
print(df)

# Format each value to two decimal places
print("\nFormatted DataFrame:")
print(df.applymap(lambda x: f"{x:.2f}"))
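
For code that has to run on pandas versions both before and after 2.1, where DataFrame.map() was introduced as the replacement for applymap(), one possible sketch is to pick whichever method is available. The name elementwise is illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

# Prefer DataFrame.map (pandas 2.1+); fall back to applymap on older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# Format each value to two decimal places, as in the example above
formatted = elementwise(lambda x: f"{x:.2f}")

print(formatted)
```

The result holds strings rather than numbers, so this kind of formatting is best left as a final display step.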

Comparison of Methods

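Method     | Works on                                 | Applies to                                      | Use case
-----------|------------------------------------------|-------------------------------------------------|-----------------------------------
map()      | Series (also DataFrame since pandas 2.1) | Each element                                    | Value substitution
apply()    | Series & DataFrame                       | Each row/column (DataFrame) or element (Series) | Column-wise or row-wise operations
applymap() | DataFrame only                           | Each element                                    | Element-wise transformations (deprecated since pandas 2.1)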