MLP FU
Pandas

Basic Operations in Pandas (DataFrames)
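
The snippets in these notes assume a DataFrame named df already exists. A minimal sketch of one is shown here; the names and values are hypothetical and chosen only so that the columns used in the notes ('Name', 'Temperature', 'Sick') are present.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Temperature': [98.6, 101.2, 97.9],
    'Sick': [False, True, False],
})

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # data type of each column
```

Three rows are used so that the three-element 'Age' list added in section 3 lines up with the data.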

1. Viewing Data

# First 5 rows
df.head()

# Last 3 rows  
df.tail(3)

# Column names
df.columns

# Summary statistics for numeric columns (count, mean, std, min, quartiles, max)
df.describe()

2. Selecting Data

# Single column (returns Series)
df['Name']

# Multiple columns (returns DataFrame)
df[['Name', 'Temperature']]

# Rows by integer position (0-based)
df.iloc[0]  # First row
df.iloc[1:3]  # Second and third rows (the end position is excluded)

# Rows by condition
df[df['Temperature'] > 98]

3. Adding/Modifying Data

# Add new column (the list length must match the number of rows)
df['Age'] = [25, 30, 28]

# Modify column
df['Temperature'] = df['Temperature'] + 1

# Rename columns
df.rename(columns={'Temperature': 'Temp'}, inplace=True)

4. Filtering Data

# Single condition
healthy = df[df['Sick'] == False]

# Multiple conditions
fever = df[(df['Temp'] > 98.5) & (df['Age'] < 30)]
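
The filters above rely on boolean masks. The sketch below, using hypothetical sample values, shows how masks can be named and combined; the variable names (is_hot, is_young, and so on) are illustrative and not part of the original notes.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Temp': [98.6, 101.2, 99.1],
    'Age': [25, 30, 28],
    'Sick': [False, True, True],
})

# Each comparison yields a boolean Series (a mask), one value per row.
is_hot = df['Temp'] > 98.5
is_young = df['Age'] < 30

# & means "and", | means "or", ~ means "not".
# Inline comparisons need parentheses because & and | bind more
# tightly than > and <.
hot_and_young = df[is_hot & is_young]
hot_or_young = df[(df['Temp'] > 98.5) | (df['Age'] < 30)]
not_sick = df[~df['Sick']]

print(hot_and_young['Name'].tolist())
```

Python's own and/or keywords do not work element-wise on a Series, which is why the operators & and | are used instead.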

5. Sorting

# Sort by column, highest value first (returns a new DataFrame)
df.sort_values('Temp', ascending=False)

# Sort by multiple columns
df.sort_values(['Sick', 'Temp'], ascending=[True, False])

6. Grouping Data

# Group and calculate
df.groupby('Sick')['Temp'].mean()

# Multiple aggregations
df.groupby('Sick').agg({'Temp': ['mean', 'max'], 'Age': 'count'})
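
A runnable sketch of the two grouping calls above, on hypothetical sample values. It also shows named aggregation, an alternative form of agg that is not in the original notes and that produces flat column names; the output names mean_temp, max_temp and n are made up for this example.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    'Temp': [98.6, 101.2, 99.8, 98.2],
    'Age': [25, 30, 28, 41],
    'Sick': [False, True, True, False],
})

# One value per group: the mean Temp for Sick == False and Sick == True.
mean_temp = df.groupby('Sick')['Temp'].mean()

# The dict form gives two-level (MultiIndex) columns such as ('Temp', 'mean').
summary = df.groupby('Sick').agg({'Temp': ['mean', 'max'], 'Age': 'count'})

# Named aggregation: new_column=(source_column, function).
flat = df.groupby('Sick').agg(
    mean_temp=('Temp', 'mean'),
    max_temp=('Temp', 'max'),
    n=('Age', 'count'),
)

print(mean_temp)
print(flat)
```

The flat form is often easier to work with afterwards, because each result column can be selected with a single name.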

7. Handling Missing Data

# Check for missing values (returns a DataFrame of True/False flags)
df.isnull()

# Drop rows with missing values
df.dropna()

# Fill missing values
df.fillna(0)
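
A short sketch of the three calls above on hypothetical data with one missing value. Filling with the column mean instead of 0 is an assumption made for this example, not a recommendation from the notes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing temperature.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Temp': [98.6, np.nan, 99.1],
})

# Summing the True/False flags counts the missing values per column.
missing_per_column = df.isnull().sum()

# New DataFrame without the row that contains NaN.
dropped = df.dropna()

# New DataFrame with NaN in 'Temp' replaced by that column's mean.
filled = df.fillna({'Temp': df['Temp'].mean()})

# df itself still holds the NaN: none of the calls above modify it.
print(missing_per_column)
```

Passing a dict to fillna limits the fill to the listed columns, which avoids writing a number into a text column such as 'Name'.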

Key Tips:

  • Most operations return a new DataFrame and leave the original unchanged; assign the result back (df = df.dropna()) or pass inplace=True where the method supports it
  • Chaining works: df.sort_values('Temp').head()
  • Use .copy() on a selection before modifying it when the original data must stay unchanged
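
The tips above can be checked with a small sketch on hypothetical sample values; the variable names top and subset are illustrative only.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Temp': [98.6, 101.2, 97.9],
})

# Chaining: sort_values returns a new DataFrame, and head is called on it.
# df keeps its original row order.
top = df.sort_values('Temp', ascending=False).head(2)

# Take an explicit copy of a selection before modifying it,
# so the changes cannot affect df.
subset = df[df['Temp'] > 98].copy()
subset['Temp'] = subset['Temp'] + 1

print(top)
print(df)  # unchanged
```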