Pandas
Basic Operations in Pandas (DataFrames)
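The snippets in this guide assume that a DataFrame named df already exists. A minimal sketch of one is shown here; the sample values are hypothetical and chosen only so that the columns used later (Name, Temperature, Sick) are present, with three rows to match the three-element Age list added in section 3.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative, not from a real dataset.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Temperature': [98.6, 99.1, 97.9],
    'Sick': [False, True, False],
})

print(df.shape)  # (3, 3): three rows, three columns
```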
1. Viewing Data
# First 5 rows
df.head()
# Last 3 rows
df.tail(3)
# Column names
df.columns
# Basic statistics
df.describe()
2. Selecting Data
# Single column (returns Series)
df['Name']
# Multiple columns (returns DataFrame)
df[['Name', 'Temperature']]
# Rows by integer position
df.iloc[0] # First row
df.iloc[1:3] # Second and third rows (the end position is exclusive)
# Rows by condition
df[df['Temperature'] > 98]
3. Adding/Modifying Data
# Add new column (the list length must match the number of rows)
df['Age'] = [25, 30, 28]
# Modify column
df['Temperature'] = df['Temperature'] + 1
# Rename columns
df.rename(columns={'Temperature': 'Temp'}, inplace=True)
4. Filtering Data
# Single condition
healthy = df[df['Sick'] == False]
# Multiple conditions
fever = df[(df['Temp'] > 98.5) & (df['Age'] < 30)]
5. Sorting
# Sort by column
df.sort_values('Temp', ascending=False)
# Sort by multiple columns
df.sort_values(['Sick', 'Temp'], ascending=[True, False])
6. Grouping Data
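As a sketch of what a grouped aggregation returns, consider a hypothetical five-row DataFrame (the name patients and all values are invented for illustration). groupby splits the rows by each distinct value of Sick, and the aggregation is then computed once per group.

```python
import pandas as pd

# Hypothetical data, invented for this example.
patients = pd.DataFrame({
    'Sick': [False, True, False, True, False],
    'Temp': [98.0, 100.0, 98.4, 101.0, 98.2],
    'Age': [25, 30, 28, 41, 35],
})

# One mean per group; the result is a Series indexed by the group key.
mean_temp = patients.groupby('Sick')['Temp'].mean()

# Named aggregation gives flat, readable column names in the result.
summary = patients.groupby('Sick').agg(
    mean_temp=('Temp', 'mean'),
    max_temp=('Temp', 'max'),
    n_patients=('Age', 'count'),
)
print(summary)
```

The dict form of agg used in this section produces two-level column labels such as ('Temp', 'mean'). Named aggregation is one alternative when flat column names are easier to work with.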
# Group and calculate
df.groupby('Sick')['Temp'].mean()
# Multiple aggregations
df.groupby('Sick').agg({'Temp': ['mean', 'max'], 'Age': 'count'})
7. Handling Missing Data
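To make the effect of each call in this section concrete, here is a small sketch on a hypothetical DataFrame that contains gaps (the name readings and all values are invented for illustration). None of these calls changes the DataFrame it is called on; each returns a new object.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing entries.
readings = pd.DataFrame({
    'Temp': [98.6, np.nan, 99.4],
    'Age': [25, 30, np.nan],
})

# isnull() returns a boolean mask; summing it counts the gaps per column.
missing_per_column = readings.isnull().sum()

# dropna() keeps only the rows that have no missing values at all.
complete_rows = readings.dropna()

# fillna() can take a per-column mapping instead of a single constant.
filled = readings.fillna({'Temp': readings['Temp'].mean(), 'Age': 0})
```

Filling a temperature with 0 would distort later statistics, so the column mean is used here as one possible choice; the right fill value depends on the data.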
# Check for missing values
df.isnull()
# Drop rows with missing values
df.dropna()
# Fill missing values
df.fillna(0)
Key Tips:
- Most operations return a new DataFrame; assign the result, or pass inplace=True where it is supported, to change the original
- Chaining works: df.sort_values('Temp').head()
- Use .copy() when you need to preserve the original data
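
The tips above can be sketched as follows, on a hypothetical two-column DataFrame (the names original, hottest_two, and backup, and all values, are invented for illustration).

```python
import pandas as pd

# Hypothetical data, invented for this example.
original = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Temp': [98.1, 100.2, 99.0]})

# sort_values returns a new DataFrame; `original` keeps its row order.
hottest_two = original.sort_values('Temp', ascending=False).head(2)

# An independent copy: changes to `backup` do not touch `original`.
backup = original.copy()
backup['Temp'] = backup['Temp'] + 1

print(hottest_two['Name'].tolist())  # ['B', 'C']
print(original['Temp'].tolist())     # [98.1, 100.2, 99.0]
```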