1. df[df['col'] > 0.5]
- Rows where 'col' is greater than 0.5
- Use boolean indexing to keep only the rows where a specific column ('col') is greater than 0.5.
2. df[(df['col'] > 0.5) & (df['col'] < 0.7)]
- Rows where 0.5 < 'col' < 0.7
- Use logical operators (& for 'and') to filter rows based on multiple conditions for a specific column ('col'). Wrap each condition in its own parentheses.
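The two filters above can be tried on a small DataFrame. This is only a minimal sketch: the sample values are made up for illustration, and the column is assumed to be literally named 'col'.

```python
import pandas as pd

# Hypothetical sample data; 'col' mirrors the column name used in items 1 and 2.
df = pd.DataFrame({'col': [0.2, 0.55, 0.65, 0.9]})

# Item 1: keep rows where 'col' is greater than 0.5.
above = df[df['col'] > 0.5]

# Item 2: keep rows where 0.5 < 'col' < 0.7.
# Each comparison is parenthesized because & binds more tightly than > and <.
between = df[(df['col'] > 0.5) & (df['col'] < 0.7)]

print(above['col'].tolist())    # [0.55, 0.65, 0.9]
print(between['col'].tolist())  # [0.55, 0.65]
```

Both results keep the original row labels (here 1, 2, 3 and 1, 2), since boolean indexing selects rows without renumbering them.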
3. df.sort_values('col1')
- Sorts by 'col1' in ascending order
- Use sort_values() to sort the DataFrame by a specific column ('col1'); the order is ascending by default.
4. df.sort_values('col2', ascending=False)
- Sorts by 'col2' in descending order
- Use sort_values() with the ascending parameter set to False to sort the DataFrame by a specific column ('col2') in descending order.
5. df.sort_values(['col1', 'col2'], ascending=[True, False])
- Sorts by 'col1' ascending, 'col2' descending
- Use sort_values() with a list of columns and a corresponding list of sorting orders to sort the DataFrame by multiple columns with different sorting directions.
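The three sort_values() calls in items 3 to 5 can be compared side by side. The sketch below assumes a made-up DataFrame with columns named 'col1' and 'col2'; the values are illustrative only.

```python
import pandas as pd

# Hypothetical sample data with the column names used in items 3 to 5.
df = pd.DataFrame({'col1': [2, 1, 2, 1], 'col2': [10, 20, 30, 40]})

# Item 3: ascending by 'col1' (the default direction).
by_col1 = df.sort_values('col1')

# Item 4: descending by 'col2'.
by_col2_desc = df.sort_values('col2', ascending=False)

# Item 5: 'col1' ascending, ties broken by 'col2' descending.
multi = df.sort_values(['col1', 'col2'], ascending=[True, False])

print(by_col1['col1'].tolist())       # [1, 1, 2, 2]
print(by_col2_desc['col2'].tolist())  # [40, 30, 20, 10]
print(multi.values.tolist())          # [[1, 40], [1, 20], [2, 30], [2, 10]]
```

sort_values() returns a new DataFrame and leaves df unchanged, so each result here is assigned to its own variable.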
6. df.groupby('col')
- Group by values in one column
- Use groupby() to group rows based on unique values in a specific column ('col').
import pandas as pd
data = {'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)
grouped = df.groupby('Category')
# To access group 'A':
group_A = grouped.get_group('A')
# Resulting DataFrame (group_A):
#   Category  Value
# 0        A      1
# 2        A      3
7. df.groupby(['col1', 'col2'])
- Group by values in multiple columns
- Use groupby() with a list of columns to group rows based on unique combinations of values in multiple columns ('col1' and 'col2').
data = {'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)
grouped = df.groupby(['Category', 'Value'])
# To access group ('A', 1):
group_A_1 = grouped.get_group(('A', 1))
# Resulting DataFrame (group_A_1):
#   Category  Value
# 0        A      1
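To see how grouping by one column differs from grouping by two, it may help to count the groups each call produces. The following is a rough sketch that reuses the same sample data as the examples above; the variable names by_category and by_both are chosen here for illustration.

```python
import pandas as pd

# Same sample data as in the groupby examples above.
df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]})

# One key: one group per unique value of 'Category'.
by_category = df.groupby('Category')

# Two keys: one group per unique ('Category', 'Value') combination.
by_both = df.groupby(['Category', 'Value'])

print(by_category.ngroups)                # 2
print(by_both.ngroups)                    # 4
print(len(by_category.get_group('A')))    # 2
print(len(by_both.get_group(('A', 1))))   # 1
```

Because every ('Category', 'Value') pair in this data is unique, the two-column grouping yields one single-row group per row.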