Python: Sorting, Filtering, and Grouping in Pandas

Rahul S
3 min read · Sep 8

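1. df[df['col'] > 0.5] - Rows where 'col' is greater than 0.5

  • Use boolean indexing to keep only the rows where a column ('col') meets a condition, here a value greater than 0.5. The comparison produces a True/False Series, and indexing with it returns the rows marked True.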
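
As a minimal sketch of the filter above, the snippet below builds a small made-up DataFrame (the values are purely illustrative, and 'col' is the same placeholder column name) and separates the mask from the indexing step so each part is visible.

```python
import pandas as pd

# Hypothetical sample data; 'col' is the placeholder column name from the text.
df = pd.DataFrame({"col": [0.2, 0.6, 0.9, 0.5]})

# The comparison yields a boolean Series aligned with df's index.
mask = df["col"] > 0.5

# Indexing with the mask keeps only rows where it is True.
filtered = df[mask]
print(filtered)
```

Note that the row with exactly 0.5 is dropped because the comparison is strict, and the surviving rows keep their original index labels.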

2. df[(df['col'] > 0.5) & (df['col'] < 0.7)] - Rows where 0.5 < 'col' < 0.7

  • Use the element-wise logical operators (& for 'and', | for 'or') to filter rows on multiple conditions. Wrap each condition in parentheses, because & binds more tightly than the comparison operators.
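
A short sketch of the combined filter, again on made-up values. The second form uses Series.between as an alternative; the inclusive="neither" string assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col": [0.4, 0.55, 0.65, 0.7, 0.9]})

# Each comparison sits in its own parentheses so that & combines
# two boolean Series rather than binding to the numbers first.
in_range = df[(df["col"] > 0.5) & (df["col"] < 0.7)]

# Same filter via Series.between; "neither" excludes both endpoints
# (string values for `inclusive` assume pandas >= 1.3).
same_rows = df[df["col"].between(0.5, 0.7, inclusive="neither")]

print(in_range)
```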

3. df.sort_values('col1') - Sorts by 'col1' in ascending order

  • Use sort_values() to sort the DataFrame by a specific column ('col1'). Ascending order is the default, and the call returns a new, sorted DataFrame rather than changing df.
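
A minimal sketch of the default ascending sort, using an invented three-row frame. The ignore_index=True variant is optional and assumes pandas 1.0 or newer.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col1": [3, 1, 2], "col2": ["x", "y", "z"]})

# Returns a new DataFrame; df itself keeps its original row order.
sorted_df = df.sort_values("col1")

# Optionally renumber the index 0..n-1 after sorting (pandas >= 1.0).
renumbered = df.sort_values("col1", ignore_index=True)

print(sorted_df)
```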

4. df.sort_values('col2', ascending=False) - Sorts by 'col2' in descending order

  • Use sort_values() with the ascending parameter to sort the DataFrame by a specific column ('col2') in descending order.
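
For illustration, the same idea in descending order on made-up values:

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col2": [10, 30, 20]})

# ascending=False puts the largest values first.
descending = df.sort_values("col2", ascending=False)

print(descending)
```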

5. df.sort_values(['col1', 'col2'], ascending=[True, False]) - Sorts by 'col1' ascending, 'col2' descending

  • Use sort_values() with a list of columns and a corresponding list of sorting orders to sort the DataFrame by multiple columns with different sorting directions.
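
A small sketch of the multi-column sort, with invented data chosen so that 'col1' contains ties. The first column decides the overall order, and the second column only breaks ties within equal 'col1' values.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col1": ["a", "b", "a", "b"], "col2": [1, 2, 3, 4]})

# The ascending list lines up position by position with the column list:
# 'col1' ascending, then 'col2' descending within each 'col1' value.
ordered = df.sort_values(["col1", "col2"], ascending=[True, False])

print(ordered)
```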

6. df.groupby('col') - Group by values in one column

  • Use groupby() to group rows based on unique values in a specific column ('col').
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)
grouped = df.groupby('Category')
# To access group 'A':
group_A = grouped.get_group('A')
# Resulting DataFrame (group_A):
#   Category  Value
# 0        A      1
# 2        A      3

7. df.groupby(['col1', 'col2']) - Group by values in multiple columns

  • Use groupby() with a list of columns to group rows based on unique combinations of values in multiple columns ('col1' and 'col2').
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)
grouped = df.groupby(['Category', 'Value'])
# To access group ('A', 1), pass the key as a tuple:
group_A_1 = grouped.get_group(('A', 1))
# Resulting DataFrame (group_A_1):
#   Category  Value
# 0        A      1
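
In the example above every (Category, Value) pair is unique, so each group holds a single row. As a complementary sketch, the made-up frame below repeats one combination so that a group contains more than one row; the column names 'col1', 'col2', and 'value' are placeholders.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "col1": ["A", "A", "B", "A"],
    "col2": ["x", "x", "y", "y"],
    "value": [1, 2, 3, 4],
})

grouped = df.groupby(["col1", "col2"])

# Number of distinct (col1, col2) combinations present in the data.
print(grouped.ngroups)

# Group keys are tuples, one element per grouping column.
group_A_x = grouped.get_group(("A", "x"))
print(group_A_x)
```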