How do I sort a pandas DataFrame by one column?
I have a pandas DataFrame with three columns:
0 1 2
0 354.7 April 4.0
1 55.4 August 8.0
2 176.5 December 12.0
3 95.5 February 2.0
4 85.6 January 1.0
5 152 July 7.0
6 238.7 June 6.0
7 104.8 March 3.0
8 283.5 May 5.0
9 278.8 November 11.0
10 249.6 October 10.0
11 212.7 September 9.0
Since the months in column 1 are not in calendar order, I added a third column (column 2) with the corresponding month numbers (1-12). How can I sort the DataFrame by this numeric column?
Sorting a DataFrame is straightforward in pandas: use sort_values() to sort by a specific column. Since column 2 holds the numeric month order, you can sort the DataFrame like this:
df = df.sort_values(by=2) # Sorting by the month number column
This reorders the rows so that January (1) comes first and December (12) comes last. sort_values() returns a new, sorted DataFrame and leaves the original untouched, which is why the result is assigned back to df.
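To make this concrete, here is a minimal sketch that rebuilds the DataFrame from the question and sorts it by column 2. The variable names sorted_df and renumbered are just illustrative, and the reset_index(drop=True) step is an optional extra that was not part of the answer above:

```python
import pandas as pd

# Rebuild the DataFrame from the question; the column labels are the integers 0, 1, 2.
df = pd.DataFrame({
    0: [354.7, 55.4, 176.5, 95.5, 85.6, 152.0,
        238.7, 104.8, 283.5, 278.8, 249.6, 212.7],
    1: ["April", "August", "December", "February", "January", "July",
        "June", "March", "May", "November", "October", "September"],
    2: [4.0, 8.0, 12.0, 2.0, 1.0, 7.0, 6.0, 3.0, 5.0, 11.0, 10.0, 9.0],
})

# Sort by the month-number column; the original df is left unchanged.
sorted_df = df.sort_values(by=2)
print(sorted_df)

# Optional: the old row labels travel with the rows, so renumber them 0..11 if needed.
renumbered = sorted_df.reset_index(drop=True)
print(renumbered)
```

After sorting, the row that used to have label 4 (January) comes first, which is why you may want to reset the index afterwards.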
If you want to modify the DataFrame directly without reassigning it, you can pass the inplace=True parameter. This updates the DataFrame in place instead of returning a new one:
df.sort_values(by=2, inplace=True)
This is useful when you don’t need to keep the original order. Note that with inplace=True the method returns None, so don’t assign its result back to df. It is mainly a convenience and is not generally faster than reassigning.
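One pitfall worth illustrating: with inplace=True the call returns None. The sketch below uses a three-row subset of the question's data, and the name result is only there to show what comes back:

```python
import pandas as pd

# A three-row subset of the question's data, for illustration only.
df = pd.DataFrame({
    0: [354.7, 95.5, 85.6],
    1: ["April", "February", "January"],
    2: [4.0, 2.0, 1.0],
})

# With inplace=True the DataFrame itself is reordered and nothing is returned.
result = df.sort_values(by=2, inplace=True)

print(result)          # None
print(df[1].tolist())  # ['January', 'February', 'April']
```

So writing df = df.sort_values(by=2, inplace=True) would replace df with None; use either reassignment or inplace=True, not both.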
Sometimes sorting by a single column isn’t enough: what if two rows have the same month number? In that case, you can add a secondary sort key. For example, to break ties using column 0 as well, you can do this:
df = df.sort_values(by=[2, 0]) # Sort by month number, then by the first column
This ensures that when two rows have the same month number, they are ordered by column 0. It is a small but useful technique when dealing with more complex data.
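The data in the question has no duplicate month numbers, so here is a small sketch with made-up rows that do contain ties. The values are hypothetical, and the ascending list is an extra option of sort_values() that sets the direction for each sort key separately:

```python
import pandas as pd

# Hypothetical rows with repeated month numbers, to show tie-breaking.
df = pd.DataFrame({
    0: [120.0, 85.6, 95.5, 60.2],
    1: ["January", "January", "February", "February"],
    2: [1.0, 1.0, 2.0, 2.0],
})

# Month number first, then column 0 ascending within each month.
by_month_then_value = df.sort_values(by=[2, 0])
print(by_month_then_value[0].tolist())  # [85.6, 120.0, 60.2, 95.5]

# Same primary key, but column 0 descending within each month.
by_month_then_value_desc = df.sort_values(by=[2, 0], ascending=[True, False])
print(by_month_then_value_desc[0].tolist())  # [120.0, 85.6, 95.5, 60.2]
```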