I have a method that creates a list, results, and then appends to it a dict row which has keys (column names in my eventual dataframe) and values (row values in my eventual dataframe). Then I append the row and convert the results collection to a df at the end. For example:
results = [] row = {} row['Name']="alice" row['Age']=3 results.append(row) ..reset row, fill in values like above, and append more rows.. df = pd.DataFrame(results)
The issue I have is that the column names are alphbetized so the df looks like this:
df.head() |Age | Name | |3 | alice |
Is there a way to specify the column names to be in the order I want (“Name”, “Col”)? In reality I have many more columns so I’d prefer not to have to do:
cols = df.columns cols = cols[:1] + cols[0:1]
and manually rearrange it. However, if I do so, does that just move around the column row or also the rows below? So, will “alice” and 3 in the row below also get moved around while moving the column as one would expect?
Advertisement
Answer
Specify a list of your desired ordering to the columns
attribute, and DataFrame
will reorder the columns when creating the DataFrame.
pd.DataFrame(results, columns=['Name', 'Age']) Name Age 0 alice 3
Or, DataFrame.from_records
does this as well.
pd.DataFrame.from_records(results, columns=['Name', 'Age']) Name Age 0 alice 3
If you’re working with multiple columns, you could always choose to do columns=['Name', 'Age', *(row.keys() - {'Name', 'Age'})]
assuming you don’t care about the ordering of the remaining columns.
I’ve written about constructing DataFrames from records in this post: Convert list of dictionaries to a pandas DataFrame
Another idea, fix the columns if the ordering is incorrect.
if df.columns.get_loc('Age') < df.columns.get_loc('Name'): df.insert(df.columns.get_loc('Age'), 'Name', df.pop('Name'))
This will insert Name before Age if it is originally inserted after.