Tag: pandas

Group by and find top n value_counts pandas

I have a dataframe of taxi data with two columns that looks like this: Basically, each row represents a taxi pickup in that neighborhood in that borough. Now, I want to find the 5 neighborhoods in each borough with the most pickups. I tried this: Which gives me something like this: How do I filter it so
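One possible approach, as a minimal sketch: count pickups per neighborhood within each borough, then keep the first few rows of each group. The column names `borough` and `neighborhood`, the sample data, and the value of `top_n` are assumptions for illustration, since the original dataframe is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample data: one row per taxi pickup.
df = pd.DataFrame({
    "borough": ["Manhattan"] * 6 + ["Brooklyn"] * 6,
    "neighborhood": [
        "Midtown", "Midtown", "Midtown", "SoHo", "SoHo", "Harlem",
        "Williamsburg", "Williamsburg", "Williamsburg",
        "Bushwick", "Bushwick", "DUMBO",
    ],
})

top_n = 2  # the question asks for 5; 2 keeps the sample small

# value_counts on the grouped column sorts counts in descending
# order within each borough.
counts = df.groupby("borough")["neighborhood"].value_counts()

# head(top_n) per borough then keeps the largest counts of each group.
top = counts.groupby(level="borough").head(top_n)
```

The result is a Series indexed by (borough, neighborhood), so `top.loc["Manhattan"]` would give the top neighborhoods for a single borough.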

move column in pandas dataframe

I have the following dataframe: How can I move columns b and x such that they are the last 2 columns in the dataframe? I would like to specify b and x by name, but not the other columns. Answer You can rearrange columns directly by specifying their order: In the case of larger dataframes where the column titles are
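For the case where only the columns to move are named, a sketch along these lines may help. The dataframe contents and the other column names (`a`, `y`, `z`) are hypothetical; only `b` and `x` come from the question.

```python
import pandas as pd

# Hypothetical dataframe; only "b" and "x" are named in the question.
df = pd.DataFrame({"a": [1], "b": [2], "x": [3], "y": [4], "z": [5]})

to_end = ["b", "x"]  # columns to move, specified by name

# Keep every other column in its current order, then append the moved ones.
new_order = [c for c in df.columns if c not in to_end] + to_end
df = df[new_order]
```

This avoids listing the remaining columns by hand, which matters when the dataframe is wide.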

Python pandas calculate rolling stock beta using rolling apply to groupby object in vectorized fashion

I have a large data frame, df, containing 4 columns: etc. I am attempting to calculate a common financial measure, known as beta, using a function that takes two of the columns: ret_1m, the monthly stock_return, and ret_1m_mkt, the market 1-month return for the same period (period_id). I want to apply a function (calc_beta) to calculate the 12-month result

Converting pandas.DataFrame to bytes

I need to convert the data stored in a pandas.DataFrame into a byte string where each column can have a separate data type (integer or floating point). Here is a simple set of data: and df looks something like this: The DataFrame knows about the types of each column (df.dtypes), so I’d like to do something like this: This typically works
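One way this could be done, sketched under assumptions: convert the frame to a NumPy record array, which keeps a separate dtype per column, and serialize that. The column names and values below are hypothetical, as the original data is not shown in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one integer and one floating-point column.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})
df = df.astype({"a": "int64", "b": "float64"})

# to_records builds a structured array with one field per column,
# each carrying that column's dtype; index=False leaves the index out.
records = df.to_records(index=False)
raw = records.tobytes()

# Round trip: the same structured dtype is needed to decode the bytes.
decoded = np.frombuffer(raw, dtype=records.dtype)
```

The byte string carries no type information of its own, so the structured dtype has to be stored or agreed on separately by whoever reads it.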

Divide multiple columns by another column in pandas

I need to divide all but the first column in a DataFrame by the first column. Here’s what I’m doing, but I wonder if this isn’t the “right” pandas way: Is there a way to do something like df[['B','C']] / df['A']? (That just gives a 10×12 dataframe of nan.) Also, after reading some similar questions on SO, I tried df['A'].div(df[['B',
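A minimal sketch of one option: `DataFrame.div` with `axis=0` aligns the divisor Series on the row index rather than on the column labels, which is why the plain `/` operator produces NaN here. The column names follow the question; the values are made up for illustration.

```python
import pandas as pd

# Hypothetical values; column names A, B, C follow the question.
df = pd.DataFrame({
    "A": [1.0, 2.0, 4.0],
    "B": [2.0, 4.0, 8.0],
    "C": [3.0, 6.0, 12.0],
})

# axis=0 matches df["A"] against the rows, so each row of B and C
# is divided by the value of A in the same row.
result = df[["B", "C"]].div(df["A"], axis=0)

# The same idea by position, without naming the columns.
result_by_position = df.iloc[:, 1:].div(df.iloc[:, 0], axis=0)
```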

pandas to_sql all columns as nvarchar

I have a pandas dataframe that is dynamically created with column names that vary. I’m trying to push them to SQL, but don’t want them to go to mssqlserver as the default datatype “text” (can anyone explain why this is the default? Wouldn’t it make sense to use a more common datatype?) Does anyone know how I can specify a

How to add header row to a pandas DataFrame

I am reading a csv file into pandas. This csv file consists of four columns and some rows, but does not have a header row, which I want to add. I have been trying the following: But when I apply the code, I get the following Error: What exactly does the error mean? And what would be a clean way
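A clean way to do this, sketched under assumptions, is to tell `read_csv` that the file has no header and supply the column names at read time. The names `a` to `d` and the inline CSV text are hypothetical stand-ins for the real file, which the excerpt does not show.

```python
import io

import pandas as pd

# Hypothetical stand-in for a headerless CSV file with four columns.
csv_text = "1,2,3,4\n5,6,7,8\n"

# header=None stops pandas from treating the first data row as a header;
# names supplies the column labels instead.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["a", "b", "c", "d"],
)
```

With a real file, the path would take the place of the `io.StringIO` object.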
