Here is the dataframe I’m currently working on: What I’d like to calculate is the average of the variable “avg_lag”, weighted by “tot_SKU”, in each product_basket for both the SMB and CORP groups. Taking CORP as an example, I want to calculate something like: (585…
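A weighted average per group can be sketched as the sum of value times weight divided by the sum of weights. The example below is only an illustration: the column name `group` and all sample values are assumptions, since the original dataframe is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample data: avg_lag, tot_SKU and product_basket come from the
# question; the "group" column name and every value here are made up.
df = pd.DataFrame({
    "group": ["CORP", "CORP", "SMB", "SMB"],
    "product_basket": ["A", "A", "A", "A"],
    "avg_lag": [10.0, 20.0, 5.0, 15.0],
    "tot_SKU": [1, 3, 2, 2],
})

# Weighted mean = sum(value * weight) / sum(weight), computed per group.
df["weighted"] = df["avg_lag"] * df["tot_SKU"]
sums = df.groupby(["group", "product_basket"])[["weighted", "tot_SKU"]].sum()
weighted_avg = (sums["weighted"] / sums["tot_SKU"]).rename("weighted_avg_lag")

print(weighted_avg)
```

Summing the products and the weights separately keeps the calculation vectorized and avoids a Python-level function call per group.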
Tag: dataframe
Rolling window calculation is added to the dataframe as a column of NaN
I have a data frame that is indexed from 1 to 100000 and I want to calculate the slope for every 12 steps. Is there a rolling window for that? I did the following, but it is not working: the ‘slope’ column is created, but all of its values are NaN. Answer: It’s not necessary to use .groupby …
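One way to get a slope over a 12-step rolling window, without any groupby, is to fit a straight line to each window. This is a minimal sketch, not the truncated answer's code: the column name `value` and the sample data are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a perfectly linear series, so every window has slope 2.
df = pd.DataFrame({"value": np.arange(30, dtype=float) * 2.0 + 1.0})

window = 12
x = np.arange(window)  # step positions inside one window

def slope(y):
    # Least-squares line fit; element 0 of the result is the slope.
    return np.polyfit(x, y, 1)[0]

# raw=True passes a NumPy array to the function instead of a Series.
df["slope"] = df["value"].rolling(window).apply(slope, raw=True)

print(df.tail())
```

The first `window - 1` rows stay NaN because a full window is not yet available there; a column that is NaN everywhere usually points to a different problem, such as misaligned indexes on assignment.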
How to filter a pandas dataframe [Error: list indices must be integers or slices, not str]
I have a dataframe loaded in Colab; my data looks like this. This is my code. When I try to take part of the dataframe and put it into a new dataframe, I get this error: TypeError Traceback (most recent call last) in () 1 tm_df1 = pd.DataFrame() ----> 2 tm_df1 = tm_df1.append(tm_df[t…
Comma separation for each cell of a pandas dataframe
If any cell contains a comma, I would like to split it and keep only the last item. The original table looks like this:
index  x1                    x2
0      banana                orange
1      grapes, Citrus        apples
2      tangerine, tangerine  melons, pears
which should be changed to the following: index x1 x2 0…
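Keeping only the last comma-separated item in each cell can be sketched with the pandas string accessor. The sample data reproduces the table from the question; stripping the surrounding whitespace is an added assumption, since the desired output is cut off in the excerpt.

```python
import pandas as pd

df = pd.DataFrame({
    "x1": ["banana", "grapes, Citrus", "tangerine, tangerine"],
    "x2": ["orange", "apples", "melons, pears"],
})

for col in ["x1", "x2"]:
    # Split on commas, take the last piece, and drop surrounding spaces.
    # Cells without a comma yield a one-item list, so they are unchanged.
    df[col] = df[col].str.split(",").str[-1].str.strip()

print(df)
```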
How to perform string formatting inside a dictionary?
Let’s say I have a payload that I use to call my API, and I want to make its pg_no value dynamic using a for loop, but I am getting this error. Answer: First, your dictionary uses the same key, from_data, more than once, so only the last one will be kept. Second, the main problem is caused by the { bracket in the format string.
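The brace problem can be sketched in two ways: build the dictionary inside the loop so no string formatting is needed, or, if the payload must be a string template, double every literal brace so that `str.format` only fills the placeholder. Only the `pg_no` key comes from the question; `page_size` and the values are hypothetical.

```python
import json

# Option 1: build a fresh dict per iteration, no string formatting involved.
payloads = []
for pg_no in range(1, 4):
    payload = {"pg_no": str(pg_no), "page_size": "50"}  # page_size is made up
    payloads.append(payload)

# Option 2: a string template. Literal braces are written as {{ and }},
# so only {pg_no} is treated as a replacement field.
template = '{{"pg_no": "{pg_no}"}}'
rendered = [template.format(pg_no=n) for n in range(1, 4)]

print(payloads[0])
print(rendered[0])
print(json.loads(rendered[2]))
```

Option 1 is usually easier to maintain, because the payload stays a real dictionary until it is serialized.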
How to speed up successive pd.apply with successive pd.DataFrame.loc calls?
df has 10,000+ rows, so this code takes a long time. In addition, for each row I’m doing a df_hist.loc call to get the value. I’m trying to speed up this section of code, and the option I’ve found so far is using: But this forces me to use index-based selection for the row instead of value …
Find unique column values out of two different Dataframes
How can I find the unique values of the first column across DF1 and DF2? DF1 DF2 Output This is how Read Answer: Try: Note: Replace 0 in subset=[0] with the first column name.
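The answer's code is not shown, but its note about `subset=[0]` suggests something along the lines of concatenating both frames and dropping duplicated keys. The sketch below assumes that "unique" means values that occur in only one of the two frames and that each value appears at most once within a frame; the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical frames; "id" stands in for the first column.
df1 = pd.DataFrame({"id": ["a", "b", "c"], "val": [1, 2, 3]})
df2 = pd.DataFrame({"id": ["b", "c", "d"], "val": [20, 30, 40]})

combined = pd.concat([df1, df2], ignore_index=True)

# keep=False drops every row whose "id" occurs more than once,
# leaving only the ids found in exactly one frame.
only_once = combined.drop_duplicates(subset=["id"], keep=False)

print(only_once)
```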
Converting string in a Pandas data frame to float
I have the following data frame: In order to calculate with the second column, named “Marktwert”, I have to convert the string to a float. The string has German format, which means the decimal separator is a comma and the thousands separator is a dot. The number 217.803,37 has the datatype object. If I t…
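A conversion from German number format can be sketched by removing the thousands dots first and then turning the decimal comma into a dot. The column name and the value 217.803,37 come from the question; the other sample values are made up.

```python
import pandas as pd

df = pd.DataFrame({"Marktwert": ["217.803,37", "1.250,00", "99,5"]})

df["Marktwert"] = (
    df["Marktwert"]
    .str.replace(".", "", regex=False)   # drop thousands separators
    .str.replace(",", ".", regex=False)  # decimal comma -> decimal point
    .astype(float)
)

print(df["Marktwert"])
```

If the data comes from a CSV file, passing `decimal=","` and `thousands="."` to `pd.read_csv` may handle the conversion at load time instead.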
Rename column names in a loop (Python)
I have a table like this: asd bsd tsd pzd … 20 15 10 5 … 20 15 10 5 … 20 15 10 5 … 20 15 10 5 … I want to rename all of my columns in a loop, following the pattern ‘param’ + (column index + 1). Desired output: param1 param2 param3 param4
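The renaming pattern can be sketched with a list comprehension over the column positions. The sample frame uses the four visible columns from the question; the columns hidden behind the ellipsis are left out.

```python
import pandas as pd

df = pd.DataFrame([[20, 15, 10, 5]] * 4, columns=["asd", "bsd", "tsd", "pzd"])

# Position i (0-based) becomes "param" + str(i + 1).
df.columns = [f"param{i + 1}" for i in range(len(df.columns))]

print(df.columns.tolist())
```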
Python Read Website Table Data into Dataframe
I came across this source for importing data. I tried, but was not successful in importing the data: https://public.opendatasoft.com/explore/embed/dataset/us-zip-code-latitude-and-longitude/table/ My code: Presently I see no data, only a text string. Table on the data: Answer: JS is creating the table and rendering of ja…