Here is the dataframe I’m currently working on: What I’d like to calculate is the average of the variable “avg_lag”, weighted by “tot_SKU”, across the product_basket rows of each of the SMB and CORP groups. Taking CORP as an example, I want to calculate something like: (585,134 * 46.09 + 147,398 * 104.55 + … + 1,112,941 * 75.73)
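A weighted average is the sum of weight times value divided by the sum of the weights, so one possible sketch of the calculation is shown here. The data is made up, and the column name `segment` (holding SMB/CORP) is an assumption, since the original dataframe is not shown.

```python
import pandas as pd

# Hypothetical data; "segment" is an assumed name for the SMB/CORP column.
df = pd.DataFrame({
    "segment": ["CORP", "CORP", "SMB", "SMB"],
    "product_basket": ["A", "B", "A", "B"],
    "tot_SKU": [100, 300, 50, 150],
    "avg_lag": [10.0, 20.0, 30.0, 40.0],
})

# Weighted mean = sum(tot_SKU * avg_lag) / sum(tot_SKU), computed per segment.
sums = (
    df.assign(weighted_lag=df["avg_lag"] * df["tot_SKU"])
      .groupby("segment")[["weighted_lag", "tot_SKU"]]
      .sum()
)
weighted_avg = sums["weighted_lag"] / sums["tot_SKU"]
print(weighted_avg)
```

Grouping by `["segment", "product_basket"]` instead would give one weighted average per basket within each group, if that is what is wanted.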
Tag: dataframe
Rolling window calculation is added to the dataframe as a column of NaN
I have a data frame that is indexed from 1 to 100000 and I want to calculate the slope for every 12 steps. Is there a rolling window for that? I did the following, but it is not working. The ‘slope’ column is created, but all of the values are NaN. Answer: It’s not necessary to use .groupby because there
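A minimal sketch of a rolling slope without `.groupby`, assuming the slope is meant as a least-squares line fitted over each window of 12 consecutive rows. The column name `value` and the sample data are hypothetical, and this is not necessarily the approach the original answer took.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a perfectly linear series with slope 2 per step.
df = pd.DataFrame({"value": np.arange(1, 25, dtype=float) * 2.0})

def window_slope(y):
    # Fit y = a*x + b over the window positions and return the slope a.
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)[0]

# raw=True passes each window as a NumPy array rather than a Series.
df["slope"] = df["value"].rolling(window=12).apply(window_slope, raw=True)
print(df.tail())
```

The first 11 rows stay NaN because a full window of 12 values is not yet available there; that part is expected behaviour rather than an error.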
How to filter a pandas dataframe [Error: list indices must be integers or slices, not str]
I have a dataframe loaded in Colab. My data looks like this: This is my code: When I try to take part of the dataframe and put it into a new dataframe, I get this error: TypeError Traceback (most recent call last) in () 1 tm_df1 = pd.DataFrame() ----> 2 tm_df1 = tm_df1.append(tm_df[type(tm_df['parent_name_1']) == 'Apple']) TypeError: list indices must be integers or slices,
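A sketch of the usual boolean-mask filter, assuming `tm_df` really is a DataFrame with a `parent_name_1` column (the sample rows are hypothetical). The error message suggests `tm_df` may actually be a plain Python list, in which case it would first need converting, for example with `pd.DataFrame(tm_df)`. Note also that `type(...) == 'Apple'` compares a type object with a string and is therefore always False.

```python
import pandas as pd

# Hypothetical data standing in for the dataframe from the question.
tm_df = pd.DataFrame({
    "parent_name_1": ["Apple", "Pear", "Apple"],
    "value": [1, 2, 3],
})

# Compare the column values themselves, not their type, to build the mask.
mask = tm_df["parent_name_1"] == "Apple"

# Index with the mask; .copy() makes the result an independent dataframe.
tm_df1 = tm_df[mask].copy()
print(tm_df1)
```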
Comma separation for each cell of a pandas dataframe
If any cell contains a comma, I would like to split it and pick the last item. The original table looks like this: index x1 x2 0 banana orange 1 grapes, Citrus apples 2 tangerine, tangerine melons, pears which should be changed to the following: index x1 x2 0 banana
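One way this could be sketched is with the pandas string accessor, splitting each cell on commas and keeping the last piece. The sample frame is my reading of the flattened table above (row 2 taken as "tangerine, tangerine" and "melons, pears"), so treat it as an assumption.

```python
import pandas as pd

# Assumed reconstruction of the example table from the question.
df = pd.DataFrame({
    "x1": ["banana", "grapes, Citrus", "tangerine, tangerine"],
    "x2": ["orange", "apples", "melons, pears"],
})

# For every column: split on commas, take the last piece, trim whitespace.
# Cells without a comma split into a single piece and are left unchanged.
result = df.apply(lambda col: col.str.split(",").str[-1].str.strip())
print(result)
```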
How to perform string formatting inside a dictionary?
Let’s say I have a payload which I am using to hit my API, but I want to make its pg_no value dynamic using a for loop, i.e. getting this error Answer: First, in your dictionary you are using the same key from_data more than once, so only the last one will be present. Second, the main problem is caused by the { bracket format
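A small sketch of the brace issue, assuming the payload value is a JSON-like string. The real payload is not shown, so the template below is hypothetical and only the names `from_data` and `pg_no` come from the question. With `str.format`, literal braces have to be doubled, while single braces mark placeholders.

```python
import json

# Doubled braces {{ }} are emitted literally; {pg_no} is the placeholder.
template = '{{"pg_no": {pg_no}}}'

payloads = []
for pg_no in range(1, 4):
    # Build a fresh dictionary per page, with from_data appearing only once.
    payloads.append({"from_data": template.format(pg_no=pg_no)})

print(payloads[0]["from_data"])
print(json.loads(payloads[2]["from_data"]))
```

When the value is plain JSON, building a dictionary and passing it through `json.dumps` avoids brace escaping altogether.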
How to speed up successive pd.apply with successive pd.DataFrame.loc calls?
df has 10,000+ rows, so this code is taking a long time. In addition, for each row I’m doing a df_hist.loc call to get the value. I’m trying to speed up this section of code, and the option I’ve found so far is using: But this forces me to use index-based selection for the row instead of value selection: which
Find unique column values out of two different Dataframes
How to find the unique values of the first column out of DF1 & DF2? DF1 DF2 Output This is how Read Answer: TRY: NOTE: Replace 0 in subset=[0] with the first column name.
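The answer's code is not shown, but the `subset=[0]` note is consistent with `drop_duplicates`. A sketch under that assumption follows, with hypothetical frames whose first column is labelled `0`, and reading "unique" as values that appear in only one of the two frames.

```python
import pandas as pd

# Hypothetical frames with default integer column labels (0, 1).
df1 = pd.DataFrame([["a", 1], ["b", 2], ["c", 3]])
df2 = pd.DataFrame([["b", 20], ["c", 30], ["d", 40]])

# Stack both frames, then drop every row whose first-column value occurs
# more than once; keep=False removes all copies of a duplicated value.
unique_rows = pd.concat([df1, df2]).drop_duplicates(subset=[0], keep=False)
print(unique_rows)
```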
Converting string in a Pandas data frame to float
I have the following data frame: In order to calculate with the second column, named “Marktwert”, I have to convert the string to a float. The string has German format, which means the decimal separator is a comma and the thousands separator is a dot. The number 217.803,37 has the datatype object. If I try to convert using the code
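A minimal sketch of the conversion, assuming every value in “Marktwert” follows the German format described above. All sample values except 217.803,37 are made up.

```python
import pandas as pd

# Hypothetical column of German-formatted number strings.
df = pd.DataFrame({"Marktwert": ["217.803,37", "1.234,50", "99,99"]})

df["Marktwert"] = (
    df["Marktwert"]
    .str.replace(".", "", regex=False)   # drop the thousands separator
    .str.replace(",", ".", regex=False)  # turn the decimal comma into a dot
    .astype(float)
)
print(df["Marktwert"])
```

If the data comes from a CSV file, `pd.read_csv` also accepts `decimal=","` and `thousands="."`, which would parse such columns while loading.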
Rename column names through the loop (Python)
I have a table like this: asd bsd tsd pzd … 20 15 10 5 … 20 15 10 5 … 20 15 10 5 … 20 15 10 5 … I want to rename all of my columns following the pattern ‘param’ + (index_column + 1) using a loop. Desired output: param1 param2 param3 param4
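The renaming could be sketched as follows, using only the four columns visible in the question. The elided columns are left out, and the frame is rebuilt here purely for illustration.

```python
import pandas as pd

# The four visible columns from the question's table.
df = pd.DataFrame([[20, 15, 10, 5]] * 4, columns=["asd", "bsd", "tsd", "pzd"])

# Build an old-name -> new-name mapping in a loop, numbering from 1.
new_names = {}
for index_column, old_name in enumerate(df.columns):
    new_names[old_name] = "param" + str(index_column + 1)

df = df.rename(columns=new_names)
print(df.columns.tolist())
```

A mapping passed to `rename` keeps the intent explicit; assigning a list directly to `df.columns` would be a shorter alternative.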
Python Read Website Table Data into Dataframe
I came across this source for importing data. I tried, but was not successful in importing the data: https://public.opendatasoft.com/explore/embed/dataset/us-zip-code-latitude-and-longitude/table/ My code: Presently I see no data, only a string of text. Table on the data: Answer: The table is created by JavaScript, and JavaScript is not rendered by a plain request. A workaround can be: