I have a pandas dataframe that has a column like this: I want to apply a condition to the whole dataframe based on the id value. I made many attempts but failed. It raises a KeyError: it cannot access ‘id’, which is inside the column ‘platform’. Any help is welcome, and thank you in advance…
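The example data is missing from this excerpt, so the following is only a sketch under an assumption: that each cell of ‘platform’ holds a dict with an ‘id’ key. In that case ‘id’ is not a column, which would explain the KeyError, and the value has to be pulled out of each dict before filtering. The column `name` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: each 'platform' cell is assumed to be a dict with an 'id' key.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "platform": [
        {"id": 1, "label": "web"},
        {"id": 2, "label": "ios"},
        {"id": 1, "label": "web"},
    ],
})

# df["id"] would raise KeyError because 'id' is not a column of df.
# Extract the nested value from each dict instead.
platform_id = df["platform"].map(lambda d: d["id"])

# Use the extracted Series as a boolean condition on the whole dataframe.
filtered = df[platform_id == 1]
print(filtered)
```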
Tag: dataframe
I’m trying to use multiple nested np.where to create a column of a data frame in Python, facing an error on the same
Consider a data frame with 97 rows and 44 columns where I have three columns whose names are “Bostwick”, “mu_yield”, so I’m trying to create a new column called “Target” where if the “Bostwick” column values lie between 5.00 and 6.75, else if …
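A minimal sketch of the nested `np.where` pattern, next to the flatter `np.select` equivalent. Only the first condition (Bostwick between 5.00 and 6.75) comes from the question; the second rule on `mu_yield`, the labels, and the sample values are hypothetical, since the rest of the conditions are cut off in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values for the two columns named in the question.
df = pd.DataFrame({"Bostwick": [5.5, 7.0, 4.0], "mu_yield": [10.0, 30.0, 5.0]})

# Condition from the question: Bostwick between 5.00 and 6.75 (inclusive).
in_range = df["Bostwick"].between(5.00, 6.75)
# Hypothetical second rule, used only to illustrate the nesting.
high_yield = df["mu_yield"] > 20

# Nested np.where: the "else" branch of the first call is another np.where.
df["Target"] = np.where(
    in_range, "in_range", np.where(high_yield, "high_yield", "other")
)

# np.select expresses the same if / else-if / else chain without nesting.
df["Target_select"] = np.select(
    [in_range, high_yield], ["in_range", "high_yield"], default="other"
)
print(df)
```

With more than two or three branches, `np.select` tends to be easier to read and to debug than deeply nested `np.where` calls, because each condition and its label sit side by side.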
How to count word similarity between two pandas dataframes
Here’s my first dataframe, df1. Here’s my second dataframe, df2. Similarity matrix: columns are Id from df1, rows are Id from df2. Note: the 0 values in (1,1) and (3,2) are because no text is similar; the 1 value in (3,1) is because of Bersatu and Kita (Id 1 on df2 is available in Id 3 on df1); 0.33 is counted because o…
Python: Pivoting a dataframe that has multiple ID columns
From a database I get the following table into a Python dataframe df:

FunctionID  FunctionText  FunctionModule  UserGroup
1           Fct1          ModX            GroupA
2           Fct2          ModX            GroupA
2           Fct2          ModX            GroupB
3           Fct3          ModY            GroupB
3           Fct3          ModY            GroupC
.           …             …               …
3000        Fct3000       ModZ            GroupF

My goal is to get a pivot-like table that loo…
Convert JSON Dict to Pandas Dataframe
I have what appears to be a very simple JSON dict I need to convert into a Pandas dataframe. The dict is being pulled in for me as a string which I have little control over. I have tried the usual methods such as pd.read_json() and json_normalize() etc but can’t seem to get it anywhere close. Has anyone…
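One common route when the JSON arrives as a string is to parse it with `json.loads` first and then hand the resulting dict to `pd.json_normalize`. The sketch below assumes a small nested dict; the actual payload is not shown in the excerpt, so the `raw` string and its keys are hypothetical.

```python
import json

import pandas as pd

# Hypothetical payload: the real JSON string is not shown in the question.
raw = '{"name": "widget", "price": 9.5, "tags": {"color": "red", "size": "M"}}'

# Step 1: turn the string into a Python dict.
record = json.loads(raw)

# Step 2: flatten the dict into a one-row dataframe.
# Nested keys become dotted column names such as "tags.color".
df = pd.json_normalize(record)
print(df)
```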
Python pandas printing correct dataframe
I am reading from a CSV file and have the current values in my dataframe, where width and height are min and max values. Now I want to split and format the columns and print them: My problem is that it still prints: Whereas I want it to print: What am I doing wrong? Answer: This code can help you
How to use df groupby to return counts on specific values in column across each month
I have a dataframe made up of dummy car purchases across a year which looks like: df = What I’m looking for is to get an aggregated count of each brand of car for each month in 2021, so it would look like this: df = So far I’ve used this code to group by the year, month but I
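A sketch of one way to get per-month counts for each brand, assuming the dataframe has a datetime column and a brand column. The column names `date` and `brand` and the sample rows are hypothetical, because the original data is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical purchases; the real column names may differ.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-05", "2021-01-20", "2021-02-03", "2021-02-14"]
    ),
    "brand": ["Ford", "Ford", "BMW", "Ford"],
})

# Keep only 2021 and derive a month column to group on.
df_2021 = df[df["date"].dt.year == 2021].assign(
    month=lambda d: d["date"].dt.month
)

# Count rows per (month, brand), then spread brands into columns.
counts = (
    df_2021.groupby(["month", "brand"])
    .size()
    .unstack(fill_value=0)
)
print(counts)
```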
Pandas append does not work (dataframe is not getting bigger)
I am currently trying to write code that is supposed to add multiple dataframes into one, using the append method. However, with the code I currently use, it seems that only the first dataframe is read. I have tried locating the problem by adding a len(df) to my code and it seems that the merged datafram…
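A frequent cause of this symptom is that `DataFrame.append` never modified the dataframe in place; it returned a new one, so calling it without reassigning the result leaves the original unchanged. The method was also removed in pandas 2.0. The sketch below uses `pd.concat` instead; the `frames` list and its contents are hypothetical stand-ins for the dataframes in the question.

```python
import pandas as pd

# Hypothetical input frames standing in for the ones being merged.
frames = [
    pd.DataFrame({"x": [1, 2]}),
    pd.DataFrame({"x": [3]}),
    pd.DataFrame({"x": [4, 5]}),
]

# Collect the frames in a list and concatenate once.
# ignore_index=True renumbers the rows 0..n-1 in the result.
merged = pd.concat(frames, ignore_index=True)
print(len(merged))
```

Concatenating once at the end is also cheaper than growing a dataframe inside a loop, since each intermediate append copies all rows accumulated so far.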
How to add multiple columns to a dataframe based on calculations
I have a CSV dataset (with > 8m rows) that I load into a dataframe. The CSV has columns like: I am able to load the dataset into my dataframe, but then I need to add multiple calculated columns to the dataframe for each row. In other words, unlike this SO question, I do not want the rows of the new
Accessing and overwriting Multiindex df data
I’m trying to multiply all the values of the following MultiIndex df for which the first index level equals Property_2 with a scalar: I’ve tried various ways: but I am getting back NaNs in the relevant places. Answer: That’s because the indices don’t match. One way to get around t…
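The answer in the excerpt is cut off, so the following is not necessarily the approach it goes on to describe. It is a sketch of one way to avoid the index mismatch: select the rows with a boolean mask on the first index level, so that both sides of the assignment keep the full MultiIndex and align. The second-level labels, the column `val`, and the scalar 10 are hypothetical.

```python
import pandas as pd

# Hypothetical two-level index; only "Property_2" comes from the question.
index = pd.MultiIndex.from_product([["Property_1", "Property_2"], ["a", "b"]])
df = pd.DataFrame({"val": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Boolean mask over the first index level.
mask = df.index.get_level_values(0) == "Property_2"

# Both sides keep the full MultiIndex, so the values align and no NaN appears.
df.loc[mask] = df.loc[mask] * 10
print(df)
```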