I have a large dataframe (>16M rows) which has a column named ‘user’. Every user has more than one occurrence. I want to add a new column ‘counter’ that increases every time a specific user has a new record. The dataframe looks like this: I want it to look like this with the new c…
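A minimal sketch of one way to build such a per-user counter, assuming the column is literally named user. The sample rows are hypothetical, since the original dataframe is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical sample data; the original dataframe is not shown in the excerpt.
df = pd.DataFrame({"user": ["a", "b", "a", "c", "b", "a"]})

# cumcount() numbers the rows within each group starting at 0,
# so adding 1 gives a per-user counter that starts at 1.
df["counter"] = df.groupby("user").cumcount() + 1
```

Because groupby/cumcount is vectorized, this should scale to millions of rows far better than a Python-level loop over users.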
Tag: pandas
Creating a new column for predicted cluster: SettingWithCopyWarning
This question will unfortunately be a duplicate, but I could not fix the issue in my code, even after looking at the other similar questions and their related answers. I need to split my dataset into a train and a test dataset. However, it seems I am making some error when I add a new column for predicting the clu…
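The asker's code is not shown, so the following is only a sketch of a common way to avoid SettingWithCopyWarning in this situation: take an explicit copy of each split before adding a column. The data, the column names, and the placeholder labels are all assumptions.

```python
import pandas as pd

# Hypothetical data and column names; the original code is not shown.
data = pd.DataFrame({"x": [1.0, 2.0, 8.0, 9.0], "y": [1.0, 1.5, 8.0, 9.5]})

# An explicit copy makes each split an independent dataframe, so assigning
# a new column neither warns nor touches the original data.
train = data.iloc[:3].copy()
test = data.iloc[3:].copy()

# Placeholder labels standing in for a clustering model's predictions.
train["predicted_cluster"] = [0, 0, 1]
test["predicted_cluster"] = [1]
```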
module ‘networkx’ has no attribute ‘from_pandas_edgelist’
Here is my code: and there is an error: AttributeError: module ‘networkx’ has no attribute ‘from_pandas_edgelist’. However, according to the networkx documentation, networkx does have this attribute. Here is the link to the documentation: from_pandas_edgelist. Why does this happen? Answer…
How to join two dataframe on different columns without using index
I have the following 2 dataframes and I want to merge them. And I want the result to look like: Can anyone help me? I tried the code below but I didn’t get the solution. Answer You need reset_index first
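The two dataframes are not shown, so this is only a sketch of what "reset_index first" could look like. The frames and the names key1, key2, value1 and value2 are hypothetical.

```python
import pandas as pd

# Hypothetical frames: df1 keeps its key in the index, df2 keeps it in a column.
df1 = pd.DataFrame(
    {"value1": [10, 20, 30]},
    index=pd.Index(["a", "b", "c"], name="key1"),
)
df2 = pd.DataFrame({"key2": ["a", "b", "d"], "value2": [1, 2, 4]})

# reset_index() turns the index into an ordinary column named "key1",
# which can then be matched against a differently named column.
merged = df1.reset_index().merge(df2, left_on="key1", right_on="key2", how="inner")
```

With how="inner", only keys present in both frames ("a" and "b" here) survive the merge.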
How to perform row wise if and mathematical operations in pandas dataframe
Input Data: Basically, I want to check if the first row amount (5000.00) is equal to the second row amount; then perform a date difference function (13-02-2019 “-” 12-02-2019), and if the difference is less than “5 days” then the following is the output. If the difference is more than 5 days exclude fr…
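The input data is not shown, so the following is a rough sketch of the comparison described above using shift() and diff() rather than an explicit row loop. The column names amount and date and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical data; dates are day-first as in the question.
df = pd.DataFrame({
    "amount": [5000.00, 5000.00, 7000.00, 7000.00],
    "date": pd.to_datetime(
        ["12-02-2019", "13-02-2019", "01-03-2019", "20-03-2019"],
        format="%d-%m-%Y",
    ),
})

# Compare each row with the one before it.
same_amount = df["amount"].eq(df["amount"].shift())
day_gap = df["date"].diff().dt.days  # NaN for the first row

# True where the amount repeats and the gap is under 5 days.
df["matches_previous"] = same_amount & day_gap.lt(5)
```

Rows can then be kept or excluded by filtering on the matches_previous flag.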
How to calculate difference in DATE based on status of another column?
I have the following dataset. I want to create a new field named “duration” which will capture the time difference between SWO issued and rescinded for each BIN number. Note that each BIN number can show up multiple times based on Date and different Unit. So, each unit can issue SWO on the same BI…
Pandas DataFrame mean of data in columns occurring before certain date time
I have a dataframe with IDs of clients and their expenses for 2014-2018. What I want is to have the mean of the expenses per ID but only the years before a certain date can be taken into account when calculating the mean value (so column ‘Date’ dictates which columns can be taken into accou…
ValueError: Columns must be same length as key in pandas
I have the df below. I need to divide df[‘Cost’] / df[‘Reve’]. Below is my code. I got the error ValueError: Columns must be same length as key. I got the error ValueError: Wrong number of items passed 2, placement implies 1. Answer The problem is duplicated column names, verify: You can find this…
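The answer's own check is truncated, so this is only a sketch of how duplicated column names could be detected and worked around. The sample frame is hypothetical, and keeping the first occurrence of each name is an assumption, not necessarily the answer's fix.

```python
import pandas as pd

# Hypothetical frame in which the "Cost" column name appears twice.
df = pd.DataFrame([[10, 12, 5], [20, 22, 4]], columns=["Cost", "Cost", "Reve"])

# With a duplicated name, df["Cost"] is a two-column DataFrame, not a Series.
duplicated_names = df.columns[df.columns.duplicated()].tolist()

# One possible fix (an assumption): keep only the first occurrence of each name.
deduped = df.loc[:, ~df.columns.duplicated()]
deduped = deduped.assign(ratio=deduped["Cost"] / deduped["Reve"])
```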
Generate BarGraph from DataFrame
I have generated a Disease_Data dataframe that has 2 columns, Location and Data (see below). I wanted to generate a bar graph like below: However, when I tried the code below, things did not work and gave an error: KeyError: ‘Location’. Please help, thank you. Answer Just plot the dataframe
How can I get multiple dataframes returned from a class function?
I have made a class that takes in a ticker symbol and returns a dataframe with all the price information for the dates specified. Here is the code below: Now this works perfectly, but ideally I’d like to pass in a list of symbols and have it return a separate df for each symbol. So for example, symbols
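The asker's class is not shown, so the following is a sketch of one common pattern: return a dict that maps each symbol to its own dataframe. The class name PriceLoader, its methods, and the placeholder data are all hypothetical.

```python
import pandas as pd


class PriceLoader:
    """Hypothetical stand-in for the asker's class; the real one is not shown."""

    def __init__(self, symbols, start, end):
        self.symbols = list(symbols)
        self.dates = pd.date_range(start, end, freq="D")

    def _load_one(self, symbol):
        # Placeholder rows: a real implementation would fetch prices here.
        return pd.DataFrame(
            {"symbol": symbol, "close": float("nan")}, index=self.dates
        )

    def load(self):
        # One dataframe per symbol, keyed by the symbol itself.
        return {symbol: self._load_one(symbol) for symbol in self.symbols}


frames = PriceLoader(["AAA", "BBB"], "2020-01-01", "2020-01-03").load()
```

A dict keeps each symbol's dataframe addressable by name (for example frames["AAA"]), which tends to be easier to work with than returning a tuple of unnamed dataframes.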