Tag: dataframe

Too many indices for array | python

I’m trying to copy arrays to a pandas DataFrame and get the error “too many indices for array”. The error occurs in these lines: pr_daily and rad_daily are NumPy arrays of the same length. Traceback (most recent call last): File "C:\Users\…\Downloads\python_scripts\pv.py", line 277, in finalDataframe[f'pr_{a_id[index]}'] = (pr_daily[:,index]) IndexError: too many indices for array Answer: This error is thrown when
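
A minimal sketch of one common way this error arises, assuming pr_daily is one-dimensional (the question does not show how it is built). The data, the a_id values and the reshape are hypothetical and only illustrate the indexing rule:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real arrays are not shown in the question.
pr_daily = np.arange(6.0)   # 1-D array, shape (6,)
a_id = ["a", "b"]           # hypothetical column identifiers

try:
    pr_daily[:, 0]          # two indices on a 1-D array raises IndexError
except IndexError as exc:
    error_message = str(exc)

# Assumption: the values are meant to form one column per id.
# A 2-D array makes pr_daily_2d[:, index] a valid expression.
pr_daily_2d = pr_daily.reshape(-1, len(a_id))   # shape (3, 2)

final_df = pd.DataFrame()
for index, name in enumerate(a_id):
    final_df[f"pr_{name}"] = pr_daily_2d[:, index]
```

Checking pr_daily.ndim and pr_daily.shape before the assignment is usually the quickest way to confirm whether this is the cause.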

How to merge two rows of a pandas dataframe depending on a condition in Python?

dataframe pandas python

I have a dataframe: The rows at index 2 and 3 have the same product IDs, so they belong to the same order, and I am trying to combine them into a single row, to get: the final df being: I have tried using the df.groupby function: but it throws a datatype error. Answer: The output is:

How can I handle “Reindexing only valid with uniquely valued Index objects”

dataframe pandas python reindex unique

I have a dataframe of names (df): and a schedule dataframe (df1): I pivoted df1 (df1_pivoted) to put assignments in the df1 columns: I then try to add the names back in, but I cannot figure out how to deal with the “Reindexing only valid with uniquely valued Index objects” error. I presume it is because some names are

How to Drop rows in DataFrame by conditions on column values

dataframe pandas python

I wrote code to drop some rows according to a certain condition: but I got this error: AttributeError: 'NoneType' object has no attribute 'head' Answer: Assign df_clean to df2 without the inplace. Or leave the inplace without assigning; here df_clean will be your output.
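
A short sketch of why the error appears, using hypothetical data and a hypothetical condition (the original code is not shown). With inplace=True, DataFrame.drop modifies the frame and returns None, so assigning that return value and then calling .head() on it fails:

```python
import pandas as pd

# Hypothetical data and condition, for illustration only.
df2 = pd.DataFrame({"value": [1, 5, 10, 20]})
rows_to_drop = df2[df2["value"] > 8].index

# Option 1: no inplace; drop() returns a new frame that can be assigned.
df_clean = df2.drop(rows_to_drop)

# Option 2: inplace=True; the frame itself changes and the return value is None.
df_inplace = df2.copy()
returned = df_inplace.drop(rows_to_drop, inplace=True)
```

Here returned is None, which is what produces the AttributeError when .head() is called on it.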

Fastest way to use if/else statements when looping through dataframe with pandas

conditional-statements dataframe loops pandas python

I am trying to run conditional statements when iterating through pandas df rows, and it results in very slow code. For example: The df is only about 40k rows long and it’s very slow, as this is only one of the statements I am trying to incorporate with this loop. Can you help with a faster way to do
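
The question's loop is not shown, so the following is only a sketch of the usual approach: replace a row-by-row if/else with vectorized conditions. The column name, thresholds and labels are hypothetical, and numpy.select is one option among several (numpy.where and boolean .loc assignment are others):

```python
import numpy as np
import pandas as pd

# Hypothetical data; the real columns and conditions are not shown.
df = pd.DataFrame({"price": [5, 15, 25, 35]})

# Row-by-row version, similar in spirit to a loop with if/else.
labels = []
for _, row in df.iterrows():
    if row["price"] < 10:
        labels.append("low")
    elif row["price"] < 30:
        labels.append("mid")
    else:
        labels.append("high")
df["label_loop"] = labels

# Vectorized version: conditions are checked in order, first match wins.
conditions = [df["price"] < 10, df["price"] < 30]
choices = ["low", "mid"]
df["label_vec"] = np.select(conditions, choices, default="high")
```

Both columns hold the same labels; the vectorized form evaluates each condition once over the whole column instead of once per row.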

Split Pandas Dataframe With Equal Amount of Rows for each Column Value

dataframe machine-learning pandas python

This is for a machine learning project. I have a CSV file which I have read in as a pandas dataframe. The CSV looks like this: I have decreased the sample size and equalized the data, so that I have a dataframe with 60,000 rows: 30,000 rows each with label 1 and label 0. I now want to split the dataframe
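
The excerpt is cut off before the desired split is described, so this is a sketch under assumptions: a column named label, and an 80/20 split that keeps the same number of rows per label in each part. The data and the ratio are hypothetical:

```python
import pandas as pd

# Hypothetical balanced data: 5 rows per label.
df = pd.DataFrame({"feature": range(10), "label": [0, 1] * 5})

# Sample the same fraction from each label group, then take the rest.
train = df.groupby("label").sample(frac=0.8, random_state=0)
test = df.drop(train.index)

train_counts = train["label"].value_counts().to_dict()
test_counts = test["label"].value_counts().to_dict()
```

Because the sampling happens per group, each part stays balanced as long as the input is balanced.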

Python Dataframe find difference between datetime rows and convert to seconds

dataframe datetime python timedelta

I have a dataframe with a DateTime index. I want to find the difference between consecutive row datetimes and convert it into seconds. My code: Present output: How do I convert the timedif column into total seconds? Answer: Simply do: Example:
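
The answer's code is not included in the excerpt. One way to do this, sketched with hypothetical timestamps, is to take the row-to-row difference of the index and convert it with .dt.total_seconds(). Only the timedif column name comes from the question:

```python
import pandas as pd

# Hypothetical timestamps for illustration.
idx = pd.to_datetime(
    ["2021-01-01 00:00:00", "2021-01-01 00:00:30", "2021-01-01 00:02:00"]
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=idx)

# Difference between each row's timestamp and the previous one (first row is NaT).
df["timedif"] = df.index.to_series().diff()

# Convert the timedelta column to seconds as floats.
df["timedif_seconds"] = df["timedif"].dt.total_seconds()
```

The first row has no predecessor, so its difference is missing rather than zero.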

Change certain values in a dataframe column based on conditions on several columns

dataframe numpy pandas python where-clause

Let’s take this sample dataframe: I would like to replace the “B” values in Category with “B2” where there is a C or a D in Subcategory. I tried the following but I get the error “The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()”: I know that some similar questions are
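
The attempted code is not shown, but this error typically comes from combining Series with Python's and/or, or from using a Series in an if statement. A sketch with hypothetical rows, using the element-wise & operator and isin to build a boolean mask:

```python
import pandas as pd

# Hypothetical sample rows; column names are taken from the question.
df = pd.DataFrame(
    {
        "Category": ["A", "B", "B", "B"],
        "Subcategory": ["C", "C", "D", "E"],
    }
)

# Element-wise conditions: use & (not "and") and wrap each part in parentheses.
mask = (df["Category"] == "B") & (df["Subcategory"].isin(["C", "D"]))

# Assign only where the mask is True.
df.loc[mask, "Category"] = "B2"
```

Rows that do not match the mask keep their original Category value.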

Better (maybe more SQL-ish) way to populate pandas dataframe column from row and meta data than iterating over rows, please

dataframe pandas python

My data looks like this: Because I used a pandas.groupby() process to generate my metadata, it looks like this: Now, if my metadata looked like: I could easily write: I feel that there should be a different, pandas-oriented, way to directly use the metadata in the meta_df dataframe format that I have, and that it’ll probably be more efficient than

Mapping from a different dataframe

dataframe pandas python

I have a dataset of patients, e.g.: and a dataset of the diseases of each patient (by ICD code): How can I flag each patient who has a history of a specific ICD code? Desired output: I am currently doing it with iteration, but this takes too long… Answer: If you need indicators, meaning only 0 and 1 values, use get_dummies:
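
The answer's code is cut off after get_dummies, so this is only a sketch of how such indicator columns could be built. The frame names, column names and ICD codes below are hypothetical:

```python
import pandas as pd

# Hypothetical data; the real column names are not shown in the excerpt.
patients = pd.DataFrame({"patient_id": [1, 2, 3]})
diseases = pd.DataFrame(
    {"patient_id": [1, 1, 2], "icd": ["A01", "B02", "A01"]}
)

# One 0/1 column per ICD code, one row per disease record.
dummies = pd.get_dummies(diseases.set_index("patient_id")["icd"], dtype=int)

# Collapse to one row per patient: 1 if the code appears at least once.
flags = dummies.groupby(level=0).max()

# Patients with no disease records get 0 in every flag column.
flags = flags.reindex(patients["patient_id"], fill_value=0)

result = patients.join(flags, on="patient_id")
```

This avoids iterating over patients: the flags are computed once for all codes and then joined back on the patient identifier.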