Tag: pandas

Manually entered dataframes which were merged: possibility to sort by date

The two dataframes below, df1 and df2, were entered manually in Python and then merged into df3. How can I make sure that the merged dataframe df3 keeps the same descending (chronological) order as the initial dataframes df1 and df2, since this is not the case by default? Thanks in advance. PS this question is
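The excerpt does not show the data, so the following is only a minimal sketch of one way to restore descending date order after a merge. The column names `date`, `a` and `b` and the sample values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the manually entered frames (column names assumed).
df1 = pd.DataFrame({
    "date": pd.to_datetime(["2021-03-01", "2021-02-01", "2021-01-01"]),
    "a": [3, 2, 1],
})
df2 = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-03-01", "2021-02-01"]),
    "b": [10, 30, 20],
})

# Merge on the shared date column, then sort explicitly instead of
# relying on whatever row order the merge happens to produce.
df3 = (
    df1.merge(df2, on="date", how="outer")
       .sort_values("date", ascending=False)
       .reset_index(drop=True)
)
print(df3)
```

Sorting after the merge keeps the result independent of the `how` argument, which can otherwise change the row order.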

Merging pandas get_dummies back to categorical values

dataframe pandas python

I have a pandas dataframe that I one-hot encoded with get_dummies. The data previously had a ‘type’ column containing the values small_airport, large_airport, and medium_airport; I split that column into one column per airport type, with 1s and 0s marking where the type matched. After using get_dummies, it looks a bit like this: Basically I need now
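Since the encoded frame is not shown, here is a hedged sketch of one common way to collapse one-hot columns back into a single categorical column. The `type_` prefix and the `name` column are assumptions; the real column names depend on how get_dummies was called.

```python
import pandas as pd

# Hypothetical encoded frame; the "type_" prefix is an assumption.
encoded = pd.DataFrame({
    "name": ["A", "B", "C"],
    "type_small_airport": [1, 0, 0],
    "type_large_airport": [0, 1, 0],
    "type_medium_airport": [0, 0, 1],
})

dummy_cols = [c for c in encoded.columns if c.startswith("type_")]

# For each row, take the name of the column holding the 1, then strip the prefix.
encoded["type"] = encoded[dummy_cols].idxmax(axis=1).str[len("type_"):]
decoded = encoded.drop(columns=dummy_cols)
print(decoded)
```

This relies on each row having exactly one 1 among the dummy columns. Recent pandas versions (1.5 and later) also provide `pd.from_dummies`, which may be a cleaner fit.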

Multiply rows in pandas DataFrame depending on values from c

database pandas python rows

I would like to get from this:

nname  eemail             email2              email3              email4
Stan   stan@example.com   NO                  stan1@example.com   NO
Danny  danny@example.com  danny1@example.com  danny2@example.com  danny3@example.com
Elle   elle@example.com   NO                  NO                  NO

To this:

nname  eemail
Stan   stan@example.com
Stan   stan1@example.com
Danny  danny@example.com
Danny  danny1@example.com
Danny  danny2@example.com
Danny  danny3@example.com
Elle   elle@example.com

I know I can create 4 separate DFs with name and email column, then merge
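As an alternative to building four separate frames, the reshaping can be sketched with `DataFrame.melt`. This is one possible approach, not necessarily the accepted answer; the intermediate column name `address` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "nname": ["Stan", "Danny", "Elle"],
    "eemail": ["stan@example.com", "danny@example.com", "elle@example.com"],
    "email2": ["NO", "danny1@example.com", "NO"],
    "email3": ["stan1@example.com", "danny2@example.com", "NO"],
    "email4": ["NO", "danny3@example.com", "NO"],
})

long = (
    df.reset_index()                                   # keep the original row position
      .melt(id_vars=["index", "nname"], value_name="address")
      .query("address != 'NO'")                        # drop the placeholder entries
      .sort_values("index", kind="stable")             # regroup rows per person, keep column order
)
result = (
    long[["nname", "address"]]
    .rename(columns={"address": "eemail"})
    .reset_index(drop=True)
)
print(result)
```

The temporary name `address` is needed because `melt` rejects a `value_name` that matches an existing column.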

I cannot change the values of a column with specific condition

dataframe pandas python

The table looks like the following:

text      dummy1  days
op123ac   1       2000-01-01
op123ac   0       2000-01-04
op123ac   0       2000-01-07
op123ac   0       2000-01-10
op1248ab  0       2000-01-17
op1248ab  1       2000-01-20
op1248ab  1       2000-01-23
op1248ab  1       2000-01-26

Each unique “text” value is repeated four times, corresponding to four unique “days”. The “days” are consecutive for each “text”. The problem is that each “text” must have one

Add missing rows in pandas DataFrame

dataframe pandas pandas-groupby python

I have a DataFrame that looks like this: What I want to get is: In short, for each id, add the missing time rows with value 0. How do I do this? I wrote something with a loop, but it’s going to be prohibitively slow for my use case, which has several million rows. Answer: Here’s one way using groupby.apply
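The original data and the groupby.apply answer are not included in this excerpt. As a different, loop-free technique, the sketch below reindexes against the full id/time grid. The column names `id`, `time` and `value`, and the assumption that times are consecutive integers between the observed minimum and maximum, are illustrative only.

```python
import pandas as pd

# Hypothetical data: id 1 is missing time 1, id 2 is missing times 0 and 2.
df = pd.DataFrame({
    "id": [1, 1, 2],
    "time": [0, 2, 1],
    "value": [5, 7, 3],
})

# Build every (id, time) combination that should exist.
full_index = pd.MultiIndex.from_product(
    [df["id"].unique(), range(df["time"].min(), df["time"].max() + 1)],
    names=["id", "time"],
)

# Reindex onto the full grid; combinations that were absent get value 0.
filled = (
    df.set_index(["id", "time"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)
print(filled)
```

A single reindex avoids per-group Python calls, which tends to matter at the scale of several million rows.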

How to remove brackets from multi-value keys when converting to dataframes or extend values of a key without extraneous characters

pandas python

The above code handles a nested dictionary to dataframe conversion perfectly fine, but if the nested dictionary was created with the .append() or .extend() method, it adds extraneous brackets [] and quotes '', which makes downstream analysis difficult. For example, for a nested dictionary like this: created with the setup: And converted to a dataframe with pd.DataFrame.from_dict() creates a

How to calculate the average R square of the company data [closed]

dataframe pandas python

Closed. This question needs details or clarity. It is not currently accepting answers. Closed 10 months ago. STOCK RETURN: I don’t know how to compute the average R squared using individual stock returns and the market return. This is what my code looks

Pandas chart using plotly.graph_objects DOES NOT ALLOW FOR ‘c’ or ‘color’ ATTRIBUTE

pandas plotly python python-3.7

I am adding line charts to an OHLV plotly chart, but I cannot manage to color them. The above code fails when I use ‘color’ as a ‘figg.add_scatter’ attribute. The error given is: When I do not specify the color of the scatter lines, the program works fine. Below is a data sample: I would like to pick the

DataFrame has two features how to add a row to split them

dataframe pandas python

I have a DataFrame that contains a column called feature, which can hold more than one feature, as illustrated in rows 3 & 4 of the image below. How do I add a row to the DataFrame that splits the two features: so for row 3 as an example having: and row 4: so the idea is to add a

How do I split a Pandas DataFrame into sub-arrays (specific use case outlined in detail)?

dataframe pandas python split

I apologize for the title, but I don’t know enough to properly condense my question into a single line. Here is the use case: I have a pd.DataFrame with arbitrary index values and a column, ‘timestamp’. I have an ordered List of timestamp values. I want to split the DataFrame into chunks with ‘timestamp’ values that are: less than List[0]
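The excerpt is cut off after the first condition, so the sketch below assumes the remaining chunks are the intervals between consecutive boundaries, plus a final chunk at or above the last boundary. The sample data, the name `cuts`, and the use of plain integers in place of real timestamps are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with arbitrary index values and a 'timestamp' column.
df = pd.DataFrame(
    {"timestamp": [5, 12, 20, 27, 31, 8]},
    index=["a", "b", "c", "d", "e", "f"],
)
cuts = [10, 25]  # hypothetical ordered list of boundary values

# For each row, count how many boundaries are <= its timestamp.
# side="right" puts a timestamp equal to a boundary into the chunk after it,
# so chunk 0 holds exactly the rows with timestamp < cuts[0].
bin_ids = np.searchsorted(cuts, df["timestamp"].to_numpy(), side="right")

# One sub-frame per interval; original index values and row order are kept.
chunks = [df[bin_ids == i] for i in range(len(cuts) + 1)]
for chunk in chunks:
    print(chunk, end="\n\n")
```

Building the list by position (rather than with groupby) keeps empty intervals as empty frames, which may or may not be what the asker wants.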