I have a dataframe as shown below. I would like to select rows based on the following criteria. Criteria 1: pick all rows where source-system = I. Criteria 2: pick the prior row (n-1) only when the source-system of the (n-1)th row is O and diff is zero. Criteria 2 should be applied only when the nth row has sour…
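The two criteria above can be sketched with boolean masks and `shift`. The sample data is hypothetical, and because the excerpt is cut off, the sketch assumes the truncated condition is "the nth row has source-system I"; adjust the mask if the full question says otherwise.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question's wording.
df = pd.DataFrame({
    "source-system": ["O", "I", "O", "O", "I", "O"],
    "diff": [0, 5, 3, 1, 2, 0],
})

# Criteria 1: every row whose source-system is I.
is_i = df["source-system"].eq("I")

# Assumption: criteria 2 only applies when the following (nth) row is an I row.
next_is_i = is_i.shift(-1, fill_value=False)

# Criteria 2: the prior row is kept when it is an O row with diff equal to zero.
prior_ok = df["source-system"].eq("O") & df["diff"].eq(0) & next_is_i

selected = df[is_i | prior_ok]
print(selected)
```

If the rows belong to separate groups (for example one block per customer), the `shift` would probably need to run inside a `groupby` so that it does not look across group boundaries.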
Tag: pandas
How to divide revenue between check_in_date and check_out_date, and turn those dates into a single column named date
I have an example of my dataset like this: and I want to turn it into something like this: The check_out date is not included in the range, so the first period is 2 days (27 and 28) with 50 revenue each. Answer: Another method to solve this is to first get the difference between the out and in dates
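A minimal sketch of the idea mentioned in the answer (take the difference between the two dates first), not the answer's own code. The sample rows, including the month and year, are hypothetical; only the column names and the "2 days at 50 each" case come from the excerpt.

```python
import pandas as pd

# Hypothetical bookings; the check-out date is not part of the stay.
stays = pd.DataFrame({
    "check_in_date": pd.to_datetime(["2021-01-27", "2021-02-01"]),
    "check_out_date": pd.to_datetime(["2021-01-29", "2021-02-04"]),
    "revenue": [100.0, 90.0],
})

# Number of nights per booking (the difference between out and in dates).
nights = (stays["check_out_date"] - stays["check_in_date"]).dt.days

# Spread the revenue evenly, then repeat each booking once per night.
per_night = stays.assign(revenue=stays["revenue"] / nights)
daily = per_night.loc[per_night.index.repeat(nights)]

# Offset each repeated row by 0, 1, 2, ... days from its check-in date.
day_offset = daily.groupby(level=0).cumcount().to_numpy()
daily = daily.assign(
    date=daily["check_in_date"] + pd.to_timedelta(day_offset, unit="D")
)
daily = daily[["date", "revenue"]].reset_index(drop=True)
print(daily)
```

Repeating the index keeps everything vectorised; building one `pd.date_range` per row and calling `explode` would be an alternative with the same result.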
Replace with Python regex in pandas column
There are many values starting with ‘^f’ and ending with ‘^’ in a pandas column, and I need to replace them as shown below: Answer: You don’t mention what you’ve tried already, nor what the rest of your DataFrame looks like, but here is a minimal example: Output
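As an illustration only, a replacement of this kind might look like the sketch below. The desired output is not visible in the excerpt, so the sketch assumes the goal is to strip the `^f` prefix and the trailing `^` while keeping the text in between; the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical column values.
s = pd.Series(["^fapple^", "plain", "x ^fpear^ y"])

# "^" is a regex metacharacter, so it has to be escaped as "\^".
# The lazy group (.*?) captures the text between "^f" and the next "^".
cleaned = s.str.replace(r"\^f(.*?)\^", r"\1", regex=True)
print(cleaned.tolist())
```

Passing `regex=True` explicitly avoids depending on the default, which has changed between pandas versions.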
Match everything except a complex regex pattern and replace it in Pandas
I have a complex regex pattern to match mixed dates in a CSV column loaded into a pandas DataFrame. I would like to replace everything except the regex pattern match with “”. I have tried pretty much all the negation cases (^, ?! and others), but I keep replacing the regex match itself with “” (empty string).…
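One way to sidestep the negation problem is to keep the matches rather than delete the non-matches. The sketch below assumes that is acceptable; the date pattern and the sample strings are hypothetical stand-ins for the question's own regex and data.

```python
import pandas as pd

# Hypothetical, simplified pattern for two date layouts.
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4}"

s = pd.Series(["paid on 2021-03-04 late", "due 5/6/2020!", "no date here"])

# findall collects every match per row; joining the matches drops the rest.
kept = s.str.findall(DATE_PATTERN).str.join(" ")
print(kept.tolist())
```

Rows without any match end up as empty strings, which matches the stated goal of replacing everything else with “”.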
How do I display the top & bottom 5 row values along with custom row values in Matplotlib?
I have a dataframe with about 750 rows, which is obviously excessive for a traditional bar chart. What I would like to do is have it display the top 5 entries, the specific entries that I’m looking for, and the bottom 5 entries. The dataframe is something like this: What I want to be able t…
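The row selection described above can be sketched in pandas before anything is plotted. The column names `name` and `score` and the list of custom entries are hypothetical, since the original dataframe is not shown.

```python
import pandas as pd

# Hypothetical data: 20 named entries with a numeric score.
df = pd.DataFrame({
    "name": [f"item{i}" for i in range(20)],
    "score": range(20),
})
custom = ["item7", "item12"]  # the specific rows of interest (assumed)

top = df.nlargest(5, "score")
bottom = df.nsmallest(5, "score")
picked = df[df["name"].isin(custom)]

# Combine, drop rows that appear twice, and order for plotting.
selected = (
    pd.concat([top, picked, bottom])
    .drop_duplicates("name")
    .sort_values("score", ascending=False)
)
print(selected)
```

The reduced frame can then be handed to Matplotlib, for instance via `selected.plot.bar(x="name", y="score")`.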
Add an object one level higher in pandas JSON
I’m new to JSON and pandas and want to output my data in the following schema, but I’m not sure how to add the leading ‘results’. My dataframe: My code: My JSON output: The JSON schema that I want: Answer: Try this:
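The answer's code is not included in the excerpt, so the following is only a sketch of one possible approach: build the records first, then wrap them in a dictionary with a `results` key. The dataframe contents are hypothetical.

```python
import json

import pandas as pd

# Hypothetical dataframe.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Round-trip through to_json so every value is a plain JSON type.
records = json.loads(df.to_json(orient="records"))

# Wrap the list of records one level higher, under "results".
payload = {"results": records}
text = json.dumps(payload, indent=2)
print(text)
```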
How to export pandas dataframe with Multi-index columns to Excel with column name in one level unmerged and column name in another level merged?
I have a pandas dataframe df which looks as follows: The columns are a MultiIndex consisting of 3 levels. The first level has Germany as the country, the second level has some indicators, and the third level has years. The dataframe also contains some data. I’d like to export this dataframe to Excel such that …
Python with pandas to parse dates like “0001-11-29 13:00:00 BC”
I am trying to read some SQL data using the pandas library, and one of the columns, “customer_date”, has values like “0001-11-29 13:00:00 BC”. My query fails with the error ValueError: year 0 is out of range. Please suggest a way to parse such dates/timestamps. Here is my code. Error: Answer: This is…
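The answer text is cut off, so this is not its solution. As a hedged sketch, assuming the column can be read as plain text (for example by casting it to text in the SQL query), the values can be split into components instead of being parsed into timestamps, since Python's `datetime` cannot represent BC years. The sample values are hypothetical apart from the one quoted in the question.

```python
import pandas as pd

# Assumed: the column arrives as strings rather than datetimes.
raw = pd.Series(["0001-11-29 13:00:00 BC", "2020-05-17 08:30:00"])

pattern = (
    r"^(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2})(?: (?P<era>BC|AD))?$"
)
parts = raw.str.extract(pattern)

# Rows without an era suffix are treated as AD (an assumption).
parts["era"] = parts["era"].fillna("AD")

# Astronomical year numbering: 1 BC is year 0, 2 BC is year -1, and so on.
year = parts["year"].astype(int)
parts["astro_year"] = year.where(parts["era"] != "BC", 1 - year)
print(parts)
```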
How do I reverse a cumulative count from a specific point based on a condition and then resume the count in a pandas data frame?
I am trying to count the number of days between dates cumulatively (grouped by a column denoted as id); however, I want to reset the counter whenever a condition is satisfied. At the same time, I want to create a new column and add the values to that column for those particular rows. Additionally, I want to…
Count the total number of multiple distinct occurrences in the same data frame
Suppose we have the data frame df. I know that to count the number of 'B' values I have to use (df == 'B').sum().sum(). Now suppose that I want to count how many elements of the list v = ['B', 'C'] there are in the data frame. What could be a way of doing this…
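One possible way, sketched here with a hypothetical data frame, is to swap the equality test for `DataFrame.isin`, which checks each cell against the whole list at once.

```python
import pandas as pd

# Hypothetical data frame.
df = pd.DataFrame({
    "x": ["A", "B", "C"],
    "y": ["B", "B", "D"],
})
v = ["B", "C"]

# isin gives a boolean frame; summing twice counts the True cells.
total = int(df.isin(v).sum().sum())
print(total)
```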