I have the dataframe below: I want to create a new column based on these conditions: I have many different conditions in the original df, but I’m trying to find a way just to add these conditions. I’m working with Python in Jupyter. Answer Since the logic here is pretty complicated, I would suggest putting your conditions inside a function,
Tag: pandas
How can I use cumsum skipping the first entry?
I have a DF that contains the ids of several creators of certain projects and the outcomes of their projects over time. Each project can either be a success (outcome = 1) or a failure (outcome=0). The DF looks like this: I’m looking for a way to create two new columns: previous projects and previous successes. The first should be
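A minimal sketch of one way to get both columns, assuming "previous" means strictly before the current row and that rows are already in chronological order within each creator. The column names `creator_id` and `outcome` and the sample data are hypothetical, since the original frame is not shown.

```python
import pandas as pd

# Hypothetical data: one row per project, ordered in time per creator.
df = pd.DataFrame({
    "creator_id": [1, 1, 1, 2, 2],
    "outcome":    [1, 0, 1, 0, 1],
})

grouped = df.groupby("creator_id")["outcome"]

# Number of earlier projects by the same creator (0 for the first one).
df["previous_projects"] = grouped.cumcount()

# Running total of successes minus the current row, so the current
# project's own outcome is not counted.
df["previous_successes"] = grouped.cumsum() - df["outcome"]

print(df)
```

Subtracting the current outcome from the grouped `cumsum` is what skips the current entry; shifting within each group and filling the first value with 0 would be an alternative.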
Python DataFrame Filtering and Sorting at the Same Time
Hi, I have a data frame with the following columns: ‘founded’ and ‘company name’. What I’m trying to do is filter for founded > 0 and then sort by company name, ascending. I’m looking for code similar to this: But I got this error: and at the moment I have this code: Is there any way that I
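A short sketch of filtering and sorting in one chained expression. The column names come from the question; the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "founded": [1999, 0, 2010, 1985],
    "company name": ["Zeta", "Alpha", "Beta", "Gamma"],
})

# Boolean mask keeps rows with founded > 0; the filtered frame is then sorted.
result = df[df["founded"] > 0].sort_values("company name", ascending=True)

print(result)
```

The filter returns a new DataFrame, so `sort_values` can be called on it directly without an intermediate variable.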
Showing the proportions of values across each column in a DataFrame in Python
I have created the following DataFrame: Now I wish to show the proportion of each value (0,1,2) across each column. Ideally I’d like to represent this as a stacked bar chart – Column names on the x axis (so 8 bars in total from A to H), with the different colours on the bars representing the proportion of each value
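One possible way to compute the per-column proportions is to apply `value_counts(normalize=True)` to each column. The frame below is a small hypothetical stand-in (three columns instead of A to H), since the original data is not shown.

```python
import pandas as pd

# Hypothetical data: each column holds values from {0, 1, 2}.
df = pd.DataFrame({
    "A": [0, 1, 2, 1],
    "B": [0, 0, 0, 2],
    "C": [1, 1, 1, 1],
})

# Rows of `props` are the distinct values, columns are the original columns.
# Values that never occur in a column come back as NaN, so fill them with 0.
props = (
    df.apply(lambda col: col.value_counts(normalize=True))
      .fillna(0)
      .sort_index()
)

print(props)
```

With matplotlib installed, `props.T.plot(kind="bar", stacked=True)` should then draw one stacked bar per column, with one colour per value.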
Trying to pass user input in usecols Pandas
I’m trying to ask the user which columns they want read into the dataframe from a CSV file. I’ve been trying the following: But even this gives an error. Any suggestions? I think I’m not able to understand the lambda function. The error I’m getting is: pandas.errors.EmptyDataError: No columns to parse from file Answer You can pass the
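A minimal sketch of turning a user's answer into a `usecols` list, which may or may not match the truncated answer above. The CSV text and the hard-coded `user_input` string are stand-ins so the example runs without a file or a prompt; in practice the string would come from `input()`.

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory instead of a file on disk.
csv_text = "name,age,city\nAnn,30,Oslo\nBob,25,Rome\n"

# Stand-in for: user_input = input("Columns to read (comma-separated): ")
user_input = "name, city"

# Split on commas and strip whitespace so "name, city" -> ["name", "city"].
wanted = [c.strip() for c in user_input.split(",") if c.strip()]

df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

print(df)
```

`usecols` also accepts a callable that receives each column name and returns True or False, but a plain list is usually enough when the names are already known.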
Pandas. How to sort a DataFrame without changing index?
Output: df2.sort_values(['B', 'A'], ascending=[False, True]) gives: The index column is now shuffled into a new order, but I want it to stay the same even after sorting. The ignore_index parameter just resets the index to 0 through n-1. And the sort_index function isn’t helpful either, because the index may not be in lexicographical order. Answer Use the DataFrame constructor: Output: Create new dataframe with
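A sketch of the constructor approach the answer appears to describe: sort the values, then rebuild the frame with the original index. The contents of `df2` are hypothetical because the original frame is not shown.

```python
import pandas as pd

# Hypothetical frame with a non-sequential, unsorted index.
df2 = pd.DataFrame({"A": [3, 1, 2], "B": [1, 2, 2]}, index=[10, 5, 7])

sorted_values = df2.sort_values(["B", "A"], ascending=[False, True])

# Rebuild from the sorted values, reattaching the original index order.
result = pd.DataFrame(
    sorted_values.to_numpy(),
    index=df2.index,
    columns=df2.columns,
)

print(result)
```

Note that `to_numpy()` can upcast columns to a common dtype when the frame mixes types; assigning `sorted_values.index = df2.index` is an alternative that keeps per-column dtypes.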
Sort dataframe by multiple columns while ignoring case
I want to sort a dataframe by multiple columns like this: However, I found out that Python sorts the uppercase values first and then the lowercase ones. I tried this: but I get this error: If I could, I would turn all columns to lowercase, but I want to keep them as they are. Any hints? Answer If you check the docs – DataFrame.sort_values for
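One way to do this, assuming pandas 1.1 or later, is the `key` argument of `sort_values`, which is applied to each sort column separately. The column names and data below are hypothetical, and the sketch assumes every sort column holds strings.

```python
import pandas as pd

# Hypothetical data with mixed-case strings in both sort columns.
df = pd.DataFrame({
    "name": ["bob", "Alice", "alice", "Bob"],
    "city": ["x", "b", "A", "Y"],
})

# `key` receives each sort column as a Series; lowercasing is used only
# for comparison, so the stored values keep their original case.
result = df.sort_values(by=["name", "city"], key=lambda col: col.str.lower())

print(result)
```

If some sort columns are not strings, the key function would need to check the dtype before calling `.str.lower()`.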
Getting same result for different CSV files
DESCRIPTION: I have a piece of Python code that takes a CSV file as input and produces a .player file as output. I have four different CSV files, so after running the code four times (once per CSV file), I have four .player files. REPOSITORY: https://github.com/divkrsh/gridlab-d DATA: The data in the CSV files are put through this
What’s a pythonic way (native function in pandas) to count occurrences of a certain value within cases (SPSS COUNT equivalent)?
I need to count occurrences of a certain value (let’s assume it’s 3) in a range of columns for each case. To do so I wrote a script as below: First print is: Second: Even though it works fine, I am pretty sure there is a more pythonic way to do so. By ‘pythonic’ I mean using native, concise pandas
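A concise sketch of a row-wise count using an element-wise comparison followed by a sum. The column names `q1` to `q3` and the data are hypothetical; the value 3 comes from the question.

```python
import pandas as pd

# Hypothetical survey-style data: one row per case.
df = pd.DataFrame({
    "q1": [3, 1, 3],
    "q2": [3, 2, 3],
    "q3": [0, 2, 3],
})

cols = ["q1", "q2", "q3"]

# eq(3) gives a boolean frame; summing across axis=1 counts the Trues per row.
df["count_3"] = df[cols].eq(3).sum(axis=1)

print(df)
```

To count any of several values instead of one, `df[cols].isin([...])` can replace `eq(3)` in the same expression.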
Occurrence of a value in many lists
I have a Series object in pandas with 2 columns, one for the indices and one with lists. I need to find whether a value occurs in only one of these lists and return it in the most efficient way. As an example, let’s say we have this: I need to return 77 because it occurs in only one of
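A possible approach, sketched under the assumption that "occurs in only one list" means the value appears in exactly one of the lists, however many times it repeats inside that list. The sample Series is hypothetical apart from the value 77 mentioned in the question.

```python
import pandas as pd

# Hypothetical Series of lists; 77 appears in only the first list.
s = pd.Series([[1, 2, 77], [1, 2], [2, 1, 1]])

# Deduplicate inside each list so repeats within one list count once,
# then explode to one row per (list, value) pair.
exploded = s.map(lambda xs: list(set(xs))).explode()

# A value seen exactly once after exploding belongs to exactly one list.
counts = exploded.value_counts()
only_once = sorted(counts[counts == 1].index.tolist())

print(only_once)
```

Whether this is the fastest option depends on the data size; for very large inputs, a plain `collections.Counter` over the deduplicated lists may be worth comparing.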