Tag: pandas

Create a new column based on different columns

I have the dataframe below: I want to create a new column based on these conditions: I have many different conditions in the original df, but I’m trying to find a way just to add these conditions. I’m working with Python in Jupyter. Answer: Since the logic here is pretty complicated, I would suggest putting your conditions inside a function,
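The answer's suggestion of wrapping the conditions in a function could look roughly like the sketch below. The original dataframe and its conditions are not shown in the excerpt, so the columns `a` and `b`, the labels, and the rules inside `classify` are all made-up placeholders.

```python
import pandas as pd

# Hypothetical data: the question's real dataframe is not shown.
df = pd.DataFrame({"a": [1, 5, 3], "b": [2, 2, 3]})


def classify(row):
    """Return a label for one row; these rules are placeholders."""
    if row["a"] > row["b"] and row["a"] > 4:
        return "high"
    if row["a"] < row["b"]:
        return "low"
    return "other"


# axis=1 passes each row to the function as a Series.
df["label"] = df.apply(classify, axis=1)
print(df)
```

Adding another condition then only means adding another `if` branch to the function. For large frames a vectorized approach is usually faster, but a row-wise function tends to be easier to read when the rules are complicated.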

How can I use cumsum skipping the first entry?

cumsum pandas python

I have a DF that contains the ids of several creators of certain projects and the outcomes of their projects over time. Each project can either be a success (outcome = 1) or a failure (outcome = 0). The DF looks like this: I’m looking for a way to create two new columns: previous projects and previous successes. The first should be
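One possible way to get both columns is sketched below. It assumes the rows are already in chronological order within each creator; the column names `creator_id`, `previous_projects` and `previous_successes` and the sample data are illustrative, not taken from the question.

```python
import pandas as pd

# Hypothetical sample: rows are assumed to be in chronological order.
df = pd.DataFrame({
    "creator_id": [1, 1, 1, 2, 2],
    "outcome":    [1, 0, 1, 0, 1],
})

grouped = df.groupby("creator_id")["outcome"]

# cumcount numbers the rows of each group from 0, which equals
# the number of earlier projects by the same creator.
df["previous_projects"] = grouped.cumcount()

# The running total includes the current row, so subtracting the
# current outcome leaves only the successes that came before it.
df["previous_successes"] = grouped.cumsum() - df["outcome"]
print(df)
```

Subtracting the current value is what makes the cumulative sum "skip" the current entry, so each row only sees earlier projects.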

Python DataFrame Filtering and Sorting at the Same Time

dataframe pandas python

Hi, I have a data frame with the following columns: ‘founded’ and ‘company name’. What I’m trying to do is filter for founded > 0 and then sort by company name, ascending. I’m looking for code similar to this: But I got this error: and at the moment I have this code: Is there any way that I
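Filtering and sorting can be chained in one expression, as in this minimal sketch. The sample rows are invented; only the column names ‘founded’ and ‘company name’ come from the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "company name": ["Zeta", "Alpha", "Mango"],
    "founded": [1999, 0, 2010],
})

# Boolean indexing keeps rows with founded > 0; sort_values then
# orders the remaining rows by company name, ascending.
result = df[df["founded"] > 0].sort_values("company name", ascending=True)
print(result)
```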

Showing the proportions of values across each column in a DataFrame in Python

pandas plot python

I have created the following DataFrame: Now I wish to show the proportion of each value (0,1,2) across each column. Ideally I’d like to represent this as a stacked bar chart – Column names on the x axis (so 8 bars in total from A to H), with the different colours on the bars representing the proportion of each value
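A rough sketch of computing the per-column proportions follows. The question's actual DataFrame is not shown, so the three-column frame here is a stand-in.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame (values 0, 1, 2 per column).
df = pd.DataFrame({
    "A": [0, 1, 2, 2],
    "B": [0, 0, 0, 1],
    "C": [1, 1, 2, 2],
})

# value_counts(normalize=True) gives the share of each value in one
# column; building a DataFrame from the dict aligns the values as rows.
proportions = pd.DataFrame(
    {col: df[col].value_counts(normalize=True) for col in df.columns}
).fillna(0).sort_index()
print(proportions)
```

With matplotlib installed, `proportions.T.plot(kind="bar", stacked=True)` should draw one stacked bar per column, with one colour per value.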

Trying to pass user input in usecols Pandas

dataframe pandas python

I’m trying to ask the user which columns they want to read into the dataframe from a CSV file. I’ve been trying the following: But even this gives an error. Any suggestions? I think I don’t understand the lambda function. The error I’m getting is: pandas.errors.EmptyDataError: No columns to parse from file Answer: You can pass the
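One way to feed user-chosen columns to `usecols` is to split the typed text into a list of names, as sketched here. To stay self-contained, the sketch reads from an in-memory CSV and uses a fixed string in place of `input()`; the CSV content and column names are invented, and this is not necessarily the approach the truncated answer goes on to describe.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = "name,age,city\nAnn,30,Oslo\nBob,25,Rome\n"

# In an interactive session this string would come from input(...).
user_input = "name, city"

# Split on commas and strip whitespace to get a clean list of names.
columns = [c.strip() for c in user_input.split(",") if c.strip()]

# usecols accepts a list of column labels to keep.
df = pd.read_csv(io.StringIO(csv_text), usecols=columns)
print(df)
```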

Pandas. How to sort a DataFrame without changing index?

pandas python

Output: df2.sort_values(['B', 'A'], ascending=[False, True]) gives: The index column is now shuffled into a new order, but I want it to stay the same even after sorting. The ignore_index parameter just resets the index to 0 through n-1. And the sort_index function isn’t helpful either, because the index may not be in lexicographical order. Answer: Use the DataFrame constructor: Output: Create new dataframe with
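The constructor approach mentioned in the answer might be sketched as follows. The contents of `df2` are not shown in the excerpt, so the values and the index here are invented.

```python
import pandas as pd

# Hypothetical df2 with a non-sequential index.
df2 = pd.DataFrame({"A": [3, 1, 2], "B": [1, 2, 2]}, index=[10, 5, 7])

sorted_rows = df2.sort_values(["B", "A"], ascending=[False, True])

# Rebuild the frame from the sorted values but reuse the original
# index, so only the row contents move. Note: to_numpy() on columns
# of mixed dtypes yields an object array, so dtypes may need restoring.
result = pd.DataFrame(
    sorted_rows.to_numpy(), index=df2.index, columns=df2.columns
)
print(result)
```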

Sort dataframe by multiple columns while ignoring case

case-insensitive dataframe pandas python sorting

I want to sort a dataframe by multiple columns like this: However, I found out that Python sorts the uppercase values first and then the lowercase ones. I tried this: but I get this error: If I could, I would turn all columns to lowercase, but I want to keep them as they are. Any hints? Answer: If you check the docs – DataFrame.sort_values for
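A case-insensitive multi-column sort can be sketched with the `key` parameter of `DataFrame.sort_values` (available in pandas 1.1 and later). The column names and data are illustrative, and the sketch assumes every sort column holds strings.

```python
import pandas as pd

# Hypothetical data with mixed-case strings.
df = pd.DataFrame({
    "first": ["bob", "Alice", "alice", "Bob"],
    "last":  ["Young", "adams", "Brown", "abel"],
})

# key is applied to each sort column separately; lowercasing there
# affects only the comparison, not the stored values.
result = df.sort_values(["first", "last"], key=lambda col: col.str.lower())
print(result)
```

The original capitalisation is preserved in the result, because the lowercased values are used only to determine the order.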

Getting same result for different CSV files

csv numpy pandas python python-3.x

DESCRIPTION: I have a piece of Python code that takes a CSV file as input and produces a .player file as output. I have four different CSV files; hence, after running the code four times (taking each CSV file one by one), I have four .player files. REPOSITORY: https://github.com/divkrsh/gridlab-d DATA: The data in the CSV files are put through this

What’s a pythonic way (native function in pandas) to count occurrences of a certain value within cases (SPSS COUNT equivalent)?

pandas python python-3.x spss

I need to count occurrences of a certain value (let’s assume it’s 3) in a range of columns for each case. To do so, I wrote the script below: First print is: Second: Even though it works fine, I am pretty sure there is a more pythonic way to do so. By ‘pythonic’ I mean using native, concise pandas
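A concise way to count a value across columns for each row is to compare and then sum the booleans, as in this sketch. The column names `q1` to `q3`, the result column `count_3` and the data are invented, since the question's script is not shown.

```python
import pandas as pd

# Hypothetical survey-style data; each row is one case.
df = pd.DataFrame({
    "q1": [3, 1, 3],
    "q2": [3, 2, 0],
    "q3": [1, 3, 3],
})
cols = ["q1", "q2", "q3"]

# eq(3) marks matching cells as True; summing along axis=1 counts
# the matches in each row.
df["count_3"] = df[cols].eq(3).sum(axis=1)
print(df)
```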