Tag: pandas

pandas multiple conditions based on multiple columns

conditional-statements dataframe numpy pandas python

I am trying to color the points of a pandas DataFrame depending on two conditions. Example: I have tried many different approaches, and everything I found online depended on only one condition. My example code always raises the error: Here’s the code. I tried several variations without success. By the way, I understand what the error says, but not how to handle
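The excerpt above omits both the code and the error message, so the following is only a minimal sketch of one common way to derive a color from two conditions at once. The column names x and y, the thresholds, and the color names are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the DataFrame from the original question is not shown.
df = pd.DataFrame({"x": [1, 5, 8, 3], "y": [10, 2, 9, 1]})

# Combine element-wise conditions with & (and) or | (or), not the
# Python keywords. Each condition needs its own parentheses because
# & binds more tightly than the comparison operators.
both = (df["x"] > 4) & (df["y"] > 5)

# np.where picks one color per row from the combined boolean mask.
df["color"] = np.where(both, "red", "blue")
```

For more than two outcomes, np.select takes a list of conditions and a matching list of choices in the same way.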

How to get rid of “Unnamed: 0” column in a pandas DataFrame read in from CSV file?

csv dataframe pandas python

Sometimes, when I read a CSV file into df, I get an unwanted index-like column named Unnamed: 0. file.csv The CSV is read with this: This is very annoying! Does anyone have an idea on how to get rid of this? Answer It’s the index column; pass index=False, as in df.to_csv(…, index=False), to not write out an unnamed index column
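A small sketch of where the column comes from and two ways to avoid it. The DataFrame contents are made up, and in-memory strings stand in for the CSV file so that the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical data standing in for the contents of the CSV file.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# By default to_csv writes the index as a first column with an empty header.
with_index = df.to_csv()
# With index=False only the data columns are written.
without_index = df.to_csv(index=False)

# Reading the first CSV back produces the "Unnamed: 0" column.
reread = pd.read_csv(io.StringIO(with_index))

# Reading the second CSV back gives only the original columns.
clean = pd.read_csv(io.StringIO(without_index))

# If the file already contains the extra column, index_col=0 tells
# read_csv to use it as the index instead of as a data column.
fixed = pd.read_csv(io.StringIO(with_index), index_col=0)
```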

How can I write comments on some cells of an Excel sheet using pandas

pandas python

I didn’t find anything that enables me to write comments on specific cells while writing an Excel sheet using pandas’ DataFrame.to_excel. Any help is appreciated. Answer After searching for some time, I think the best way to handle comments or other such properties, like the color and size of text at cell or sheet level, is to use XlsxWriter with pandas.

Call column in dataframe by column index instead of column name – pandas

dataframe pandas python

How can I refer to a column in my code by its index in the DataFrame instead of its name? For example, I have a DataFrame df with columns a, b, c. Instead of calling df['a'], can I call it using its column index, like df[1]? Answer You can use iloc: Example:
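The example from the original answer is not included in the excerpt, so here is a minimal sketch of positional column selection with iloc. The data values are made up. Note that positions are zero-based, so column a is at position 0 rather than 1.

```python
import pandas as pd

# Hypothetical DataFrame with the columns a, b, c from the question.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# iloc[:, n] selects all rows of the column at zero-based position n.
first = df.iloc[:, 0]   # same data as df["a"]
second = df.iloc[:, 1]  # same data as df["b"]

# A list of positions selects several columns and returns a DataFrame.
subset = df.iloc[:, [0, 2]]

# df.columns maps a position back to the column name.
name = df.columns[1]
```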

Replace values in one dataframe with values in second dataframe in Python

dataframe pandas python python-3.x

I have a large dataframe (DF1) that contains a variable containing UK postcode data. Inevitably there are some typos in the data. However, after some work with regular expressions, I have created a second database that contains corrected versions of the postcode data (but only for those rows where the original postcode was incorrect) – DF2. (N.B. the index values
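The question is cut off before any answer, so the following is only one possible approach, sketched under the assumption that DF2 shares index labels and the column name with DF1. The column name postcode and the postcode values are hypothetical.

```python
import pandas as pd

# Hypothetical data: row 1 of df1 holds a mistyped postcode.
df1 = pd.DataFrame({"postcode": ["AB1 2CD", "XX99", "EF3 4GH"]}, index=[0, 1, 2])

# df2 holds corrections only for the rows that were wrong,
# keyed by the same index labels as df1 (an assumption).
df2 = pd.DataFrame({"postcode": ["XY9 9ZZ"]}, index=[1])

# update aligns on index and column labels and overwrites df1 in place
# with the non-missing values from df2; other rows are left untouched.
df1.update(df2)
```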

Pandas row to json

json pandas python

I have a dataframe in pandas and my goal is to write each row of the dataframe as a new json file. I’m a bit stuck right now. My intuition was to iterate over the rows of the dataframe (using df.iterrows) and use json.dumps to dump the file but to no avail. Any thoughts? Answer Pandas DataFrames have a to_json
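The answer is truncated in the excerpt, so this is only a sketch of one way to combine row iteration with to_json. The column names, the file naming scheme, and the temporary output directory are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data; the DataFrame from the original question is not shown.
df = pd.DataFrame({"name": ["ann", "bob"], "age": [30, 41]})

# Hypothetical output location: a fresh temporary directory.
out_dir = Path(tempfile.mkdtemp())

# iterrows yields (index label, row) pairs; each row is a Series,
# which has its own to_json method returning a JSON string.
for idx, row in df.iterrows():
    (out_dir / f"row_{idx}.json").write_text(row.to_json())

# Read one file back to check its contents.
first = json.loads((out_dir / "row_0.json").read_text())
```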

How to plot multiple linear regressions in the same figure

matplotlib pandas plot python seaborn

Given the following: This will create 2 separate plots. How can I add the data from df2 onto the SAME graph? All the seaborn examples I have found online seem to focus on how you can create adjacent graphs (say, via the ‘hue’ and ‘col_wrap’ options). Also, I prefer not to use the dataset examples where an additional column might

In Pandas, what’s the equivalent of ‘nrows’ from read_csv() to be used in read_excel()?

pandas python

I want to import only a certain range of data from an Excel spreadsheet (.xlsm format, as it has macros) into a pandas DataFrame. I was doing it this way: But it seems that nrows works only with read_csv()? What would be the equivalent for read_excel()? Answer If you know the number of rows in your Excel sheet, you can use the

How to use sklearn fit_transform with pandas and return dataframe instead of numpy array?

numpy pandas python scikit-learn

I want to apply scaling (using StandardScaler() from sklearn.preprocessing) to a pandas DataFrame. The following code returns a numpy array, so I lose all the column names and indices. This is not what I want. A “solution” I found online is: It appears to work, but leads to a DeprecationWarning: /usr/lib/python3.5/site-packages/sklearn/preprocessing/data.py:583: DeprecationWarning: Passing 1d arrays as data is deprecated in
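Neither the asker's code nor an answer survives in the excerpt. One common pattern, sketched here with made-up data, is to scale the whole DataFrame at once and wrap the resulting array in a new DataFrame that reuses the original column names and index.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical data with a non-default index, to show that it is preserved.
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]},
    index=["x", "y", "z"],
)

scaler = StandardScaler()

# fit_transform returns a 2D numpy array by default; rebuilding a
# DataFrame from it restores the column names and index labels.
scaled = pd.DataFrame(
    scaler.fit_transform(df),
    columns=df.columns,
    index=df.index,
)
```

Passing the whole two-dimensional DataFrame, rather than one column at a time as a 1d array, also avoids the kind of warning quoted above.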

Python fast DataFrame concatenation

pandas python

I wrote some code that concatenates parts of a DataFrame onto the same DataFrame, so as to normalize the occurrence of rows according to a certain column, and it is unbelievably slow. Is there a way to concatenate DataFrames quickly without creating copies? Answer There are a couple of things that stand out. To begin with, the loop is
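The original code and the rest of the answer are not shown. As a sketch of a common remedy for slow concatenation in a loop, the pieces can be collected in a list and passed to pd.concat once, so the growing DataFrame is not copied on every iteration. The data, the column names, and the reading of "normalize" as oversampling smaller groups up to the largest group are all assumptions.

```python
import pandas as pd

# Hypothetical data: label "a" occurs three times, label "b" once.
df = pd.DataFrame({"label": ["a", "a", "a", "b"], "value": [1, 2, 3, 4]})

# Size of the largest group, used as the target count for every label.
target = df["label"].value_counts().max()

# Collect the extra rows in a plain list instead of concatenating in the loop.
pieces = [df]
for label, group in df.groupby("label"):
    missing = target - len(group)
    if missing > 0:
        # Sample with replacement so that small groups can be repeated.
        pieces.append(group.sample(n=missing, replace=True, random_state=0))

# A single concat call builds the final DataFrame in one pass.
balanced = pd.concat(pieces, ignore_index=True)
```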