Tag: pandas

Can Pandas output inferred schema for a CSV file?

csv data-science data-wrangling pandas python

Is there a method I can use to output the inferred schema of a large CSV using pandas? In addition, is there any way to have it report, along with each type, whether the column is nullable/blank based on the CSV? The file is about 500k rows with 250 columns. With my new job, I’m constantly being handed CSV files with zero format docum…
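The answer to this question is not part of the excerpt. As a rough sketch of one possible approach, the inferred dtypes and a per-column null check can be collected into a small summary table. The helper name `infer_schema` and the sample CSV text are made up for illustration, and the types shown are whatever `pd.read_csv` infers by default.

```python
import io

import pandas as pd


def infer_schema(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column's inferred dtype and whether it has missing values.

    Hypothetical helper, not part of pandas.
    """
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),   # dtype inferred by read_csv
            "nullable": df.isna().any(),      # True if any value is missing
            "null_count": df.isna().sum(),    # number of missing values
        }
    )


# Made-up sample data standing in for the real file.
csv_text = "id,name,score\n1,a,1.5\n2,,2.0\n3,c,\n"
df = pd.read_csv(io.StringIO(csv_text))

schema = infer_schema(df)
print(schema)
```

For a file of the size described, the same helper could be run on the full frame, or on a sample read with the `nrows` argument of `pd.read_csv` if memory is a concern; a sample may miss nulls that appear later in the file.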

sum of row in the same columns in pandas

dataframe pandas python

I have a dataframe something like this. How do I get the sum of values within the same column as a new column in the dataframe? For example, I want a new column with the sum of d1[i] + d1[i+1]. I know .sum() in pandas, but I can’t use it to sum within the same column. Answer Your question is not fully…
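The original data and answer are not shown in this excerpt. One way to read the request (each row plus the row below it) can be sketched with `Series.shift`; the column values and the new column name `d1_pair_sum` are assumptions for illustration.

```python
import pandas as pd

# Made-up data; only the column name d1 comes from the question.
df = pd.DataFrame({"d1": [1, 2, 3, 4]})

# shift(-1) moves every value up one row, so row i lines up with d1[i+1].
# The last row has no successor, so its sum is NaN.
df["d1_pair_sum"] = df["d1"] + df["d1"].shift(-1)

print(df)
```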

How to create a function which Iterates over multiple lists

for-loop loops pandas python python-3.x

So I’m creating a series of column mappings. I can do this manually like so: The function produces a mapping of a value and its column. Great, now I want to make this more general. Currently, if I need to map 2 columns, for example, I run the following: This works as well, but is not ideal if I have a lot…

Changing values of one column based on the other three columns in pandas dataframe

pandas python

I have the following Pandas dataframe, where I want to change the value of the ‘fmc’ column based on the ‘time’, ‘samples’ and ‘uid’ columns. The concept is as follows: For the same date, if df.samples == ‘C’ & df.uid == ‘Plot1’, then corresponding…

Two DataFrames, find index of second one where values of two columns match up from first

dataframe indexing pandas python

I have two pandas DataFrames as pictured. DF1: DF2 (192 x 7): I want to find the index value of DF2 where df1[0] & df1[1] match df2[0] & df2[2]. For more detail, this would be represented above as starting at index 3188 of DF2. DF1 values will be dynamically changing as DF2 stays constant. Edit: Just …

Filter out dataframe based on values being within the 90th percentile

pandas python

Suppose I have this dataframe. Now I want to go through each column and filter out the low percentiles, keeping only values that are contained in the 90th percentile. Thus, since apple and bob are each within their associated 90th percentiles, I would have this dataframe. How do I achieve this? Answer Hope this he…

iterating over folders executing a function at each 2 folders

pandas pathlib python python-3.x

I have a function called plot_ih_il that receives two data frames in order to generate a plot. I also have a set of folders that each contain a .h5 file with the data I need to give to the function plot_ih_il… I’m trying to feed the function two datasets at a time but unsuccessfully. I’ve be…

Importing multiple excel files with similar name, pivoting each excel file and then appending the results into a single file

dataframe numpy pandas pivot python

My problem statement is as above. Below is my progress so far: 1. I want to extract multiple Excel files from the same location, namely Test1, Test2, Test3… (I am using glob to do this) (DONE) 2. I want to iterate through the folder and find files starting with a string (DONE) 3. I then formed an empty dataframe…

get string from list if it’s contained in another string column

pandas python

I’ve a simple column of strings, and a list of strings. I need to create another column in which every row contains the string from the list that appears in string_col; if a row contains two or more strings from the list, then I’d like to have more rows. The result should be something like…
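The expected output is cut off in this excerpt, so the following is only a sketch of one plausible reading: find every list entry that occurs in `string_col`, then give each match its own row with `DataFrame.explode`. The sample strings, the list `words`, and the column name `match` are assumptions.

```python
import re

import pandas as pd

# Made-up data; only the column name string_col comes from the question.
df = pd.DataFrame(
    {"string_col": ["red apple", "green pear and apple", "blue sky"]}
)
words = ["apple", "pear"]

# Build one alternation pattern, escaping each word so it is matched literally.
pattern = "|".join(re.escape(w) for w in words)

# findall returns a list of matches per row; explode turns each list
# element into its own row (rows with no match get NaN).
df["match"] = df["string_col"].str.findall(pattern)
out = df.explode("match").reset_index(drop=True)

print(out)
```

Rows without any match can be dropped afterwards with `out.dropna(subset=["match"])` if they are not wanted.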

Title words in a column except certain words

pandas python python-3.x

How could I title-case all words except the ones in the list keep? Expected Output: I tried Answer Here is one way of doing it with str.replace, passing a replacement function:
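The answer's own code is not included in this excerpt. A minimal sketch of the idea it describes, with a made-up `keep` list and sample Series, might look like this. The helper name `title_unless_kept` is hypothetical.

```python
import pandas as pd

# Made-up inputs; only the list name keep comes from the question.
keep = ["and", "of"]
s = pd.Series(["lord of the rings", "war and peace"])


def title_unless_kept(match):
    """Capitalize a matched word unless it appears in the keep list."""
    word = match.group(0)
    return word if word in keep else word.capitalize()


# A callable replacement requires regex=True; it is called once per match.
result = s.str.replace(r"\w+", title_unless_kept, regex=True)

print(result.tolist())
```

Note that `str.capitalize` also lowercases the rest of each word, which may or may not be wanted for inputs containing acronyms.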