I have a Pandas dataframe and a function that pulls entries from the dataframe. If the requested entry is not present in the dataframe—whether because the requested column does not exist, because the requested row/index does not exist, or both—I would like to return the string ‘entry not found’ in…
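One way to approach the lookup described above is to check both labels before indexing. This is a minimal sketch, not the asker's code: the function name `get_entry`, its parameters, and the sample frame are hypothetical, and it assumes the index and column labels are unique.

```python
import pandas as pd


def get_entry(df, row, col, default="entry not found"):
    """Return the value at (row, col), or `default` if either label is missing."""
    # Check both axes explicitly so a missing row, a missing column,
    # or both all lead to the same fallback string.
    if row in df.index and col in df.columns:
        return df.at[row, col]
    return default


# Hypothetical sample data for illustration only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

found = get_entry(df, "x", "b")        # existing row and column
missing_row = get_entry(df, "z", "a")  # row label not in the index
missing_col = get_entry(df, "x", "c")  # column label not in the columns
```

Checking membership up front avoids relying on which exception a particular indexer raises.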
Tag: pandas
PANDAS & glob – Excel file format cannot be determined, you must specify an engine manually
I am not sure why I get this error, because sometimes my code works fine: "Excel file format cannot be determined, you must specify an engine manually." Here is my code, in steps: 1. The list of customer ID columns: 2. The code that finds all xlsx files in a folder and reads them: I added the engine
Python: extract a number between two special characters in a dataframe
I am trying to extract the number between the $ and the following whitespace in a column, then use that number to create a new column. I have looked at many regular-expression solutions on Stack Overflow, but they are hard to understand and my code doesn’t work. Are there any other solutions besides regex? If not, how can I fix my code…
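The extraction described above could be sketched with `Series.str.extract`. The column names `text` and `amount` and the sample rows are assumptions, since the excerpt does not show the original data.

```python
import pandas as pd

# Hypothetical sample data; the real column is not shown in the excerpt.
df = pd.DataFrame({"text": ["cost $25 today", "was $7.50 before", "no price"]})

# Capture everything after a literal "$" up to the next whitespace,
# then convert to a number; anything non-numeric becomes NaN.
captured = df["text"].str.extract(r"\$(\S+)", expand=False)
df["amount"] = pd.to_numeric(captured, errors="coerce")
```

Rows without a `$` end up as NaN in the new column rather than raising an error.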
Replace text between two DataFrames in Pandas
I’m trying to replace/ablate terms within DataFrame if they appear within another DataFrame. For example, below is the replace DataFrame that includes an ablate column and a replace column. I’m looking to replace any word that appears within the ablate row with whatever is in the replace row. For …
How to convert rows to columns in a Pandas groupby?
I have a table containing price data for a set of products over 6 months. Each product has a unique id (sku_id) and can be from size 6-12. We measured the price each day, and generated a table similar to the example below. Source indicates what website the price was on (can be 1-4). Now, I want to perform som…
Set individual wedge hatching for pandas pie chart
I am trying to make pie charts where some of the wedges have hatching and some of them don’t, based on their content. The data consists of questions and yes/no/in progress answers, as shown below in the MWE. However, instead of greenyellow and gold I am trying to make the wedges green with yellow hatchi…
Misunderstanding of global variable in Python
I would like to calculate the variable “a” by using a function and the global variable “df”. My problem is that after running the function, “df” is also altered. I just want to calculate “a” with the function, but I want “df” to stay as it is. act…
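The excerpt is cut off before the code, so the actual cause is not visible. One common cause of this behaviour is a function that modifies the DataFrame it receives in place. A minimal sketch under that assumption follows; `compute_a` and the column `x` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3]})


def compute_a(frame):
    """Compute a value from `frame` without modifying the caller's object."""
    # Work on a copy so assignments below do not touch the original.
    local = frame.copy()
    local["x"] = local["x"] * 2
    return local["x"].sum()


a = compute_a(df)
```

Without the `.copy()` call, the assignment inside the function would change the same object that the global name `df` refers to.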
How to get a date_range and insert it as a ‘list’ into a new column in a dataframe?
I have a dataframe with 50k+ rows. df.head(5) is below: I need to create one more column holding a list of the months spanned between the start and finish dates, so that I can use df.explode on this column and get, for every ID with months_used > 1, a new row with the date of each month the work was in progress. My primitive way …
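A possible sketch of the month-list column, using `pd.period_range` so that partial first and last months are both included. The column names `ID`, `start`, `finish`, and `months` and the sample rows are assumptions, because the excerpt does not show the real frame.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({
    "ID": [1, 2],
    "start": pd.to_datetime(["2021-01-15", "2021-03-01"]),
    "finish": pd.to_datetime(["2021-03-10", "2021-03-20"]),
})

# For each row, list the first day of every calendar month
# touched by the start..finish interval.
df["months"] = [
    list(pd.period_range(s, f, freq="M").to_timestamp())
    for s, f in zip(df["start"], df["finish"])
]

# One row per (ID, month) pair.
exploded = df.explode("months", ignore_index=True)
```

A list comprehension is used here for clarity; on 50k+ rows it may be slow, and whether that matters depends on the data.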
Conditionally format cells in each column based on columns in another dataframe
I have a dataframe that contains threshold values for 20+ elements, formatted like so:

df1:
        Li   Se   Be
Upper   30   40   10
Lower   10    5    1

I have another dataframe which contains values for those elements:

df2:
            Li      Se    Be
Sample 1    50.8    100   20
Sample 2    -0.01   2     -1

If the values in df2 are greater than the
Python Pandas Mixed Type Warning – “dtype” preserves data?
I have this code that gives this warning: I have searched both Google and Stack Overflow, and people seem to give two kinds of solutions: (1) low_memory=False and (2) converters. The problem with #1 is that it merely silences the warning but does not solve the underlying problem (correct me if I am wrong). Problem with #2 i…