Hello everyone, I’m trying to develop a GUI to modify and run computations on Pandas DataFrames with the PyQt5 module. I can already display my DataFrame and choose whether a specific column is editable. It’s displayed in a QTableWidget. I tried to implement a QItemDelegate with a QDoubleValidator so that only specific numbers can be entered in some columns. This is my function: I can
Tag: dataframe
How to split data in a column into some separate columns in Python?
So, I have a data frame given below: I want to have the results in the original dataframe, with each single-line list such as [107.625764, -6.910353], [107.625871, -6.910358] split into separate values: 107.625764, -6.910353. The details of the expected results are in the picture below. Expected Results All I know is that we can apply the str.split method by specifying any specific
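A minimal sketch of one way to do the split, assuming the column already holds Python lists of two floats. The column names `coords`, `lon`, and `lat` are hypothetical, since the original data frame is not shown. If the values are stored as strings instead, they would need to be parsed first (for example with `ast.literal_eval`).

```python
import pandas as pd

# Hypothetical data: the column name "coords" is an assumption.
df = pd.DataFrame({
    "coords": [[107.625764, -6.910353], [107.625871, -6.910358]],
})

# Turn each two-element list into a row of a new frame, aligned on the
# original index, then attach the new columns to the original dataframe.
split = pd.DataFrame(df["coords"].tolist(), index=df.index, columns=["lon", "lat"])
df = df.join(split)

print(df[["lon", "lat"]])
```

Building the new frame from `tolist()` avoids string handling entirely, which is why it is used here rather than `str.split`.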
PANDAS & glob – Excel file format cannot be determined, you must specify an engine manually
I am not sure why I am getting this error, although sometimes my code works fine: Excel file format cannot be determined, you must specify an engine manually. Below is my code, with steps: 1. The list of customer ID columns: 2. The code to find all xlsx files in a folder and read them: I added the engine
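The excerpt does not show the cause, so the following is only a sketch under an assumption. One common reason this error appears intermittently is that Excel writes a temporary lock file prefixed with `~$` next to any workbook that is currently open. A `*.xlsx` glob matches that file too, and it is not a valid workbook. The function names below are hypothetical.

```python
import glob
import os

import pandas as pd


def list_real_workbooks(folder):
    """Return the .xlsx paths in `folder`, skipping Excel's '~$' lock files."""
    paths = glob.glob(os.path.join(folder, "*.xlsx"))
    return sorted(p for p in paths if not os.path.basename(p).startswith("~$"))


def read_all(folder, usecols=None):
    """Read every real workbook in `folder` into one dataframe.

    Raises ValueError (from pd.concat) if the folder has no workbooks.
    """
    frames = [
        pd.read_excel(path, engine="openpyxl", usecols=usecols)
        for path in list_real_workbooks(folder)
    ]
    return pd.concat(frames, ignore_index=True)
```

If the error persists after filtering, the files themselves may not be real `.xlsx` workbooks (for instance, a renamed `.xls` or CSV), which this sketch does not handle.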
Python extract number between two special character in dataframe
I am trying to extract the number between the $ and the following white space in a column, then use that number to create a new column. I have looked at many regular-expression solutions on Stack Overflow, but they are hard to understand and my code doesn’t work. Are there any solutions besides regex? If not, how can I fix my code? Answer: Escape the $:
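As an illustration of escaping the `$`, here is a small sketch using `Series.str.extract`. The column names `text` and `amount` and the sample strings are hypothetical, since the original data is not shown.

```python
import pandas as pd

# Hypothetical data: column names and values are assumptions.
df = pd.DataFrame({"text": ["price $25 today", "was $7.50 before", "no price"]})

# \$ matches a literal dollar sign (unescaped, $ means "end of string").
# The capture group takes the digits, and optional decimals, that follow it.
df["amount"] = (
    df["text"].str.extract(r"\$(\d+(?:\.\d+)?)", expand=False).astype(float)
)

print(df)
```

Rows without a dollar amount end up as NaN in the new column rather than raising an error.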
How to convert rows to columns in a Pandas groupby?
I have a table containing price data for a set of products over 6 months. Each product has a unique id (sku_id) and can be from size 6-12. We measured the price each day, and generated a table similar to the example below. Source indicates what website the price was on (can be 1-4). Now, I want to perform some
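The excerpt is cut off before the goal is stated, so this is only a sketch of one common way to turn row values into columns: `pivot_table`. The names `sku_id`, `source`, and `price` follow the description. The `date` column name and all sample values are assumptions.

```python
import pandas as pd

# Hypothetical sample: the "date" column name and all values are assumptions.
df = pd.DataFrame({
    "sku_id": [1, 1, 1, 2, 2],
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-01", "2021-01-01"],
    "source": [1, 2, 1, 1, 3],
    "price": [10.0, 12.0, 11.0, 20.0, 22.0],
})

# One row per (sku_id, date), one column per source; each cell holds the
# mean price for that combination, or NaN where no price was recorded.
wide = df.pivot_table(
    index=["sku_id", "date"], columns="source", values="price", aggfunc="mean"
)

print(wide)
```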
Python Pandas Mixed Type Warning – “dtype” preserves data?
I have this code that gives this warning: I have searched across both Google and Stack Overflow, and people seem to give two kinds of solutions: (1) low_memory = False and (2) converters. The problem with #1 is that it merely silences the warning but does not solve the underlying problem (correct me if I am wrong). The problem with #2 is that converters might do things we
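To make the `dtype` option from the title concrete, here is a small sketch showing that reading a column as `str` keeps its text exactly as written in the file, including leading zeros. The CSV content and column names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV: column names and values are assumptions.
raw = "code,qty\n007,1\n0A3,2\n"

# Declaring the dtype up front means pandas does not have to infer a type
# for the column, and reading as str keeps the text as it appears in the file.
df = pd.read_csv(io.StringIO(raw), dtype={"code": str, "qty": "int64"})

print(df["code"].tolist())
```

Without the `dtype` argument, a column containing only values like `007` would be inferred as integers and the leading zeros would be lost.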
How can I turn off rounding in Spark?
I have a dataframe and I’m doing this: I want to get just the first four digits after the decimal point, without rounding. When I cast to DecimalType with .cast(DataTypes.createDecimalType(20,4)), or even use the round function, the number is rounded to 0.4220. The only way I have found to avoid rounding is the format_number() function, but this function gives me a string,
Pyspark get top two values in column from a group based on ordering
I am trying to get the first two counts that appear in this list, by the earliest log_date they appeared. In this case my expected output is: This is what I have working but there are a few edge cases where count could go down and then back up, shown in the example above. This code returns 2021-07-11 as the
Pandas groupby and count across multiple columns
I have data ordered by ID, Year, and then a series of event flags indicating whether a thing did or did not happen for that ID in that year:
ID  Year  x  y  z
1   2015  0  1  0
1   2016  1  1  0
1   2017  0  1  1
2   2015  1  0  1
2   2016  1  1  0
2
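The excerpt is cut off before the desired output, so the following is a sketch of one plausible reading: counting, per ID, the number of years in which each event happened. It uses only the five complete rows visible in the excerpt.

```python
import pandas as pd

# The five complete rows visible in the excerpt.
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2],
    "Year": [2015, 2016, 2017, 2015, 2016],
    "x":    [0, 1, 0, 1, 1],
    "y":    [1, 1, 1, 0, 1],
    "z":    [0, 0, 1, 1, 0],
})

# The flags are 0/1, so summing them per ID counts the years in which
# each event happened for that ID.
counts = df.groupby("ID")[["x", "y", "z"]].sum()

print(counts)
```

If the flags could hold values other than 0 and 1, comparing with `df[["x", "y", "z"]].eq(1)` before grouping would be a safer way to count.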
Expand Pandas Dataframes adding rows by different ranges
I have a dataframe like this:
SEG  FAM  GAMA  MIN_RAT  MAX_RAT  VALOR
PE   001  002   1        2        5,15
PE   001  002   2,1      3        2,55
And I need to “expand” the df adding new rows to make a new dataframe like this:
SEG  FAM  GAMA  MIN_RAT  MAX_RAT  VALOR
PE   001  002   1        1        10,30
PE   001  002   1,1      1,1      9,79
PE