I have a data frame as shown below. In the first row I need to compare SPEC_TYP with max. In the 2nd row I need to compare SPEC_MAX with max. In the 3rd row I need to compare SPEC_TYP with max, and in some other cases I need to compare SPEC_MIN with min, SPEC_MAX with max, and so on. I searched on SO and Google
Tag: pandas
Pandas DataFrame and grouping Pandas Series data into individual columns by value
I am hoping someone can help me optimize the following Python/Pandas code. My code works, but I know there must be a cleaner and faster way to perform the operation under consideration. I am looking for an optimized strategy because my use case will involve 16 unique ADC Types, as opposed to 4 in the example …
How to combine two entry values in a column
In the dataset, the column “Erf Size” has entries like 1 733 and 1 539. Note that the dtype of this “Erf Size” column is object. I would like to join the parts of each entry, so that 1 733 and 1 539 become 1733 and 1539, and so on. Original dataset: Expected output: Answer: I think you can fix this with pd.to_numeric. T…
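The answer excerpt is cut off, so the following is only a minimal sketch of one way the pd.to_numeric idea could be applied, not the answerer's actual code. The sample values are made up for illustration; it assumes the digit groups are separated by whitespace (regular or non-breaking spaces) and that stripping that whitespace leaves a valid number.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings with a space
# as the thousands separator, so the column dtype is object.
df = pd.DataFrame({"Erf Size": ["1 733", "1 539", "820"]})

# Remove every whitespace character, then convert to a numeric dtype.
# errors="coerce" turns anything that still is not a number into NaN.
cleaned = df["Erf Size"].str.replace(r"\s+", "", regex=True)
df["Erf Size"] = pd.to_numeric(cleaned, errors="coerce")

print(df)
```

Using errors="coerce" is a design choice for messy data: bad entries become NaN instead of raising, which also changes the column to a float dtype if any such entry exists.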
Sort DataFrame based on part of its index
What I would like to achieve: I have a DataFrame whose indices are “ID (int) + underscore (_) + name (str)”. I would like to sort the data based on the ID. What I tested: I tried to use sort_index and failed. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_index.html…
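One possible approach, sketched here under assumptions: sort_index accepts a key callable (pandas 1.1 or later) that receives the index and returns the values to sort by. The index labels and the helper name id_key below are hypothetical; the sketch assumes every label starts with an integer ID followed by an underscore.

```python
import pandas as pd

# Hypothetical data with index labels of the form "<ID>_<name>".
df = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=["10_carol", "2_bob", "1_alice"],
)


def id_key(index):
    """Return the integer ID part of each label as the sort key."""
    # Split once on the first underscore and keep the leading piece.
    return index.str.split("_", n=1).str[0].astype(int)


# A plain sort_index() would sort the labels as strings ("10_..." before "2_...");
# the key makes the comparison numeric instead.
sorted_df = df.sort_index(key=id_key)
print(sorted_df)
```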
Formula that calculates year real variation in pandas?
I have a dataframe with this info: I need to find a formula that calculates, for each of the 4 months of 2023, the real variation of column A against the same months of 2022. For example, in the case of 2023-04, the calculation is x = 140 (value of 2022-04) * 1,66 (accumulated inflation from 2022-04 to 2023-0…
Cannot seem to pass pandas DataFrame into feature_engine.selection.DropHighPSIFeatures fit method correctly
I could not get the code to calculate PSI values to work, and I am not very familiar with the feature_engine library or with ML-related operations in general. The code I am currently trying to run is: The error message returned is: The dataframe print statement in the previous code snippet is: So I assumed I don’…
Pandas Styler.to_latex() – how to pass commands and do simple editing
How do I pass the following commands into the LaTeX environment: \centering (I need landscape tables to be centered) and \caption* (I need to skip the table numbering for a panel)? In addition, I need to add parentheses and asterisks to the t-statistics, meaning row-specific formatting on the dataframes. F…
Expand selected keys in a json pandas column
I have this sample dataset: And I want to ‘expand’ (or ‘explode’) each value in the json column, but only selecting some columns. This is the expected result: First, I tried using json_normalize and iterating over each row (even when the last row has no data), but I have to know before …
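The sample dataset is not shown in this excerpt, so the following is only a rough sketch with made-up data. It assumes the json column holds either JSON strings or already-parsed dicts, that some rows may be empty, and that the keys to keep are known in advance; the column name payload, the list wanted, and the helper pick are hypothetical.

```python
import json

import pandas as pd

# Hypothetical data: the last row has no JSON at all.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "payload": [
            '{"a": 1, "b": "x", "c": true}',
            '{"a": 2, "b": "y", "c": false}',
            None,
        ],
    }
)

wanted = ["a", "b"]  # keys to expand into columns (assumed known up front)


def pick(raw, keys):
    """Parse one cell and keep only the requested keys."""
    if isinstance(raw, str):
        data = json.loads(raw)
    elif isinstance(raw, dict):
        data = raw
    else:
        data = {}  # missing or empty cell
    return {k: data.get(k) for k in keys}


# Build one row of selected keys per original row, then attach it.
expanded = pd.DataFrame([pick(raw, wanted) for raw in df["payload"]], index=df.index)
result = df.drop(columns="payload").join(expanded)
print(result)
```

Rows without data end up with missing values in the expanded columns rather than being dropped.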
Python SQLite3: how to get records that match a list of values in a column, then place them into a pandas df
I am not experienced with SQL or SQLite3. I have a list of ids from another table. I want to use the list as a key in my query and get all records based on the list. I want the SQL query to feed directly into a DataFrame. I am getting a DatabaseError: Execution failed on sql ‘SELECT * FROM
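The failing query is truncated above, so the cause of the DatabaseError cannot be read from it. As a general sketch of the technique being asked about, one common pattern is to build one ? placeholder per id and pass the list through params. The table name records, its columns, and the sample rows below are all hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)
conn.commit()

ids = [1, 3, 4]  # list of ids obtained elsewhere

# One "?" per id; the values themselves are bound by the driver,
# never formatted into the SQL string.
placeholders = ", ".join("?" for _ in ids)
query = f"SELECT * FROM records WHERE id IN ({placeholders}) ORDER BY id"

df = pd.read_sql_query(query, conn, params=ids)
conn.close()
print(df)
```

Only the placeholder string is interpolated into the SQL text, which keeps the query safe from quoting problems. Note that SQLite limits the number of bound parameters per statement, so a very long id list may need to be processed in chunks.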
python: custom pandas.DataFrame to dictionary function: some entries are lost
I want to read a .xlsx file, do some things with the data and convert it to a dict to save it in a .json file. To do that I use Python3 and pandas. This is the code: I add here the source of the .xlsx file (Spanish government). Select “Fichero con todas las provincias”. You have to delete the