Background: I have a complex nested JSON object, which I am trying to unpack into a pandas DataFrame in a very specific way. JSON object: this is an extract containing randomized data, which shows examples of the hierarchy (including children) for one family (i.e. ‘Falconer Family’); however, there are hundreds of them in total and this extract just
Tag: pandas
Add features to the “numeric” dataset whose categorical value must be mapped using a conversion formula
I have this dataset: This is the request: “Add the Mjob and Fjob attributes to the “numeric” dataset whose categorical value must be mapped using a conversion formula of your choice.” Does anyone know how to do it? For example: if the ‘at_home’ value becomes ‘1’ in Mjob, I want the same result in the Fjob column. Same categorical values must
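One possible reading of this request, sketched under assumptions: build a single category-to-integer mapping from the values of both columns and apply it to each, so that a given category (for example ‘at_home’) gets the same code in Mjob and Fjob. The function name `encode_jobs`, the sample data, and the choice of alphabetical codes starting at 1 are all hypothetical; any consistent formula would satisfy the request as quoted.

```python
import pandas as pd

def encode_jobs(df, columns=("Mjob", "Fjob")):
    """Map the categories of several columns to integers with one shared mapping.

    Hypothetical helper: codes are assigned alphabetically, starting at 1.
    Assumes the columns contain no missing values.
    """
    columns = list(columns)
    # Collect every category that appears in any of the columns.
    categories = sorted(set(df[columns].to_numpy().ravel()))
    mapping = {category: code for code, category in enumerate(categories, start=1)}
    # Apply the same mapping to each column.
    encoded = df[columns].apply(lambda col: col.map(mapping))
    return encoded, mapping

# Made-up sample data, for illustration only.
students = pd.DataFrame({
    "Mjob": ["at_home", "teacher", "other"],
    "Fjob": ["other", "at_home", "services"],
})
encoded, mapping = encode_jobs(students)
```

The encoded columns can then be attached to the numeric dataset, for instance with `numeric[["Mjob", "Fjob"]] = encoded`, assuming both frames share the same index.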
df.to_dict make duplicated index (pandas) as primary key in a nested dict
I have this data frame, which I’d like to convert to a dict in Python. I have many other categories, but I show just two for simplicity. I want the output to be like this: Answer: You can do this without assigning an additional column or aggregating using list: I created a separate function for readability – you could, of course,
Replacing NaN values in timeseries Pandas dataframe with mean values
I have a dataframe that has 2 columns, date and values. I want to replace NaN values in the dataframe with mean values, but with a specific condition: NaN values should be replaced with the mean of the values from the same period (+/- 1 day) in the year that has that value. Value for 2021-02-04 should be: Because dates “2022-02-03”,
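The excerpt is cut off before the full rule is given, so the following is only a sketch of one plausible interpretation: for each missing date, look at the same calendar day in every other year, take the known values within one day of it, and use their mean. The function name `fill_from_other_years`, the column names `date` and `value`, and the sample numbers are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def fill_from_other_years(df, date_col="date", value_col="value", window_days=1):
    """Fill NaNs with the mean of known values near the same day in other years.

    Hypothetical helper; assumes the frame has a unique index.
    """
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    # Only originally known values are used, so fills never feed other fills.
    known = out.dropna(subset=[value_col])
    years = out[date_col].dt.year.unique()
    window = pd.Timedelta(days=window_days)

    for idx in out.index[out[value_col].isna()]:
        day = out.at[idx, date_col]
        neighbours = []
        for year in years:
            if year == day.year:
                continue
            try:
                center = day.replace(year=int(year))
            except ValueError:
                # 29 February has no counterpart in a non-leap year.
                continue
            in_window = (known[date_col] >= center - window) & (
                known[date_col] <= center + window
            )
            neighbours.extend(known.loc[in_window, value_col].tolist())
        if neighbours:
            out.at[idx, value_col] = float(np.mean(neighbours))
    return out

# Made-up sample data, for illustration only.
df_example = pd.DataFrame({
    "date": ["2021-02-04", "2022-02-03", "2022-02-04", "2022-02-05", "2022-03-01"],
    "value": [np.nan, 10.0, 20.0, 30.0, 100.0],
})
filled = fill_from_other_years(df_example)
```

Rows with no neighbouring values in any other year are left as NaN. The explicit loop favours readability over speed, so a large frame may call for a vectorized approach.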
Most efficient way to check cells and change neighbors matching a condition in a dataframe
I’m using a pandas dataframe to store a dynamic 2D game map for a roguelike-style game map editor. The player can draw and erase rooms. I need to draw walls around these changing rooms. I have this: And need this: What is the most efficient way to do this? So far I followed the approach outlined here, but this
Find a substring in cells across multiple columns in a Pandas dataframe
I have a large DataFrame with 50+ columns, which I’m simplifying here: I’m trying to find a) whether there are any instances of ‘—>’ in any of the cells across the DataFrame, and b) if so, where (optional). So far I’ve tried two approaches. This only works for strings, not substrings. I get: (I believe this may only work for
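A minimal sketch of one way to answer both parts, assuming every cell can be meaningfully viewed as a string: build a boolean mask with `Series.str.contains(..., regex=False)` per column, then read off the positions of the `True` cells. The helper name `find_substring` and the sample frame are hypothetical, and the arrow is written as `-->` here as an assumption; substitute the exact string that appears in the data.

```python
import numpy as np
import pandas as pd

def find_substring(df, needle):
    """Return (row label, column label) pairs for cells containing `needle`.

    Hypothetical helper. Cells are converted with astype(str), so NaN
    becomes the text 'nan' and would match a needle such as 'na'.
    """
    mask = df.apply(
        lambda col: col.astype(str).str.contains(needle, regex=False)
    )
    rows, cols = np.where(mask.to_numpy())
    return [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]

# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    "a": ["x --> y", "z"],
    "b": [1, "p-->q"],
})
hits = find_substring(sample, "-->")
found_any = bool(hits)
```

Passing `regex=False` treats the needle as literal text, which avoids surprises when it contains characters that are special in regular expressions.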
converting a key value text file into a CSV file
I have a text file that needs to be converted into a CSV file using pandas. A piece of it is presented in the following: Rows are cod,10, and cod,18 and the columns are 1, 2, 3,…, 15. Any idea? Regards, Ali Answer: I use pandas to deal with the conversion, but vanilla Python to deal with some of aspects of
create an int column after dividing a column by a number in pandas
Assume that I have a pandas DataFrame with a column that holds seconds. I want to create a new column that holds minutes, so I divide the sec column by 60. The problem is that the min column is not an integer anymore. How can I make it an integer? I have this code: I tried this,
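A short sketch of two common options, assuming the columns are named `sec` and `min` as in the question and that whole minutes (rounded down) are wanted. The sample values are made up.

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame({"sec": [30, 60, 125, 3600]})

# Option 1: floor division keeps an integer dtype for an integer column.
df["min"] = df["sec"] // 60

# Option 2: true division yields floats, so cast back explicitly.
# astype(int) truncates toward zero and fails if the column contains NaN.
df["min_cast"] = (df["sec"] / 60).astype(int)
```

If the seconds column can contain missing values, casting the floor-divided result to the nullable `"Int64"` dtype is one way to keep integers alongside `<NA>`.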
create new column on conditions python
I have a ref table df_ref like this: I need to create a new column in another table based on the ref table. The table looks like this: The output table df_org looks like: If any value in col1 or col2 can be found in the ref table, it will use the ref col in the ref table. If col1 and col2 are NULL, So
How to split words into different columns in dataframe?
I am new to coding and recently started learning to code. Currently I am stuck on how to split a column. Please help me. I have this dataframe and I want to split it into: I really appreciate you taking the time to answer my problem. PS: this is just an example of an option symbol, which is a combination of