I have a grouped data frame named df_grouped where AF & Local are the index levels. I would like to assert whether the index of df_grouped is equal to a column from another dataframe, df[A]. This is an example of the code I tried, but it does not work: Answer: To use assert for pandas series you can use as…
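As a rough illustration of the comparison described above, here is a minimal sketch. It assumes the check is against a single index level (AF) and uses `pandas.testing.assert_series_equal`. The sample data and the `value` column are made up for the example, and the excerpt does not confirm which level is meant to match df[A].

```python
import pandas as pd
from pandas.testing import assert_series_equal

# Hypothetical sample data; only the names AF, Local and A come from the question.
df = pd.DataFrame({
    "AF": ["x", "x", "y"],
    "Local": [1, 2, 1],
    "A": ["x", "x", "y"],
    "value": [10, 20, 30],
})
df_grouped = df.groupby(["AF", "Local"])["value"].sum().to_frame()

# Pull one index level out as a Series so it can be compared with a column.
index_values = pd.Series(df_grouped.index.get_level_values("AF"), name="A")

# Raises AssertionError with a readable diff if the values differ.
assert_series_equal(index_values, df["A"])
```

If the order of the values does not matter, sorting both sides (and resetting their indexes) before the comparison may be closer to what is wanted.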
Tag: pandas
How to improve the computation speed of subsetting a pandas dataframe?
I have a large df (14*1’000’000) and I want to subset it. Unsurprisingly, the calculation takes a lot of time, and I wonder how to improve the speed. What I want is to select, for each Name, the row with the lowest value of Total_time, ignoring zero values and picking only the first one if ther…
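One vectorised way to express that selection is sketched below. It assumes the columns are literally named `Name` and `Total_time`; the `Other` column and the sample values are invented for illustration. `idxmin` returns the label of the first occurrence of the minimum, which covers the "first one on ties" requirement.

```python
import pandas as pd

# Hypothetical sample data; only Name and Total_time come from the question.
df = pd.DataFrame({
    "Name": ["a", "a", "a", "b", "b"],
    "Total_time": [0.0, 3.0, 3.0, 5.0, 2.0],
    "Other": [1, 2, 3, 4, 5],
})

# Drop zero values first so they can never be picked as the minimum.
nonzero = df[df["Total_time"] != 0]

# For each Name, get the row label of the smallest Total_time (first on ties).
idx = nonzero.groupby("Name")["Total_time"].idxmin()

# Select those rows from the original frame in one indexing operation.
result = df.loc[idx]
print(result)
```

This avoids a Python-level loop over names, which is usually where the time goes on a frame of this size.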
How to append a dictionary with multiple keys to a dataframe
I am trying to append a dictionary to my DataFrame. This is how the DataFrame looks: And this is the dictionary: I need to append this dictionary to my df. I know how to do it when I append one column (using map) but this is not going to work here. Any ideas on how to append this? Answer You
Modify HTML with BeautifulSoup using data from Pandas table
My understanding is that BeautifulSoup is more for getting data rather than modifying, though it can perform that. I have a skeleton HTML tree called ‘tree’, and want to insert data from a database query to modify the HTML. The amount of data inserted is variable. I’m aware of the method Bea…
How would I find the quarterly averages of these monthly figures?
My dataset is similar to the below: How can I add columns to this which show the quarterly figure, which is an average of the preceding three months? E.g., suppose we started by adding a column after ‘Dec-21’ called Q4 2021 which took the average of the columns called ‘Oct-21’, ‘No…
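A minimal sketch of the quarterly average follows. It assumes every column is a month label in the `Oct-21` style and that the values are numeric; the sample numbers are invented. For simplicity the quarter columns are appended at the end rather than inserted after the last month of each quarter.

```python
import pandas as pd

# Hypothetical sample data using the month-label style from the question.
df = pd.DataFrame({
    "Oct-21": [1.0, 4.0],
    "Nov-21": [2.0, 5.0],
    "Dec-21": [3.0, 6.0],
    "Jan-22": [10.0, 20.0],
    "Feb-22": [20.0, 30.0],
    "Mar-22": [30.0, 40.0],
})

# Remember the month columns before any quarter columns are added.
month_cols = df.columns
quarters = pd.to_datetime(month_cols, format="%b-%y").to_period("Q")

for q in quarters.unique():
    cols = month_cols[quarters == q]           # the months in this quarter
    df[f"Q{q.quarter} {q.year}"] = df[cols].mean(axis=1)

print(df)
```

To place each quarter column directly after its last month, `DataFrame.insert` with the position of that month could be used instead of plain assignment.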
Output error if there is value under NaN header in Excel file
I have an input Excel table that looks like this: I need to output an error, because there is a value (val_7) under the NaN header, but I have no idea how to implement it. Answer: try: result: it works if you don’t have the word “unnamed” in any of your column names
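The check can be sketched as below. This is not the answer's original code, which is not shown in the excerpt. It relies on pandas naming blank header cells `Unnamed: <position>` when reading a sheet, which is also why a real column containing that word would confuse it. The frame is built by hand here instead of being read from an Excel file, and the function name is made up.

```python
import pandas as pd

def columns_with_values_under_blank_header(df: pd.DataFrame) -> list:
    """Return the 'Unnamed: N' columns that contain at least one value."""
    unnamed = [c for c in df.columns if str(c).startswith("Unnamed:")]
    return [c for c in unnamed if df[c].notna().any()]

# Hypothetical stand-in for the result of pd.read_excel(...).
df = pd.DataFrame({
    "col_1": ["val_1", "val_4"],
    "col_2": ["val_2", "val_5"],
    "Unnamed: 2": [None, "val_7"],
})

offending = columns_with_values_under_blank_header(df)
try:
    if offending:
        raise ValueError(f"Values found under blank header(s): {offending}")
except ValueError as exc:
    print(exc)
```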
Multiple lambda outputs in string replacement using apply [Python]
I have a list of “states” over which I have to iterate: states = ['antioquia', 'boyaca', 'cordoba', 'choco'] I have to iterate over one column in a pandas df to replace or cut the string where the state text is found, so I try: And the result is: Result w…
convert multiple units to KG in pandas
Let’s say I have this code (which is obviously wrong). I want to apply such a condition to each value in a column (weights) based on another column (weight unit). Is there an efficient way to do it? Preferably one that allows passing a function, so it is easy to modify. Answer: Don’t use a function, this will be slow. numpy.v…
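One vectorised alternative to a per-row function is a lookup table of conversion factors, sketched below. This is not necessarily the approach the truncated answer goes on to describe. The column names `weight` and `weight_unit`, the set of units, and the sample values are assumptions for illustration.

```python
import pandas as pd

# Conversion factors to kilograms; extend this dict to support more units.
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

# Hypothetical sample data.
df = pd.DataFrame({
    "weight": [2.0, 500.0, 10.0],
    "weight_unit": ["kg", "g", "lb"],
})

# Map each unit to its factor, then multiply column-wise (no Python loop per row).
df["weight_kg"] = df["weight"] * df["weight_unit"].map(TO_KG)
print(df)
```

Units missing from the dict map to NaN, so unknown units show up as NaN in `weight_kg` rather than being silently converted.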
expand row based on integer in column and split into number of months between dates
I have the following dataframe:

id  date_start           date_end             reporting_month  reporting_month_number  months_length
1   2022-03-31 23:56:22  2022-05-01 23:56:22  2022-03          1                       3
2   2022-03-31 23:48:48  2022-06-01 23:48:48  2022-03          1                       4
3   2022-03-31 23:47:36  2022-08-01 23:47:36  2022-03          1                       6

I would like to split each id row so I c…
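The desired output is cut off in the excerpt, so the following is only a sketch under one reading of it: each row is repeated months_length times, reporting_month_number counts 1, 2, 3, … within each id, and reporting_month advances by one month per repeat. The date columns are left out of the sample for brevity.

```python
import pandas as pd

# Subset of the columns from the question's table.
df = pd.DataFrame({
    "id": [1, 2],
    "reporting_month": ["2022-03", "2022-03"],
    "months_length": [3, 4],
})

# Repeat each row months_length times.
expanded = df.loc[df.index.repeat(df["months_length"])].reset_index(drop=True)

# Number the repeats within each id: 1, 2, 3, ...
expanded["reporting_month_number"] = expanded.groupby("id").cumcount() + 1

# Shift the starting month forward by (number - 1) months.
expanded["reporting_month"] = [
    str(pd.Period(month, freq="M") + (number - 1))
    for month, number in zip(expanded["reporting_month"],
                             expanded["reporting_month_number"])
]
print(expanded)
```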
how to obtain key pair values from an API JSON column – Jupyter Notebook
After exploring one row in an API example, I found the whole information in df['items'][0]. I have been using this code to obtain the creation_date values: Here is where I got stuck. I found that some rows don’t have last_edit_date values. When I try to run the same code using the name last_edit_da…
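A common way to handle keys that are missing from some rows is `dict.get`, which returns `None` instead of raising `KeyError`. The sketch below assumes each entry of the `items` column is a plain dict; the question's actual extraction code is not shown, and the sample timestamps are invented.

```python
import pandas as pd

# Hypothetical sample: the second item has no last_edit_date key.
df = pd.DataFrame({
    "items": [
        {"creation_date": 1650000000, "last_edit_date": 1650003600},
        {"creation_date": 1650007200},
    ]
})

# item["last_edit_date"] would raise KeyError on the second row;
# item.get(...) yields None there, which pandas stores as NaN.
df["creation_date"] = df["items"].apply(lambda item: item.get("creation_date"))
df["last_edit_date"] = df["items"].apply(lambda item: item.get("last_edit_date"))
print(df[["creation_date", "last_edit_date"]])
```

`pd.json_normalize(df["items"].tolist())` is another option when many keys are needed at once, since it fills absent keys with NaN as well.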