Category: Questions

How to remove a group of specific rows from a dataframe?

I have a dataframe with 7581 rows and 3 columns (id, text, label), and I have a subgroup of this dataframe with 794 rows. What I need to do is remove that subgroup of 794 rows (same labels) from the big dataframe of 7581 rows. This is what the subgroup looks like: (photo) I have tried to do this: But the following
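One common way to drop a known subset of rows is an anti-join on a key column. The sketch below assumes that `id` uniquely identifies each row; the toy data and the names `subgroup` and `remaining` are made up for illustration and are not from the original question.

```python
import pandas as pd

# Toy stand-in for the real 7581-row dataframe (hypothetical data).
df = pd.DataFrame({
    "id": range(1, 7),
    "text": list("abcdef"),
    "label": [0, 1, 0, 1, 1, 0],
})

# Toy stand-in for the 794-row subgroup: here, every row with label 1.
subgroup = df[df["label"] == 1]

# Keep only the rows whose id does NOT appear in the subgroup.
remaining = df[~df["id"].isin(subgroup["id"])]
print(remaining)
```

If the subgroup was sliced from the same dataframe and still shares its index, `df.drop(subgroup.index)` should give the same result.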

How do I use df.fillna with category median values?

fillna median pandas python

I have a large dataset (~1 million rows) with about 5000 missing coordinates. I’d like to fill them with the median value by category ‘city’. Everything but fillna is working; how can I make it happen? Answer: You could do: First group by the city, then use transform with fillna and calculate the median. (You could use any other aggregation instead.)
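The answer's code is not shown in the excerpt, so the following is only a minimal sketch of the groupby/transform/fillna idea it describes. The column name `lat` and the sample values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 'lat' stands in for one coordinate column.
df = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B"],
    "lat": [1.0, np.nan, 3.0, 10.0, np.nan],
})

# transform("median") returns a Series aligned with df that holds each
# row's per-city median; fillna then uses it only where 'lat' is missing.
city_median = df.groupby("city")["lat"].transform("median")
df["lat"] = df["lat"].fillna(city_median)
print(df)
```

Swapping `"median"` for `"mean"` or another aggregation name changes the fill value without changing the pattern.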

Pandas dataframe – fillna with last of next month

dataframe datetime pandas python

I’ve been staring at this way too long and I think I’ve lost my mind; it really shouldn’t be as complicated as I’m making it. I have a df:

Date1       Date2
2022-04-01  2022-06-17
2022-04-15  2022-04-15
2022-03-03  NaT
2022-04-22  NaT
2022-05-06  2022-06-06

I want to fill the blanks in ‘Date2’ where it keeps the values from ‘Date2’ if they are present
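The excerpt is cut off before the fill rule is stated. Going by the title, one plausible reading is that a missing ‘Date2’ should become the last day of the month following ‘Date1’. The sketch below works under that assumption only; the variable `next_month_end` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Date1": pd.to_datetime(
        ["2022-04-01", "2022-04-15", "2022-03-03", "2022-04-22", "2022-05-06"]
    ),
    "Date2": pd.to_datetime(
        ["2022-06-17", "2022-04-15", None, None, "2022-06-06"]
    ),
})

# Assumed rule: jump to the first day of the next month, then roll
# forward to that month's last day.
next_month_end = df["Date1"] + pd.offsets.MonthBegin(1) + pd.offsets.MonthEnd(1)

# fillna keeps existing Date2 values and fills only the NaT rows.
df["Date2"] = df["Date2"].fillna(next_month_end)
print(df)
```

Going through `MonthBegin` first avoids an off-by-one month when ‘Date1’ already falls on a month end.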

Is it possible to switch between similar libraries (Data Analysis) in Python within the same code?

dataframe modin pandas python

I use the modin library for multiprocessing. While the library is great for faster processing, it fails at merge, and I would like to revert to default pandas partway through the code. I understand that, per PEP 8 (E402), imports should be declared once and at the top of the code; however, my case needs otherwise. Then I

How can I prepare my image dataset for a federated model?

python scikit-learn tensorflow-federated

How could I transform my dataset (composed of images) into a federated dataset? I am trying to create something similar to emnist but for my own dataset. tff.simulation.datasets.emnist.load_data( only_digits=True, cache_dir=None ) Answer: You will need to create the ClientData object first, for example: where create_dataset is a serializable function. But first you have to prepare your images; read this tutorial

Faster alternative to groupby, unstack then fillna

dataframe fillna pandas python

I’m currently performing the following operations on a dataframe (A) made of two columns, each with several thousand unique values. The operations performed on this dataframe are: The output is a table (B) with unique values of col1 in rows and unique values of col2 in columns, and each cell is the count of rows, from the original
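The operations themselves are missing from the excerpt. A minimal sketch of the kind of pipeline the title describes, on made-up data, could look like the following; `pd.crosstab` is shown only as an equivalent formulation, not as a claim about speed.

```python
import pandas as pd

# Hypothetical stand-in for dataframe A.
A = pd.DataFrame({
    "col1": ["a", "a", "b", "c", "a"],
    "col2": ["x", "y", "x", "x", "x"],
})

# Count rows per (col1, col2) pair and pivot col2 into columns.
# fill_value=0 folds the separate fillna(0) step into unstack.
B = A.groupby(["col1", "col2"]).size().unstack(fill_value=0)

# Same table built directly as a cross-tabulation.
B_alt = pd.crosstab(A["col1"], A["col2"])
print(B)
```

Which variant runs faster depends on the data, so it is worth timing both on the real dataframe.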

Convert string duration column to seconds

datetime pandas python string

In the dataframe, one of the columns is duration. It is stored as a string. How can I convert this column into seconds? Answer: Use pd.Timedelta to parse each item: Output:
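The answer's code and output are not included in the excerpt. As a minimal sketch of the approach it names, assuming the strings are in `HH:MM:SS` form (the actual format is not shown) and using a hypothetical `seconds` column:

```python
import pandas as pd

# Hypothetical durations; the real column's format is an assumption.
df = pd.DataFrame({"duration": ["00:01:30", "01:00:00", "00:00:45"]})

# Parse each string with pd.Timedelta and convert it to seconds.
df["seconds"] = df["duration"].apply(lambda s: pd.Timedelta(s).total_seconds())
print(df)
```

For a large column, `pd.to_timedelta(df["duration"]).dt.total_seconds()` is a vectorized way to do the same conversion.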

Django technology implementation of line breaks by number of posts

blogs bootstrap-4 django post python

My problem: I am building a shopping mall site using Django and Bootstrap. I want to insert a line break after every 4 posts. I thought that if I used col-3 with {% for %} {% endfor %}, the posts would be divided into rows of four with line breaks, but it is not working. How can I fix it? My models My views My urls My

Replace value based on a corresponding value but keep value if criteria not met

apply arrays dataframe pandas python

Given the following dataframe,

INPUT df:

Cost_centre  Pool_costs
90272        A
92705        A
98754        A
91350        A

Replace Pool_costs value with ‘B’ given the Cost_centre value but keep the Pool_costs value if the Cost_centre value does not appear in list.

OUTPUT df:

Cost_centre  Pool_costs
90272        B
92705        A
98754        A
91350        B

Current Code: This code works up until the else
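The asker's code and list are not shown. One way to get the output above is a boolean mask with `isin`, sketched below; the list `centres_to_update` is hypothetical, inferred from which rows change in the expected output.

```python
import pandas as pd

df = pd.DataFrame({
    "Cost_centre": [90272, 92705, 98754, 91350],
    "Pool_costs": ["A", "A", "A", "A"],
})

# Hypothetical list: the centres that become 'B' in the expected output.
centres_to_update = [90272, 91350]

# Assign 'B' only where Cost_centre is in the list; other rows are untouched,
# so no explicit "else" branch is needed.
df.loc[df["Cost_centre"].isin(centres_to_update), "Pool_costs"] = "B"
print(df)
```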

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)

arrays numpy numpy-indexing numpy-ndarray python

I have an np.ndarray of shape (5, 5, 2, 2, 2, 10, 8) named table. I can successfully slice it like this: But for some reason, when I try to specify three values for dimension 5 (of length 10) like this: I get: The same happens for: This does not happen with: which outputs the correct result. I tried to
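The indexing expressions are missing from the excerpt, so the following only reproduces the general kind of failure under assumed indices: two list indices of lengths 2 and 3 are paired element-wise by NumPy and cannot be broadcast together. `np.ix_` is one way to request the outer product of the two selections instead.

```python
import numpy as np

table = np.arange(5 * 5 * 2 * 2 * 2 * 10 * 8).reshape(5, 5, 2, 2, 2, 10, 8)

# Assumed indices: lists of length 2 and 3 in the same expression are
# broadcast against each other, which fails for shapes (2,) and (3,).
try:
    table[0, 0, [0, 1], 0, 0, [0, 1, 2]]
    raised = False
except IndexError as exc:
    raised = True
    print(exc)

# np.ix_ reshapes the lists to (2, 1) and (1, 3), so they broadcast to (2, 3).
rows, cols = np.ix_([0, 1], [0, 1, 2])
sub = table[0, 0, rows, 0, 0, cols]
print(sub.shape)
```

With two lists of equal length the original form runs, but it selects paired elements rather than every combination, which may not be what was intended.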