
Tag: pandas

replicating data in same dataFrame

I want to replicate data within the same dataframe when a certain condition is fulfilled. Dataframe: I want to replicate the dataframe when going through a loop and there is a difference greater than 4 in row.hour. Expected Output: I want to replicate the rows when iterating through all the rows and ther…

Dask dataframe crashes

I’m loading a large parquet dataframe using Dask but can’t seem to do anything with it without the system crashing or producing a million errors and no output. The data is about 165M compressed, or 13G once loaded in pandas (it fits comfortably in the 45G of RAM available). Instead, if us…

Set dictionary keys as cells in dataframe column

Please look at my code: Here I convert a dictionary to a DataFrame and set the index as a new column. Can it be done in one line, at the stage of converting the dictionary to a DataFrame? I want the dictionary keys to immediately become cells of the new column. Something like Answer: You can just reset_index() to creat…
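As a rough illustration of the reset_index() idea in the answer excerpt, here is a minimal sketch. The dictionary contents and the column names "key" and "value" are made up for the example and do not come from the original question.

```python
import pandas as pd

# Hypothetical input; the original question's dictionary is not shown.
data = {"a": 1, "b": 2, "c": 3}

# Build the frame with the keys as the index, name that index,
# then move it into an ordinary column with reset_index().
df = (
    pd.DataFrame.from_dict(data, orient="index", columns=["value"])
    .rename_axis("key")
    .reset_index()
)

print(df)
```

Chaining rename_axis() before reset_index() is one way to control the name of the new column; without it, the column would be called "index".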

Return the last non-zero value in a pandas df

I have a dataframe. The logic is: if col1 is not zero, return col1. If col1 is zero, return col2 (if non-zero). If col2 is zero, return col3. We don’t need to do anything for col4. My code looks like below, but it only returns col1. I tried .any() and .all(), but that doesn’t work either. Also, is there any way
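The fallback logic described in the question could be sketched with numpy.select, which evaluates the conditions in order and takes the first match. The sample values and the output column name "result" are assumptions; the asker's own data and code are not shown in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical data; col4 is present but deliberately ignored.
df = pd.DataFrame(
    {
        "col1": [5, 0, 0, 0],
        "col2": [1, 7, 0, 0],
        "col3": [2, 3, 9, 0],
        "col4": [4, 4, 4, 4],
    }
)

# Conditions are checked in order: col1 first, then col2.
# If both are zero, fall back to col3 (which may itself be zero).
df["result"] = np.select(
    [df["col1"] != 0, df["col2"] != 0],
    [df["col1"], df["col2"]],
    default=df["col3"],
)

print(df)
```

Because the comparison is done on whole columns at once, no row-wise .any() or .all() call is needed.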

null out n% values in series dictionary python

How can I randomly make n% of the values null in a pandas series? Let’s say I want 20% null values in my dictionary, series, or list. input something = expected output with 20% null = Answer: You can just use series.sample(frac=...), with frac set to the desired fraction, to pick index labels and then set those values in the original series to None.
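A minimal sketch of the sample-then-assign approach from the answer follows. The helper name null_out, its seed parameter, and the example series are hypothetical; a float series is assumed so that None is stored as NaN without changing the dtype.

```python
import pandas as pd


def null_out(series, frac, seed=None):
    """Return a copy of `series` with roughly `frac` of its values set to null."""
    out = series.copy()
    # sample() picks a random subset; only its index labels are needed.
    idx = out.sample(frac=frac, random_state=seed).index
    # In a float series, None is stored as NaN.
    out.loc[idx] = None
    return out


# Hypothetical input: ten float values, 20% of which become null.
s = pd.Series([float(i) for i in range(10)])
nulled = null_out(s, frac=0.2, seed=0)

print(nulled)
```

For a dictionary or list, one option is to wrap it in pd.Series first and convert back afterwards.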