Tag: pandas

Resolving conflicts in Pandas dataframe

I am performing record linkage on a dataframe such as: When my model overpredicts and links the same ID_1 to more than one ID_2 (indicated by a 1 in Predicted Link) I want to resolve the conflicts based on the Probability-value. If one predicted link has a higher probability than the other I want to keep a 1 …

Counting number of rows between min and max in pandas

I have a simple question about pandas. Let's say I have the following data: How do I count the number of rows that fall between the minimum and maximum values in column a? In this particular case, that is the number of rows between 1 and 10, which is 3. Thanks. Answer: IIUC, you could get the index of …

Merge 2 columns from a single Dataframe in Pandas

I want to merge 2 columns of the same dataframe, but using a specific condition. Consider the following dataframe: number-first Number-second 1 Nan 2 4C 3A 5 Nan 6 Nan 7 Nan Nan The conditions are: If the Number-first column has an alphanumeric value and the Number-second column has a Nan value or a …

Pandas/Geopandas Merge with a mask selection

I usually work with ArcPy but am trying to learn more about using pandas/geopandas. I have a mask applied to a CSV table and a shapefile that I want to merge together in order to find matches between the two based on a specific field. However, when I try to merge them together, I get the error “The truth value…

How to find lines in pandas columns with close values?

I need to find the ‘user_id’ of users standing close to each other. So we have data: So, in this dataset it would be the users with id ‘101’ and ‘302’. But our dataset has millions of lines in it. Are there any built-in functions in pandas or Python to solve the issue? Answer Ass…

How to take results of GROUPBY and expand to columns

Here’s an example of my working code that accomplishes what I want but is much too long. The resulting data frame is in the format I want here: I know that I can accomplish close to what I want with a groupby() on the ‘Specialty Type’ and the ‘Foundation Threshold’ using an .nuni…