Tag: grouping

Clustering different sets of points with different linear relationships to each other in Python

cluster-analysis grouping intercept linear-regression python

I need to cluster groups of points that share the same linear relationship, as per the code and figure below. Obviously, I wouldn’t have the points labelled that way; I would just have the following x and y. Note the following: the points follow linear relationships with a steep slope, the groups are slightly separated from each other, and they all have the

Python – Group(Cluster/Sort) arrays based on ranking information

arrays grouping numpy python sorting

I have a dataframe that looks like this: I converted the dataframe into 2D arrays like this: Each row holds the scores (1-5) that one person gave to items A, B, C, and D. I would like to identify the people who have the same ranking, for example the people who think A > B > C > D. And
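One possible way to approach the ranking question above is sketched here. It assumes the scores sit in a NumPy array with one row per person and one column per item; the sample `scores` values are made up for illustration, and ties are broken by column order, which may not be what the asker wants.

```python
from collections import defaultdict

import numpy as np

items = np.array(["A", "B", "C", "D"])

# Hypothetical data: one row per person, one column per item (scores 1-5).
scores = np.array([
    [5, 4, 3, 2],
    [4, 3, 2, 1],
    [1, 2, 3, 4],
    [5, 3, 4, 1],
])

# Column indices sorted from highest to lowest score for each person.
# kind="stable" keeps the original column order when two scores tie.
order = np.argsort(-scores, axis=1, kind="stable")

# Map each ranking (as a readable string) to the row indices that share it.
groups = defaultdict(list)
for person, idx in enumerate(order):
    groups[" > ".join(items[idx])].append(person)

for ranking, people in groups.items():
    print(ranking, people)
```

Rows 0 and 1 have different raw scores but the same ordering, so they end up in the same group.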

Inserting missing categories and dates in pandas dataframe

dataframe grouping pandas python

I have the following data frame. I want to add in all score levels (high, mid, low), for each group (a, b, c, d), for all dates (there are two dates – 2020-06-01 and 2020-06-02). I can add in the score categories for all subjects with the following, but I am having trouble adding the dates in as well. The expected
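A minimal sketch of one way to fill in every date, group, and score combination, assuming the frame has columns named `date`, `group`, and `score` plus a numeric column (called `count` here; that name, the sample rows, and the fill value of 0 are all assumptions, since the original frame is not shown):

```python
import pandas as pd

# Hypothetical input with only some of the combinations present.
df = pd.DataFrame({
    "date": ["2020-06-01", "2020-06-01", "2020-06-02"],
    "group": ["a", "b", "a"],
    "score": ["high", "low", "mid"],
    "count": [3, 1, 2],
})

# Every combination of date x group x score level that should exist.
full_index = pd.MultiIndex.from_product(
    [
        ["2020-06-01", "2020-06-02"],
        ["a", "b", "c", "d"],
        ["high", "mid", "low"],
    ],
    names=["date", "group", "score"],
)

# Reindex against the full product; missing rows get count 0.
filled = (
    df.set_index(["date", "group", "score"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)

print(filled)
```

The result has 2 x 4 x 3 = 24 rows. This only works if each (date, group, score) combination appears at most once in the input.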

Grouping / clustering a list of numbers so that the min-max gap of each subset is always less than a cutoff in Python

algorithm cluster-analysis grouping python subset

Say I have a list of 50 random numbers. I want to group the numbers so that each subset has a min-max gap less than a cutoff of 0.05. Below is my code. Check if all subsets have min-max gaps less than the cutoff: Output: Obviously my code is not working. Any suggestions? Answer Following @j_random_hacker’s answer, I simply
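One common way to satisfy the min-max constraint is a greedy pass over the sorted values. The sketch below is an illustration under that assumption and is not necessarily the approach from the answer referenced above; the function name `group_by_gap` is made up here.

```python
import random


def group_by_gap(values, cutoff):
    """Split values into groups whose max - min is strictly below cutoff."""
    groups = []
    for v in sorted(values):
        # Because the values are sorted, groups[-1][0] is the current
        # group's minimum and v would become its maximum.
        if groups and v - groups[-1][0] < cutoff:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


random.seed(0)
numbers = [random.random() for _ in range(50)]
groups = group_by_gap(numbers, 0.05)

# Check that every subset respects the cutoff.
print(all(max(g) - min(g) < 0.05 for g in groups))
```

Sorting first means each group is a contiguous run of values, so only the first element of the current group needs to be compared.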

How to group data from a list of namedtuples

grouping namedtuple python

In Python, I have the following data in a list of namedtuples in memory: I want to group the data by: cluster; cluster and host; cluster, host, and database; cluster, host, database, and diskgroup. I won’t need the disk details. In each group I want to: sum the values of read_bytes_per_sec and write_bytes_per_sec, compute the
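The summing part of this question can be sketched with the standard library alone. The namedtuple layout below is hypothetical: only `read_bytes_per_sec`, `write_bytes_per_sec`, and the grouping keys come from the question, while the `DiskStat` name, the `disk` field, and the sample records are assumptions.

```python
from collections import defaultdict, namedtuple

# Hypothetical record layout; the original namedtuple definition is not shown.
DiskStat = namedtuple(
    "DiskStat",
    "cluster host database diskgroup disk read_bytes_per_sec write_bytes_per_sec",
)


def sum_by(records, *keys):
    """Sum read/write throughput for each distinct combination of keys."""
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        group = tuple(getattr(r, name) for name in keys)
        totals[group][0] += r.read_bytes_per_sec
        totals[group][1] += r.write_bytes_per_sec
    # Return (read_total, write_total) tuples keyed by the group tuple.
    return {group: tuple(t) for group, t in totals.items()}


records = [
    DiskStat("c1", "h1", "db1", "dg1", "d1", 100, 10),
    DiskStat("c1", "h1", "db1", "dg1", "d2", 200, 20),
    DiskStat("c1", "h2", "db2", "dg1", "d1", 50, 5),
    DiskStat("c2", "h3", "db3", "dg2", "d1", 70, 7),
]

by_cluster = sum_by(records, "cluster")
by_cluster_host = sum_by(records, "cluster", "host")
by_diskgroup = sum_by(records, "cluster", "host", "database", "diskgroup")
```

Passing the key names as arguments lets the same helper cover all four grouping levels without sorting the records first, which `itertools.groupby` would require.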