I need to cluster groups of points with the same linear relationship, as per the code and figure below. Obviously, I wouldn’t have the points that way; I would just have the following x and y. Note the following: the points respect linear relationships with high slope, they present a slight separation from each other, and they all have the
Tag: grouping
Python – Group (Cluster/Sort) arrays based on ranking information
I have a dataframe that looks like this: I converted the dataframe into 2D arrays like this: Each row holds the scores (1-5) that one person gave to items A, B, C, and D. I would like to identify the people who have the same ranking, for example the people who think A > B > C > D. And
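A minimal sketch of one way to approach the question above, assuming one row of scores per person and one column per item. The `scores` data, the `ranking` helper, and the tie handling (ties keep the original item order) are all hypothetical, since the original dataframe is not shown.

```python
from collections import defaultdict

items = ["A", "B", "C", "D"]

# Hypothetical scores (1-5): one row per person, one column per item.
scores = [
    [5, 4, 3, 1],
    [4, 3, 2, 1],
    [1, 2, 3, 4],
    [5, 3, 2, 1],
]

def ranking(row):
    """Return the items ordered from highest to lowest score."""
    # sorted() is stable, so tied items keep their original A, B, C, D order.
    order = sorted(range(len(row)), key=lambda i: -row[i])
    return tuple(items[i] for i in order)

# Map each distinct ranking to the list of people (row indices) who share it.
people_by_ranking = defaultdict(list)
for person, row in enumerate(scores):
    people_by_ranking[ranking(row)].append(person)

for rank, people in people_by_ranking.items():
    print(" > ".join(rank), people)
```

Using the ranking tuple as a dictionary key groups people by the order of their preferences rather than by the raw scores, so `[5, 4, 3, 1]` and `[4, 3, 2, 1]` land in the same group.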
Inserting missing categories and dates in pandas dataframe
I have the following data frame. I want to add in all score levels (high, mid, low), for each group (a, b, c, d), for all dates (there are two dates – 2020-06-01 and 2020-06-02). I can add in the score categories for all subjects with the following, but I am having trouble adding the dates in as well. The expected
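A minimal sketch of one common way to fill in missing combinations, assuming the frame has `date`, `group`, and `score` columns plus a numeric `count` column. The column names and sample rows are hypothetical, since the original data frame is not shown. It builds the full set of combinations with `pd.MultiIndex.from_product` and reindexes against it.

```python
import pandas as pd

# Hypothetical input: only some date/group/score combinations are present.
df = pd.DataFrame({
    "date": ["2020-06-01", "2020-06-01", "2020-06-02"],
    "group": ["a", "b", "a"],
    "score": ["high", "low", "mid"],
    "count": [3, 1, 2],
})

# Every combination of date, group, and score level that should exist.
full_index = pd.MultiIndex.from_product(
    [["2020-06-01", "2020-06-02"], ["a", "b", "c", "d"], ["high", "mid", "low"]],
    names=["date", "group", "score"],
)

# Reindex against the full set; combinations missing from df get count 0.
filled = (
    df.set_index(["date", "group", "score"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)
print(filled)
```

Reindexing this way requires the existing date/group/score combinations to be unique; duplicated rows would need to be aggregated first.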
Grouping / clustering a list of numbers so that the min-max gap of each subset is always less than a cutoff in Python
Say I have a list of 50 random numbers. I want to group the numbers in a way that each subset has a min-max gap less than a cutoff 0.05. Below is my code. Check if all subsets have min-max gaps less than the cutoff: Output: Obviously my code is not working. Any suggestions? Answer Following @j_random_hacker’s answer, I simply
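A minimal sketch of one possible approach, not necessarily the one from the answer referenced above: sort the numbers and greedily start a new subset whenever the next value would push the min-max gap to the cutoff or beyond. The function name `group_by_gap` is hypothetical.

```python
import random

def group_by_gap(values, cutoff):
    """Split values into sorted runs whose max - min stays below cutoff."""
    groups = []
    for v in sorted(values):
        # groups[-1][0] is the minimum of the current run because input is sorted.
        if groups and v - groups[-1][0] < cutoff:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups

random.seed(0)
numbers = [random.random() for _ in range(50)]
groups = group_by_gap(numbers, cutoff=0.05)

# Check that every subset has a min-max gap below the cutoff.
print(all(max(g) - min(g) < 0.05 for g in groups))
```

Because the values are sorted first, each subset is a contiguous run, so comparing the new value against the run's first element is enough to bound the whole gap.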
How to group data from a list of namedtuples
In Python, I have the following data in a list of namedtuple in memory: I want to group the data by: cluster; cluster and host; cluster, host, and database; cluster, host, database, and diskgroup. I won’t need the disk details. In each group I want to: sum the values of read_bytes_per_sec and write_bytes_per_sec; compute the
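A minimal sketch of the summing part of the question above, using only the standard library. The `DiskStat` layout and sample records are hypothetical, since the original namedtuple definition is not shown; only the field names `read_bytes_per_sec` and `write_bytes_per_sec` and the grouping levels come from the question.

```python
from collections import defaultdict, namedtuple

# Hypothetical record layout; the original namedtuple is not shown.
DiskStat = namedtuple(
    "DiskStat",
    "cluster host database diskgroup disk read_bytes_per_sec write_bytes_per_sec",
)

def sum_by(records, keys):
    """Sum read and write throughput for each distinct combination of keys."""
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        key = tuple(getattr(r, k) for k in keys)
        totals[key][0] += r.read_bytes_per_sec
        totals[key][1] += r.write_bytes_per_sec
    # Return plain tuples of (read_total, write_total) per group.
    return {k: tuple(v) for k, v in totals.items()}

records = [
    DiskStat("c1", "h1", "db1", "dg1", "d1", 100, 10),
    DiskStat("c1", "h1", "db1", "dg1", "d2", 200, 20),
    DiskStat("c1", "h2", "db2", "dg1", "d1", 50, 5),
]

by_cluster = sum_by(records, ["cluster"])
by_cluster_host = sum_by(records, ["cluster", "host"])
print(by_cluster)
print(by_cluster_host)
```

Passing a longer key list, such as `["cluster", "host", "database", "diskgroup"]`, gives the finer grouping levels with the same function.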