I have the following data (simple representation of black particles on a white filter):
    data = [
        [0, 0, 0, 255, 255, 255, 0, 0],
        [0, 255, 0, 255, 255, 255, 0, 0],
        [0, 0, 0, 255, 255, 255, 0, 0],
        [0, 0, 0, 0, 255, 0, 0, 0],
        [0, 255, 255, 0, 0, 255, 0, 0],
        [0, 255, 0, 0, 0, 255, 0, 0],
        [0, 0, 0, 0, 0, 255, 0, 0],
        [0, 0, 0, 0, 0, 255, 0, 0]
    ]
I have counted the particles (groups) and assigned each one a number using the following code:
    import numpy as np
    from skimage import measure

    arr = np.array(data)
    # connectivity=1 joins only pixels that share an edge (no diagonals)
    groups, group_count = measure.label(arr > 0, return_num=True, connectivity=1)
    print('Groups:\n', groups)
With the output:

    Groups:
    [[0 0 0 1 1 1 0 0]
     [0 2 0 1 1 1 0 0]
     [0 0 0 1 1 1 0 0]
     [0 0 0 0 1 0 0 0]
     [0 3 3 0 0 4 0 0]
     [0 3 0 0 0 4 0 0]
     [0 0 0 0 0 4 0 0]
     [0 0 0 0 0 4 0 0]]
I then have four (4) particles (groups) of different sizes.
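As an aside on `connectivity = 1`: it means two pixels belong to the same group only if they share an edge, not just a corner. The sketch below is a plain-Python flood fill written for illustration, not scikit-image's actual implementation; the function name `label_components` and its `diagonal` flag are made up here. It reproduces the four groups above and shows that allowing diagonal neighbours would merge groups 1 and 4, because the pixels at row 3, column 4 and row 4, column 5 (0-indexed) touch at a corner.

```python
from collections import deque


def label_components(image, diagonal=False):
    """Label connected non-zero pixels in raster order; 0 stays background."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    # Edge-sharing neighbours only, unless diagonal neighbours are requested.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or labels[r][c] != 0:
                continue  # background, or already part of a group
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of this group
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] != 0 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


data = [
    [0, 0, 0, 255, 255, 255, 0, 0],
    [0, 255, 0, 255, 255, 255, 0, 0],
    [0, 0, 0, 255, 255, 255, 0, 0],
    [0, 0, 0, 0, 255, 0, 0, 0],
    [0, 255, 255, 0, 0, 255, 0, 0],
    [0, 255, 0, 0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 255, 0, 0],
]

labels, count = label_components(data)                         # edges only
merged, merged_count = label_components(data, diagonal=True)   # edges + corners
print(count, merged_count)
```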
I am looking to create a DataFrame with one row per particle, listing its number and its size.
Any help is much appreciated!
Answer
There is probably a more elegant approach, but here is what I came up with:
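    import pandas as pd

    # Count how many pixels carry each label (keys are the labels as strings).
    customDict = {}
    for group in groups:        # each row of the labelled array
        for value in group:     # each pixel label in that row
            if str(value) not in customDict:
                customDict[str(value)] = [0]
            customDict[str(value)][0] += 1

    # One row per label; the label itself becomes the "particle #" column.
    df = pd.DataFrame.from_dict(customDict, orient="index").reset_index()
    df.rename(columns={"index": "particle #", 0: "size"}, inplace=True)
    # Drop row 0, the background label. This relies on the first pixel
    # scanned (top-left) being background, so that "0" is the first key.
    df.drop(0, inplace=True)
    df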
Output

| | particle # | size |
|---|---|---|
| 1 | 1 | 10 |
| 2 | 2 | 1 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
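On the "more elegant approach" point: one option that may be shorter is to let NumPy do the counting with `np.unique(..., return_counts=True)`. The following is a minimal sketch, not a tested drop-in replacement. It hard-codes the label array printed in the question so that it runs without scikit-image, and it assumes label 0 is always the background.

```python
import numpy as np
import pandas as pd

# Label array copied from the question's output (0 is background).
groups = np.array([
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 2, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 3, 3, 0, 0, 4, 0, 0],
    [0, 3, 0, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 4, 0, 0],
])

# Distinct labels and the number of pixels carrying each one.
labels, counts = np.unique(groups, return_counts=True)

# Keep only real particles by filtering out the background label explicitly.
keep = labels != 0
df = pd.DataFrame({"particle #": labels[keep], "size": counts[keep]})
print(df)
```

Unlike the loop version, this keeps the labels as integers and removes the background by value rather than by row position, so it does not depend on which pixel is scanned first.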