Say that I have a dataframe that looks like:
    Name  Group_Id
    AAA   1
    ABC   1
    CCC   2
    XYZ   2
    DEF   3
    YYH   3
How could I randomly select one (or more) rows for each Group_Id? Say that I want one random draw per Group_Id; I would get, for example:
    Name  Group_Id
    AAA   1
    XYZ   2
    DEF   3
Answer
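    import numpy as np

    size = 2        # rows to draw per group (use 1 for a single draw)
    replace = True  # sample with replacement
    # Pick `size` random index labels from each group and return those rows.
    fn = lambda obj: obj.loc[np.random.choice(obj.index, size, replace), :]
    df.groupby('Group_Id', as_index=False).apply(fn)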
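If you are on pandas 1.1 or newer (an assumption about your setup), the groupby object also has a built-in `sample` method, which may be simpler than a custom lambda. Here is a minimal sketch that rebuilds the example dataframe from the question; the names `one_per_group` and `two_per_group` and the `random_state` value are just for illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Name": ["AAA", "ABC", "CCC", "XYZ", "DEF", "YYH"],
    "Group_Id": [1, 1, 2, 2, 3, 3],
})

# One random row per Group_Id.
# random_state only makes the draw repeatable; drop it for a fresh draw each run.
one_per_group = df.groupby("Group_Id").sample(n=1, random_state=0)

# Two rows per Group_Id, drawn with replacement
# (mirrors size = 2, replace = True above).
two_per_group = df.groupby("Group_Id").sample(n=2, replace=True, random_state=0)

print(one_per_group)
print(two_per_group)
```

Which rows come back depends on the random draw, so your output will not necessarily match the example in the question. Without `replace=True`, `n` cannot be larger than the smallest group.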