Creating random values within another column's constraints [python]

Question

I have this type of dataset: I want to have: As you can see, if I simply generated random values for the Gender column, I would eventually run into a problem: the same person ID could be assigned different genders. If the IDs were unique, that wouldn't be a problem. But I want to create

Accepted Answer

Using np.random.choice and .replace:

    import numpy as np
    import pandas as pd

    # dummy data
    df = pd.DataFrame()
    df['ID'] = np.random.randint(0, 10, 100)

    # create a dict that maps each ID to a random gender
    genders = {i: np.random.choice(['F', 'M']) for i in df['ID'].unique()}
    df['gender'] = df['ID'].replace(genders)
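A variant of the same idea, as a rough sketch: draw one gender per unique ID, then look each row's ID up with .map instead of .replace. The seeded generator (seed 0 is an arbitrary choice) and the final consistency check are additions for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=0)

# Dummy data: 100 rows whose IDs repeat within the range 0-9.
df = pd.DataFrame({"ID": rng.integers(0, 10, size=100)})

# Draw exactly one gender per unique ID and store it in a lookup dict.
unique_ids = df["ID"].unique()
genders = dict(zip(unique_ids, rng.choice(["F", "M"], size=len(unique_ids))))

# Look up each row's ID in the dict, so repeated IDs get the same gender.
df["gender"] = df["ID"].map(genders)

# Sanity check: count the distinct genders attached to each ID.
genders_per_id = df.groupby("ID")["gender"].nunique()
print(genders_per_id.eq(1).all())
```

Because every ID in the column is a key in the dict, .map leaves no missing values here. On large frames, .map with a dict is generally the faster choice for this kind of lookup.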

Answer