How can I randomly set n% of the values in a pandas Series to null? Let's say I want 20% null values in my dictionary, Series, or list.
Input:
{0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j'}
Expected output with 20% null:
{0: 'a', 1: null, 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: null, 7: 'h', 8: 'i', 9: 'j'}
Answer
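You can use Series.sample(frac=...) to pick a random fraction of the entries, then use the index of that sample to set the corresponding values in the original Series to None. For 20%, pass frac=0.2; the example below uses frac=0.4.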
import pandas as pd

s = pd.Series({0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j'})

# Pick 40% of the entries at random and set them to None
s[s.sample(frac=0.4).index] = None

print(dict(s))
Sample output (the positions chosen vary from run to run):
{0: 'a', 1: 'b', 2: None, 3: None, 4: None, 5: 'f', 6: 'g', 7: 'h', 8: None, 9: 'j'}
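If you need this for a dict or list as well as a Series, the same idea can be wrapped in a small helper. The sketch below is only one possible way to do it: the function name null_fraction and its seed parameter are made up for illustration, and it assumes the index labels are unique (with duplicate labels, more entries than requested could be nulled). Passing a seed through to random_state makes the selection repeatable.

```python
import pandas as pd


def null_fraction(values, frac=0.2, seed=None):
    """Return a new Series in which roughly `frac` of the entries are None.

    `values` may be a dict, list, or Series. Hypothetical helper, not a pandas API.
    """
    # Build an object-dtype copy so the caller's data is left untouched
    s = pd.Series(values, dtype=object).copy()
    # sample() picks round(frac * len(s)) entries; we only need their labels
    picked = s.sample(frac=frac, random_state=seed).index
    s.loc[picked] = None
    return s


data = {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j'}
result = null_fraction(data, frac=0.2, seed=0)

print(dict(result))         # which two keys are nulled depends on the seed
print(result.isna().sum())  # 2 of the 10 entries are null
```

Because sample() rounds frac * len(s) to a whole number of entries, the share of nulls is only exactly n% when that product is an integer.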