Basic Example:
# Given params such as:
params = {
    'cols': 8,
    'rows': 4,
    'n': 4
}
# I'd like to produce (or equivalent):
       col0  col1  col2  col3  col4  col5  col6  col7
row_0     0     1     2     3     0     1     2     3
row_1     1     2     3     0     1     2     3     0
row_2     2     3     0     1     2     3     0     1
row_3     3     0     1     2     3     0     1     2
Axis Value Counts:
- Where both axes have an equal distribution of values:
df.apply(lambda x: x.value_counts(), axis=1)

       0  1  2  3
row_0  2  2  2  2
row_1  2  2  2  2
row_2  2  2  2  2
row_3  2  2  2  2

df.apply(lambda x: x.value_counts())

   col0  col1  col2  col3  col4  col5  col6  col7
0     1     1     1     1     1     1     1     1
1     1     1     1     1     1     1     1     1
2     1     1     1     1     1     1     1     1
3     1     1     1     1     1     1     1     1
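The two value-count checks above can be folded into one helper. This is only a minimal sketch: the name `is_balanced` is made up for illustration, and it assumes "equal distribution" means every value appears the same number of times in each row and in each column.

```python
import numpy as np
import pandas as pd


def is_balanced(df):
    """Return True if each value occurs equally often in every row and every column."""
    arr = df.to_numpy()
    values = np.unique(arr)
    # row_counts[k, i] = how often values[k] appears in row i
    row_counts = np.stack([(arr == v).sum(axis=1) for v in values])
    # col_counts[k, j] = how often values[k] appears in column j
    col_counts = np.stack([(arr == v).sum(axis=0) for v in values])
    rows_ok = (row_counts == row_counts[0, 0]).all()
    cols_ok = (col_counts == col_counts[0, 0]).all()
    return bool(rows_ok and cols_ok)


balanced = pd.DataFrame([[0, 1], [1, 0]])    # each value once per row and column
unbalanced = pd.DataFrame([[0, 1], [0, 1]])  # columns hold a single value each
print(is_balanced(balanced), is_balanced(unbalanced))
```

A frame that passes this check satisfies both conditions at once, so it can be used to test any candidate `create_df`.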
My attempt thus far:
import itertools

import numpy as np
import pandas as pd

def create_df(cols, rows, n):
    x = itertools.cycle(list(itertools.permutations(range(n))))
    df = pd.DataFrame(index=range(rows), columns=range(cols))
    df[:] = np.reshape([next(x) for _ in range((rows*cols)//n)], (rows, cols))
    #df = df.T.add_prefix('row_').T
    #df = df.add_prefix('col_')
    return df

params = {
    'cols': 8,
    'rows': 4,
    'n': 4
}
df = create_df(**params)
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  3  2
1  0  2  1  3  0  2  3  1
2  0  3  1  2  0  3  2  1
3  1  0  2  3  1  0  3  2

# Correct on this Axis:
>>> df.apply(lambda x: x.value_counts(), axis=1)
   0  1  2  3
0  2  2  2  2
1  2  2  2  2
2  2  2  2  2
3  2  2  2  2

# Incorrect on this Axis:
>>> df.apply(lambda x: x.value_counts())
     0  1    2    3    4  5    6    7
0  3.0  1  NaN  NaN  3.0  1  NaN  NaN
1  1.0  1  2.0  NaN  1.0  1  NaN  2.0
2  NaN  1  2.0  1.0  NaN  1  1.0  2.0
3  NaN  1  NaN  3.0  NaN  1  3.0  NaN
So, I have the conditions I need on one axis, but not on the other.
How can I update my method/create a method to meet both conditions?
Answer
You can tile your input and use a custom roll to shift each row independently:
import numpy as np
import pandas as pd

c = params['cols']
r = params['rows']
n = params['n']
a = np.arange(params['n']) # or any input

# repeat the input c//n times along each row, for r rows
b = np.tile(a, (r, c//n))
# array([[0, 1, 2, 3, 0, 1, 2, 3],
#        [0, 1, 2, 3, 0, 1, 2, 3],
#        [0, 1, 2, 3, 0, 1, 2, 3],
#        [0, 1, 2, 3, 0, 1, 2, 3]])

# row i reads its columns shifted by i; negative indices wrap to the end of the row
idx = np.arange(r)[:, None]
shift = (np.tile(np.arange(c), (r, 1)) - np.arange(r)[:, None])

df = pd.DataFrame(b[idx, shift])
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  2  3
1  3  0  1  2  3  0  1  2
2  2  3  0  1  2  3  0  1
3  1  2  3  0  1  2  3  0
Alternative order:
idx = np.arange(r)[:, None]
shift = (np.tile(np.arange(c), (r, 1)) + np.arange(r)[:, None]) % c

df = pd.DataFrame(b[idx, shift])
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  2  3
1  1  2  3  0  1  2  3  0
2  2  3  0  1  2  3  0  1
3  3  0  1  2  3  0  1  2
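When the input is simply `np.arange(n)`, this second ordering puts `(i + j) % n` at row `i`, column `j`, so the same frame can be built without tiling. The following is a sketch under that assumption; `cyclic_frame` is a hypothetical name, and the `row_`/`col` labels are added only to match the layout requested in the question. It also assumes both `cols` and `rows` are multiples of `n`, since otherwise the counts cannot be equal on both axes.

```python
import numpy as np
import pandas as pd


def cyclic_frame(cols, rows, n):
    """Build a rows x cols frame whose entry at (i, j) is (i + j) % n."""
    # Assumption: both dimensions are multiples of n, so each value can
    # appear equally often in every row and every column.
    if cols % n or rows % n:
        raise ValueError("cols and rows must be multiples of n")
    # Broadcasting a column of row numbers against a row of column numbers
    grid = (np.arange(rows)[:, None] + np.arange(cols)) % n
    return pd.DataFrame(
        grid,
        index=[f"row_{i}" for i in range(rows)],
        columns=[f"col{j}" for j in range(cols)],
    )


params = {'cols': 8, 'rows': 4, 'n': 4}
df = cyclic_frame(**params)
print(df)
```

For arbitrary input values rather than `0..n-1`, the tile-and-index approach remains the more general one.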
Other alternative: use a custom strided_indexing_roll function.
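The function referred to is not reproduced here, so the following is only a guess at the idea: a strided-window roll that shifts each row by its own amount. The name `strided_roll_rows` is hypothetical, and the sketch assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def strided_roll_rows(a, shifts):
    """Roll each row of 2-D array `a` to the right by the matching entry of `shifts`."""
    a = np.asarray(a)
    shifts = np.asarray(shifts)
    n = a.shape[1]
    # Append all but the last column so every rolled row is a contiguous window
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)
    # windows[i, s] is a view of a_ext[i, s:s + n]
    windows = sliding_window_view(a_ext, n, axis=1)
    # A right roll by k starts reading at offset (n - k) % n
    start = (n - shifts) % n
    return windows[np.arange(a.shape[0]), start]


b = np.tile(np.arange(4), (4, 2))
rolled = strided_roll_rows(b, np.arange(4))
print(rolled)
```

Rolling row `i` by `i` positions should reproduce the first output shown in this answer; the windows are views, so only the final indexing step copies data.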