I am trying to reshape the following dataset in Python 3 / pandas:
   Rank    Maj  Rank   Maj  Rank    Maj  Rank    Maj  Rank     Maj  Rank     Maj
0  2.00  31.92  3.00  0.00  4.00  33.72  5.00  24.89  6.00  0.00.1  7.00  148.35
1     8  28.26     9     0    10   5.96    11   7.66    12       0    13    6.19
2    14   5.63    15     0    16  17.43    17  26.73    18       0    19    84.7
3    20  25.98    21     0    22   8.65    23   6.38    24       0    25    3.98
4    26   2.44    27     0    28   3.43    29   2.75    30       0    31     1.8
5    32   1.46    33     0    34   1.79    35   2.49    36       0    37    2.51
6    38   1.85    39     0    40   1.48    41   1.05    42       0    43    0.56
7    44   0.36    45     0    46   0.31    47    0.2    49    0.32    50     0.2
into a DataFrame whose first column (or index) is the rank and whose second column holds all the Maj values. Something like this:
Rank    Maj
2.00  31.92
   8  28.26
  14   5.63
  20  25.98
  26   2.44
  32   1.46
  38   1.85
  44   0.36
3.00   0.00
   9      0
  15      0
  21      0
  27      0
  33      0
  39      0
  45      0
…
  13   6.19
  19   84.7
  25   3.98
  31    1.8
  37   2.51
  43   0.56
  50    0.2
I tried to do this with a pivot table:
table.pivot_table(index="Rank", columns="Maj")
But I get the following error:
Traceback (most recent call last):
  File "ReadReport.py", line 42, in <module>
    table.pivot_table(index = "Rank", columns = "Maj")
  File "C:\Python38-32\lib\site-packages\pandas\core\frame.py", line 6070, in pivot_table
    return pivot_table(
  File "C:\Python38-32\lib\site-packages\pandas\core\reshape\pivot.py", line 95, in pivot_table
    values = values.drop(key)
  File "C:\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 5013, in drop
    indexer = self.get_indexer(labels)
  File "C:\Python38-32\lib\site-packages\pandas\core\indexes\base.py", line 2733, in get_indexer
    raise InvalidIndexError(
pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
But I do not have any duplicated values in Rank; it goes from 2 to 50.
My main goal is to print Rank over Maj.
Thanks for your help
Answer
You can use np.reshape:
print(pd.DataFrame(df.to_numpy().reshape((-1, 2)), columns=["Rank", "Maj"]))

  Rank     Maj
0    2   31.92
1    3       0
2    4   33.72
3    5   24.89
4    6  0.00.1
5    7  148.35
6    8   28.26
7    9       0
8   10    5.96
9   11    7.66
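Note that the reshape in the answer walks the table row by row (2, 3, 4, ...), while the question asks for each Rank/Maj pair to be stacked in turn (2, 8, 14, ..., then 3, 9, 15, ...). Here is a minimal sketch of both orderings. It assumes the real table has the same layout as the posted one: repeated "Rank"/"Maj" column labels with numeric values. The small `data` sample and the names `row_major` and `pair_major` are made up for illustration. The sketch also checks `df.columns.is_unique`, because the repeated column labels (not the Rank values) are the likely cause of the `InvalidIndexError` in the traceback.

```python
import pandas as pd

# Hypothetical stand-in for the asker's table: three Rank/Maj pairs per row,
# with the column labels "Rank" and "Maj" repeated.
data = [
    [2, 31.92, 3, 0.0, 4, 33.72],
    [8, 28.26, 9, 0.0, 10, 5.96],
    [14, 5.63, 15, 0.0, 16, 17.43],
]
df = pd.DataFrame(data, columns=["Rank", "Maj"] * 3)

# Repeated labels make the column index non-unique, which is the likely
# trigger for "Reindexing only valid with uniquely valued Index objects".
print(df.columns.is_unique)  # False

values = df.to_numpy()
n_rows, n_cols = values.shape

# Row-major order, as in the answer: Rank runs 2, 3, 4, 8, 9, 10, ...
row_major = pd.DataFrame(values.reshape(-1, 2), columns=["Rank", "Maj"])

# Pair-by-pair order, as in the question: Rank runs 2, 8, 14, 3, 9, 15, ...
# View each row as a list of (Rank, Maj) pairs, move the pair axis to the
# front, then flatten back to two columns.
pair_major = pd.DataFrame(
    values.reshape(n_rows, n_cols // 2, 2).swapaxes(0, 1).reshape(-1, 2),
    columns=["Rank", "Maj"],
)
pair_major["Rank"] = pair_major["Rank"].astype(int)

print(pair_major)
```

If Rank should be the index rather than a column, `pair_major.set_index("Rank")` would do that once the frame has unique column labels.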