
Print side by side .describe() in pandas

Hello, I have two columns and I'm calling describe() on each of them to get their stats. I have something like this:

import pandas as pd

x = pd.Series([1, 3, 4, 6, 7])
y = pd.Series([75, 324, 234, 42])
desk1 = x.describe()
desk2 = y.describe()

I want to print desk1 and desk2 side by side, each under its own heading. I am doing this:

print("desk1 stats", end="\t\t")
print("desk1 stats")
print(desk1, end="\t\t")
print(desk2)

I get this :

desk1 stats     desk1 stats
count    5.000000
mean     4.200000
std      2.387467
min      1.000000
25%      3.000000
50%      4.000000
75%      6.000000
max      7.000000
dtype: float64      count      4.000000
mean     168.750000
std      133.185022
min       42.000000
25%       66.750000
50%      154.500000
75%      256.500000
max      324.000000
dtype: float64

My desired output is this:

desk1 stats     desk1 stats
count    5.000000  count      4.000000
mean     4.200000  mean     168.750000
std      2.387467  std      133.185022
min      1.000000  min       42.000000
25%      3.000000  25%       66.750000
50%      4.000000  50%      154.500000
75%      6.000000  75%      256.500000
max      7.000000  max      324.000000
dtype: float64     dtype: float64

I would prefer not to create a DataFrame. Any solutions? Thanks in advance.


Answer

What about using pd.concat? (Note that this does build a DataFrame, though only for printing.)

print(pd.concat([desk1, desk2],
                 keys=['desk1 description', 'desk2 description'],
                 axis=1))

output:

       desk1 description  desk2 description
count           5.000000           4.000000
mean            4.200000         168.750000
std             2.387467         133.185022
min             1.000000          42.000000
25%             3.000000          66.750000
50%             4.000000         154.500000
75%             6.000000         256.500000
max             7.000000         324.000000

For a purer Python solution that joins the printed lines without building a DataFrame:

desks = [desk1, desk2]

print('\n'.join(map('    '.join, zip(*(d.to_string().split('\n')
                for d in desks)))))

output:

count    5.000000    count      4.000000
mean     4.200000    mean     168.750000
std      2.387467    std      133.185022
min      1.000000    min       42.000000
25%      3.000000    25%       66.750000
50%      4.000000    50%      154.500000
75%      6.000000    75%      256.500000
max      7.000000    max      324.000000

Intermediate steps of the second solution:

[d.to_string().split('\n') for d in desks]

[['count    5.000000', 'mean     4.200000', 'std      2.387467', 'min      1.000000', '25%      3.000000', '50%      4.000000', '75%      6.000000', 'max      7.000000'],
 ['count      4.000000', 'mean     168.750000', 'std      133.185022', 'min       42.000000', '25%       66.750000', '50%      154.500000', '75%      256.500000', 'max      324.000000']]

list(zip(*(d.to_string().split('\n') for d in desks)))

[('count    5.000000', 'count      4.000000'),
 ('mean     4.200000', 'mean     168.750000'),
 ('std      2.387467', 'std      133.185022'),
 ('min      1.000000', 'min       42.000000'),
 ('25%      3.000000', '25%       66.750000'),
 ('50%      4.000000', '50%      154.500000'),
 ('75%      6.000000', '75%      256.500000'),
 ('max      7.000000', 'max      324.000000')]

list(map('    '.join, zip(*(d.to_string().split('\n') for d in desks))))

['count    5.000000    count      4.000000',
 'mean     4.200000    mean     168.750000',
 'std      2.387467    std      133.185022',
 'min      1.000000    min       42.000000',
 '25%      3.000000    25%       66.750000',
 '50%      4.000000    50%      154.500000',
 '75%      6.000000    75%      256.500000',
 'max      7.000000    max      324.000000']
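
If you also want the headings from the question above each block, the same line-zipping idea can be wrapped in a small helper. This is only a sketch: side_by_side, its gap parameter and the 'desk2 stats' title are names made up for illustration, not part of pandas. It pads every line to the width of its column so the next column starts at the same offset, and it uses itertools.zip_longest so that descriptions with different numbers of rows (for example describe() on a string column) do not get truncated.

```python
from itertools import zip_longest

import pandas as pd


def side_by_side(blocks, titles, gap=4):
    """Return the text of several Series laid out in parallel columns.

    Hypothetical helper for illustration; not a pandas function.
    """
    columns = []
    for title, block in zip(titles, blocks):
        lines = [title] + block.to_string().split('\n')
        width = max(len(line) for line in lines)
        # Pad every line to the column width so the next column lines up.
        columns.append([line.ljust(width) for line in lines])
    widths = [len(col[0]) for col in columns]
    rows = []
    # zip_longest keeps going until the longest column is exhausted.
    for row in zip_longest(*columns):
        cells = [cell if cell is not None else ' ' * w
                 for cell, w in zip(row, widths)]
        rows.append((' ' * gap).join(cells).rstrip())
    return '\n'.join(rows)


desk1 = pd.Series([1, 3, 4, 6, 7]).describe()
desk2 = pd.Series([75, 324, 234, 42]).describe()
print(side_by_side([desk1, desk2], ['desk1 stats', 'desk2 stats']))
```

Because Series.to_string() leaves out the dtype line by default, the "dtype: float64" footer from the desired output will not appear; passing dtype=True to to_string() should bring it back if you need it.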
User contributions licensed under: CC BY-SA