
Replace whitespace with commas in a multiline string (docstring), keeping the line breaks

I have a multiline string (and not a text file) like this:

x = '''
Index    Value         Max     Min    State
0    10    nan         nan     nan
1    20    nan         nan     nan    
2    15    nan         nan     nan     
3    25    20          10      1
4    15    25          15      2
5    10    25          15      4
6    15    20          10      3    
'''

The columns are separated by unequal amounts of whitespace.

I want to replace the whitespace with a comma, but keep the end-of-line.

So the result would look like this:

Index,Value,Max,Min,State
0,10,nan,nan,nan
1,20,nan,nan,nan    
2,15,nan,nan,nan     
3,25,20,10,1
4,15,25,15,2
5,10,25,15,4
6,15,20,10,3    

…or alternatively as a pandas dataframe.

What I have tried

  • I can use replace() with different runs of spaces, but then I need to count the whitespace in each gap.
  • I can use the re module (from this re.sub question), but it converts the whole string to one line, whereas I need to keep the separate lines (end-of-line).


Answer

Try pandas.read_csv with StringIO, which lets pandas read the string as if it were a file:

from io import StringIO
import pandas as pd


x = '''
Index    Value         Max     Min    State
0    10    nan         nan     nan
1    20    nan         nan     nan    
2    15    nan         nan     nan     
3    25    20          10      1
4    15    25          15      2
5    10    25          15      4
6    15    20          10      3    
'''

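# sep=r'\s+' is a regex that splits on any run of whitespace,
# so the uneven gaps between columns do not matter
df = pd.read_csv(StringIO(x), sep=r'\s+', engine='python')
print(df)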

   Index  Value   Max   Min  State
0      0     10   NaN   NaN    NaN
1      1     20   NaN   NaN    NaN
2      2     15   NaN   NaN    NaN
3      3     25  20.0  10.0    1.0
4      4     15  25.0  15.0    2.0
5      5     10  25.0  15.0    4.0
6      6     15  20.0  10.0    3.0
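
If the comma-separated text itself is wanted rather than a DataFrame, a regex applied line by line is one option. The likely reason re.sub collapsed everything into one line is that \s also matches the newline character; restricting the pattern to spaces and tabs avoids that. The following is a minimal sketch, not the original answer's method; the helper name whitespace_to_commas is made up for illustration, and it assumes that trailing spaces on each line can be discarded.

```python
import re

x = '''
Index    Value         Max     Min    State
0    10    nan         nan     nan
1    20    nan         nan     nan    
2    15    nan         nan     nan     
3    25    20          10      1
4    15    25          15      2
5    10    25          15      4
6    15    20          10      3    
'''


def whitespace_to_commas(text):
    """Replace each run of spaces/tabs inside a line with one comma.

    Hypothetical helper: line breaks are kept, while leading and
    trailing whitespace on every line is dropped (an assumption).
    """
    converted = []
    for line in text.strip().splitlines():
        # [ \t]+ matches spaces and tabs only, never the newline
        converted.append(re.sub(r'[ \t]+', ',', line.strip()))
    return '\n'.join(converted)


csv_text = whitespace_to_commas(x)
print(csv_text)
```

Starting from the DataFrame instead, df.to_csv(index=False) also returns comma-separated text, but by default it writes NaN as an empty field and the float columns as 20.0, 10.0 and so on, so the result would not match the layout asked for in the question.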
User contributions licensed under: CC BY-SA