
How to save pandas text manipulation as CSV in the correct form

I have a *.txt file with numbers and want to eliminate the spaces. The raw data looks like this:

12 12345 1234
23 23456 234

If I use the following

data = data[0].str.replace(" ", "")
data.update("'" + data + "',")

I get

 '1234123451234',
 '2323456234',

which I want. But if I save it to csv with

data.to_csv("/Users/k/file.txt", header=None, index=None, mode='a')

I get as file values:

 "'1234123451234',"
 "'2323456234',"

If I use quoting=csv.QUOTE_NONE or 3 (same)

data.to_csv("Users/k/file.txt", header=None, index=None, quoting=3, escapechar="\\", mode='a')

The file looks like:

 '1234123451234'\,
 '2323456234'\,

Just using space or nothing as escapechar does not work.

If I just remove the spaces without adding quotes or commas and then save via:

data.to_csv("Users/k/file.txt", header=None, index=None, mode='a', quoting=1, sep=",")

I get:

 "1234123451234"
 "2323456234"

missing the comma.

Adding only the comma and saving as above gets me

 "1234123451234,"
 "2323456234,"

wrong place :-)

As you can see, I am getting mad over missing my target by inches, while it is most likely super easy. I probably will switch to regex :-)


Answer

The output is expected. As you manually added ' to the strings, they are retained in the output. There’s no need to manually add quotation marks at all, just set the correct options for to_csv:

>>> df = pd.DataFrame(["12 12345 1234", "23 23456 234"])
>>> df
               0
0  12 12345 1234
1   23 23456 234

>>> df[0] = df[0].str.replace(" ", "")
>>> df
             0
0  12123451234
1   2323456234

>>> # pandas >= 1.5 names this parameter lineterminator; line_terminator was removed in 2.0
>>> df.to_csv("output.csv", quoting=1, quotechar="'", line_terminator=',\n')

# output.csv
'','0',
'0','12123451234',
'1','2323456234',
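
To see why the hand-built strings end up wrapped in double quotes, here is a minimal sketch using only Python's standard csv module, which pandas' to_csv relies on for writing. The helper name render and the in-memory buffer are assumptions made for illustration; they are not part of the original question or answer.

```python
import csv
import io

raw_lines = ["12 12345 1234", "23 23456 234"]
cleaned = [line.replace(" ", "") for line in raw_lines]


def render(rows, **writer_options):
    """Hypothetical helper: write one-field rows with csv.writer, return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, **writer_options)
    for value in rows:
        writer.writerow([value])
    return buffer.getvalue()


# Quotes and comma added by hand: each field now contains the delimiter (","),
# so the default QUOTE_MINIMAL wraps the whole field in double quotes.
manual = render(["'" + value + "'," for value in cleaned], lineterminator="\n")

# Let the writer do the quoting, and make the trailing comma part of the line ending.
configured = render(
    cleaned,
    quoting=csv.QUOTE_ALL,
    quotechar="'",
    lineterminator=",\n",
)

print(manual)
print(configured)
```

The first print shows the double-quoted lines from the question; the second shows each value in single quotes followed by a comma. With pandas, passing header=False and index=False to to_csv alongside the same quoting, quotechar and line-ending options should drop the extra header row and index column seen in output.csv above.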
User contributions licensed under: CC BY-SA