I am reading a large CSV file line by line and I want to count the number of delimiters in each line.
But if the delimiter is part of a data value, it should not be counted.
A few records from the data set:
com.abc.xyz, ple Sara, "DIT, Government of Maharashtra, India"
com.mtt.rder, News Maharashtra, Time Internet Limited"
com.grner.mahya, Mh Swth, "Public Health Department, Maharashtra"
In all 3 lines, the number of actual commas (the ones that divide the data into columns) is only 2,
but the code snippet below outputs:
- 4 commas for line 1
- 2 for line 2
- 3 for line 3
Code Snippet:
file1 = open('file_name.csv', 'r')

while True:
    line = file1.readline()

    if not line:
        break

    print(line.count(','))
Answer
One simple way is to use a regex to remove everything between a pair of double quotes ("), so that the commas inside aren't counted.
import re

# Strip every double-quoted section (non-greedy match) before counting,
# so that commas inside quoted values are ignored.
with open('input.csv', 'r') as file1:
    for line in file1:
        line = re.sub(r'".*?"', '', line)
        print(line.count(','))
Output:
2
2
2
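
As an alternative to the regex, the standard-library csv module can handle the quoting itself, and the delimiter count is then the number of parsed fields minus one. This is only a sketch, not part of the original answer: `count_delimiters` and the in-memory `sample` are names made up for illustration, and it assumes the file uses `, ` (comma plus space) before quoted values as in the records above, which is why `skipinitialspace=True` is passed.

```python
import csv
import io


def count_delimiters(rows):
    """Yield the number of field-separating commas for each parsed CSV record.

    `rows` is any iterable of strings, e.g. an open file object.
    (Hypothetical helper, named here for illustration only.)
    """
    # skipinitialspace=True makes the parser ignore the space after each comma,
    # so a quote that follows ", " is recognised as the start of a quoted field.
    reader = csv.reader(rows, skipinitialspace=True)
    for fields in reader:
        # n fields are separated by n - 1 delimiters; a blank line counts as 0.
        yield max(len(fields) - 1, 0)


# The three sample records from the question, held in memory instead of a file.
sample = io.StringIO(
    'com.abc.xyz, ple Sara, "DIT, Government of Maharashtra, India"\n'
    'com.mtt.rder, News Maharashtra, Time Internet Limited"\n'
    'com.grner.mahya, Mh Swth, "Public Health Department, Maharashtra"\n'
)

counts = list(count_delimiters(sample))
print(counts)  # [2, 2, 2]
```

To run it on the real file, pass the file object instead of `sample`, for example `with open('input.csv', newline='') as f: print(list(count_delimiters(f)))`. Unlike the line-by-line regex, the csv reader also copes with a quoted value that spans more than one line, in which case it reports one count per record rather than per physical line.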