Here's a string. I want to remove the C-style comment, including the comment text itself, without using regex.
a = "word234 /*12aaa12*/"
I want the output to be just:
word234
Answer
Here is a simple algorithm that remembers the previous character, so it can spot the two-character markers /* and */, and uses a flag to decide whether to keep each character.
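a = "word234 /*12aaa12*/ word123 /*xx*xx*/ end"
out = []
add = True    # True while we are outside a comment
prev = None   # previous character, used to spot "/*" and "*/"
for c in a:
    if add and c == '*' and prev == '/':
        # "/*" opens a comment: drop the '/' that was already kept
        del out[-1]
        add = False
        prev = None  # this '*' must not also count as the start of "*/"
        continue
    if not add and c == '/' and prev == '*':
        # "*/" closes the comment
        add = True
        prev = None  # this '/' must not also count as the start of "/*"
        continue
    prev = c
    if add:
        out.append(c)
s2 = ''.join(out)
print(s2)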
Output (the space on each side of a removed comment is kept, so two spaces remain):
word234  word123  end
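For comparison, here is a minimal sketch of another regex-free approach that jumps from marker to marker with str.find instead of walking character by character. The function name strip_c_comments is hypothetical, and the sketch assumes that an unterminated comment runs to the end of the string.

```python
def strip_c_comments(text):
    """Remove non-nested /* ... */ comments using str.find (no regex)."""
    parts = []
    pos = 0
    while True:
        start = text.find("/*", pos)
        if start == -1:
            parts.append(text[pos:])   # no more comments: keep the rest
            break
        parts.append(text[pos:start])  # keep the text before the comment
        end = text.find("*/", start + 2)
        if end == -1:
            break  # assumption: an unterminated comment swallows the rest
        pos = end + 2                  # resume right after the "*/"
    return "".join(parts)


print(strip_c_comments("word234 /*12aaa12*/ word123 /*xx*xx*/ end"))
```

For the string in the question this returns "word234 " with a trailing space, so call .rstrip() on the result if you want exactly word234.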
If you want to handle nested comments (standard C does not allow them, but this is fun to do), the algorithm is easy to modify by replacing the flag with a counter that tracks the nesting depth:
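a = "word234 /*12aaa12*/ word123 /*xx/*yy*/xx*/ end"
out = []
lvl = 0       # nesting depth: 0 means we are outside any comment
prev = None   # previous character, used to spot "/*" and "*/"
for c in a:
    if c == '*' and prev == '/':
        # "/*" opens a comment, possibly inside another one
        if lvl == 0:
            del out[-1]  # drop the '/' that was already kept
        lvl += 1
        prev = None  # this '*' must not also count as the start of "*/"
        continue
    if lvl > 0 and c == '/' and prev == '*':
        # "*/" closes the innermost open comment
        lvl -= 1
        prev = None  # this '/' must not also count as the start of "/*"
        continue
    prev = c
    if lvl == 0:
        out.append(c)
s2 = ''.join(out)
print(s2)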
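The depth-counting idea can also be wrapped in a reusable function. The sketch below is one possible variant, not the only way to do it: it scans by index and uses str.startswith to test for the two-character markers, which avoids tracking the previous character. The name strip_nested_comments is hypothetical, and a stray */ outside any comment is assumed to be ordinary text.

```python
def strip_nested_comments(text):
    """Remove /* ... */ comments, allowing them to nest."""
    out = []
    depth = 0  # number of comments currently open
    i = 0
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1        # entering a (possibly nested) comment
            i += 2
        elif depth > 0 and text.startswith("*/", i):
            depth -= 1        # leaving the innermost comment
            i += 2
        else:
            if depth == 0:    # keep characters only outside comments
                out.append(text[i])
            i += 1
    return "".join(out)


print(strip_nested_comments("word234 /*12aaa12*/ word123 /*xx/*yy*/xx*/ end"))
```

As with the flat version, the spaces on both sides of each removed comment survive, so the printed result contains two spaces where a comment used to be.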