
Python, remove all non-alphabet chars from string

I am writing a Python MapReduce word count program. The problem is that there are many non-alphabet chars strewn about in the data. I have found the post "Stripping everything but alphanumeric chars from a string in Python", which shows a nice solution using regex, but I am not sure how to implement it.


I’m afraid I am not sure how to use the re library, or even regex for that matter. I am not sure how to properly apply the regex pattern to the incoming string v (a line of a book) to retrieve the new line without any non-alphanumeric chars.

Suggestions?


Answer

Use re.sub
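A minimal sketch of how re.sub could be applied here, assuming "non-alphabet" means anything outside ASCII a-z and A-Z. The function names and the sample strings are illustrative, not taken from the question. Since a word count needs word boundaries, the second helper also keeps whitespace so the line can still be split into words:

```python
import re

# Assumption: only ASCII letters count as "alphabet" characters.
NON_ALPHA = re.compile(r"[^a-zA-Z]")
# Same idea, but whitespace is kept so words stay separated.
NON_ALPHA_OR_SPACE = re.compile(r"[^a-zA-Z\s]")


def strip_non_alpha(line):
    """Remove every character that is not an ASCII letter."""
    # First argument is the replacement, second is the input string.
    return NON_ALPHA.sub("", line)


def words_only(line):
    """Remove non-letters but keep whitespace, then split into words."""
    return NON_ALPHA_OR_SPACE.sub("", line).split()


print(strip_non_alpha("ab3d*E"))       # abdE
print(words_only("Hello, world! 42"))  # ['Hello', 'world']
```

In a mapper, something like words_only(v) would be called on each incoming line before emitting the per-word counts.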


Alternatively, if you only want to remove a certain set of characters (as an apostrophe might be okay in your input…)
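As a rough sketch of that alternative, the pattern can list just the characters to drop instead of negating the letters. The particular set of punctuation below is an assumption chosen for illustration; adjust it to whatever actually appears in the data:

```python
import re

# Hypothetical set of characters to remove; the apostrophe is deliberately
# left out so that words like "don't" survive intact.
PUNCTUATION = re.compile(r"[,.!?;:]")


def strip_punctuation(line):
    """Remove only the listed punctuation characters from line."""
    return PUNCTUATION.sub("", line)


print(strip_punctuation("Don't stop, believe!"))  # Don't stop believe
```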
