I have to write a single function that should return the first word in the following strings:
("Hello world") -> return "Hello"
(" a word ") -> return "a"
("don't touch it") -> return "don't"
("greetings, friends") -> return "greetings"
("... and so on ...") -> return "and"
("hi") -> return "hi"
Each call has to return the first word. As you can see, some strings start with whitespace or punctuation, some contain apostrophes, and some have a comma right after the first word.
I've tried the following options:
return text.split()[0]
return re.split(r'\w*', text)[0]
Both fail on some of the strings. How can I fix this?
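To make the failure concrete, here is a small check of the first attempt against the example strings. The names `examples` and `results` are only illustrative. Splitting on whitespace handles leading spaces, but it leaves trailing punctuation attached and treats a run of dots as a word.

```python
# Illustrative check of text.split()[0] on the example strings.
examples = [
    "Hello world",
    " a word ",
    "don't touch it",
    "greetings, friends",
    "... and so on ...",
    "hi",
]

# Map each input to what the whitespace-split attempt returns.
results = {text: text.split()[0] for text in examples}

for text, word in results.items():
    print(repr(text), "->", repr(word))
```

The two problem cases are "greetings, friends", which keeps its comma, and "... and so on ...", which yields the dots instead of "and".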
Answer
It is tricky to distinguish apostrophes which are supposed to be part of a word and single quotes which are punctuation for the syntax. But since your input examples do not show single quotes, I can go with this:
re.match(r'\W*(\w[^,. !?"]*)', text).groups()[0]
This works for all your examples. It won't work for atypical input like "'tis all in vain!", though: the leading apostrophe is skipped as a non-word character, so the result is "tis". The pattern assumes that words end at commas, dots, spaces, bangs, question marks, and double quotes. This list can be extended as needed (inside the brackets).
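Wrapped up as a function, the approach could look like the sketch below. The function name `first_word`, the precompiled `_FIRST_WORD` pattern, and the choice to return an empty string when the text contains no word character are assumptions added here, not part of the question's requirements. The expression above would raise an AttributeError in that last case, because re.match returns None.

```python
import re

# \W* skips leading non-word characters (spaces, dots, quotes, ...).
# The group then takes one word character followed by everything up to
# the next comma, dot, space, bang, question mark, or double quote.
_FIRST_WORD = re.compile(r'\W*(\w[^,. !?"]*)')


def first_word(text):
    """Return the first word of text, or "" if there is none (assumed fallback)."""
    match = _FIRST_WORD.match(text)
    return match.group(1) if match else ""


if __name__ == "__main__":
    for sample in ["Hello world", " a word ", "don't touch it",
                   "greetings, friends", "... and so on ...", "hi"]:
        print(repr(sample), "->", repr(first_word(sample)))
```

Note that the bracketed list does not include tabs or newlines, so those would need to be added if the input can contain them.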