Regex: allow comma-separated strings, including word and non-word characters

I’m finding it difficult to complete this regex.

The following regex checks the validity of comma-separated strings of word characters: ^(\w+)(,\s*\w+)*$

So, this will match the following comma-separated strings:

import re
pattern = re.compile(r"^(\w+)(,\s*\w+)*$")
valid_string = "foo, bar, hey,friend, 56, 7, elephant"
pattern.match(valid_string)  # returns a match object

Then, I can do the same for non-word characters, using ^(\W+)(,\s*\W+)*$, which will match:

import re
pattern = re.compile(r"^(\W+)(,\s*\W+)*$")
valid_string = "%, $, *, $$"
pattern.match(valid_string)  # returns a match object

I would like a regex that matches comma-separated strings whose items may contain special characters, including hyphens and underscores, e.g.

foo-bar, hey_friend, 56-8, 7_88, elephant$n

How could I “combine” \w and \W to accomplish this?

EDIT: Here are some examples of invalid strings:

invalid1 = "aa, b, c d e"

This is invalid because "c d e" is separated by spaces; items must be separated by commas.

Here’s another example:

invalid2 = "a, ,b, c, d"

This is invalid because two commas appear with no item between them; items must be separated by exactly one comma.


Answer

You can use

^[^\s,]+(?:,\s*[^\s,]+)*$

Details

  • ^ – start of string
  • [^\s,]+ – 1 or more chars other than whitespace and commas
  • (?:,\s*[^\s,]+)* – 0 or more occurrences of
    • , – a comma
    • \s* – 0 or more whitespace chars
    • [^\s,]+ – 1 or more chars other than whitespace and commas
  • $ – end of string.
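
As a minimal sketch, the pattern can be checked in Python against the strings from the question. The names ITEM_LIST and is_valid_list are hypothetical helpers introduced here for illustration; they are not part of the original answer.

```python
import re

# Pattern from the answer: items are runs of characters that are
# neither whitespace nor commas, separated by a comma plus optional whitespace.
ITEM_LIST = re.compile(r"^[^\s,]+(?:,\s*[^\s,]+)*$")

def is_valid_list(text):
    """Return True if text is a comma-separated list of non-empty items."""
    return ITEM_LIST.match(text) is not None

samples = [
    "foo-bar, hey_friend, 56-8, 7_88, elephant$n",  # mixed word and non-word chars
    "foo, bar, hey,friend, 56, 7, elephant",        # word chars only
    "%, $, *, $$",                                  # non-word chars only
    "aa, b, c d e",                                 # invalid1: space-separated items
    "a, ,b, c, d",                                  # invalid2: empty item between commas
]

for s in samples:
    print(repr(s), is_valid_list(s))
```

The first three samples should be accepted and the last two rejected. Note that in Python $ also matches just before a trailing newline, so if input such as "a,b\n" must be rejected, ITEM_LIST.fullmatch(text) may be a safer choice than match.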