Regex find hash comments exclude curly brackets

I want to find hash comments while ignoring any # that appears inside curly braces. However, if a line starts with #, the whole line is a comment, braces included.

Input:

# This is comment 1
TEST # This is comment 2
{  # Not comment}; { # Not comment } # This is comment 3

# This is comment 4
TEST Not comment
{  # Not comment}; { # Not comment } # This is comment 5

{# Not comment}; { # Not comment }={# Not comment} {# Not comment}={# Not comment} {# Not comment}={# Not comment} # This is comment 6
# This is comment 7 {# This is comment 8}

# This is comment 9

Not comment {Not comment}={# Not comment} 

Output:

# This is comment 1
# This is comment 2
# This is comment 3
# This is comment 4
# This is comment 5
# This is comment 6
# This is comment 7 {# This is comment 8}
# This is comment 9

I would like to do this with a single regular expression. So far I can only find the comments with the pattern {.*?}(*SKIP)(*F)|#[^#{}\n].* However, it doesn’t work in Python (I would prefer a solution that does not rely on PCRE).

If possible, I hope it can be done without installing the ‘regex’ module to get PCRE (Perl Compatible Regular Expressions) features.

Answer

You can anchor the regex at the start of a line, and then look for:

  • a string that starts with a # and contains no newlines: "(#[^\n]+)"

optionally preceded by:

  • one or more characters (other than a newline) followed by a space: "(.+ )?"

Here’s the final regex:

"(^|\n)(.+ )?(#[^\n]+)"

Tested at https://regex101.com. The comment itself is captured in group 3.
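As a rough sketch of how the pattern above might be used from Python's built-in re module, the snippet below runs it over an abridged version of the sample input from the question and collects group 3 of each match. The function name comments_via_answer_regex and the variable SAMPLE are made up for this illustration.

```python
import re

# The pattern from this answer, with the newline escapes written out.
PATTERN = re.compile(r"(^|\n)(.+ )?(#[^\n]+)")

# Abridged sample input taken from the question.
SAMPLE = """\
# This is comment 1
TEST # This is comment 2
TEST Not comment
{  # Not comment}; { # Not comment } # This is comment 5
{# Not comment}; { # Not comment }={# Not comment} # This is comment 6
# This is comment 7 {# This is comment 8}
Not comment {Not comment}={# Not comment}
"""


def comments_via_answer_regex(text):
    """Return group 3 (the comment text) of every match in `text`."""
    return [m.group(3) for m in PATTERN.finditer(text)]


for comment in comments_via_answer_regex(SAMPLE):
    print(comment)
```

Because "(.+ )?" is greedy, group 3 starts at the last "#" on the line that is preceded by a space, which is why the braced blocks before comments 5 and 6 end up in group 2 rather than in the comment.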

Does it work for you?
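One caveat: the pattern above relies on the real comment being the last "#" preceded by a space, so a line such as "x { # Not comment }" with no trailing comment would still be reported. If that matters, a different technique (not the regex from this answer) is the usual re-only substitute for (*SKIP)(*F): match the braced blocks first in one alternative, capture the comment in the other, and keep only the matches where the capture group is set. This is a minimal sketch that assumes braces are not nested and do not span lines; the function name comments_skipping_braces is hypothetical.

```python
import re

# First alternative: consume a whole {...} block without capturing it.
# Second alternative: capture a comment that runs to the end of the line.
# Assumption: braces are not nested and never span several lines.
PATTERN = re.compile(r"\{[^{}\n]*\}|(#.*)")


def comments_skipping_braces(text):
    """Return every '#' comment that is not inside a {...} block."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1)]


sample = (
    "# c1\n"
    "TEST # c2\n"
    "{ # no }; {# no} # c3\n"
    "# c4 {# c5}\n"
    "x { # no }\n"
    "Not {a}={# no}\n"
)

for comment in comments_skipping_braces(sample):
    print(comment)
```

Since the scan runs left to right, a line that starts with "#" is captured whole before any brace in it is examined, which matches the rule in the question that such a line is all comment.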

User contributions licensed under: CC BY-SA