I have strings that contain a name (sometimes a username) followed by a datetime stamp:
GN1RLWFH0546-2020-04-10-18-09-52-563945.txt
JOHN-DOE-2020-04-10-18-09-52-563946t64.txt
DESKTOP-OHK45JO-2020-04-09-02-27-11-451975.txt
I want to extract the usernames from these strings:
GN1RLWFH0546
JOHN-DOE
DESKTOP-OHK45JO
I have tried different regex patterns; the closest I came was extracting the following:
GN1RLWFH0546
DESKTOP
JOHN
Using the following regex pattern:
names = re.search(r"(?([0-9A-Za-z]+))?", agent_str)
print(names.group(1))
Answer
You can match all text up to the first occurrence of a hyphen, one or more digits, and another hyphen:
^.*?(?=-\d+-)
If the number must be exactly 4 digits (say, if it is a year), then replace + with {4}:
^.*?(?=-\d{4}-)
See the regex demo
Details
- ^ : start of string
- .*? : any 0+ chars other than line break chars, as few as possible
- (?=-\d+-) : up to the first occurrence of a hyphen, 1+ digits (or, if \d{4} is used, exactly four digits) and then another hyphen (this part is not added to the match value, as the positive lookahead is a non-consuming pattern).
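The breakdown above can also be written directly into the pattern. The following is a minimal sketch using Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows inline comments; the name USERNAME_RX is only an illustrative choice, not part of the original answer.

```python
import re

# Same pattern as above, spelled out in verbose mode so each part is commented.
# USERNAME_RX is a hypothetical name chosen for this sketch.
USERNAME_RX = re.compile(
    r"""
    ^           # start of string
    .*?         # any chars other than line breaks, as few as possible
    (?=-\d+-)   # lookahead: hyphen, 1+ digits, hyphen (not part of the match)
    """,
    re.VERBOSE,
)

m = USERNAME_RX.search("JOHN-DOE-2020-04-10-18-09-52-563946t64.txt")
print(m.group() if m else None)  # JOHN-DOE
```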
See Python demo:
import re
strs = ["GN1RLWFH0546-2020-04-10-18-09-52-563945.txt", "JOHN-DOE-2020-04-10-18-09-52-563946t64.txt", "DESKTOP-OHK45JO-2020-04-09-02-27-11-451975.txt"]
rx = re.compile(r"^.*?(?=-\d+-)")
for s in strs:
  m = rx.search(s)
  if m:
    print("{} => '{}'".format(s, m.group()))
Output:
GN1RLWFH0546-2020-04-10-18-09-52-563945.txt => 'GN1RLWFH0546'
JOHN-DOE-2020-04-10-18-09-52-563946t64.txt => 'JOHN-DOE'
DESKTOP-OHK45JO-2020-04-09-02-27-11-451975.txt => 'DESKTOP-OHK45JO'
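The choice between the any-length and the four-digit variant matters only when a username itself contains a purely numeric segment between hyphens. As a rough illustration, assuming a made-up file name (not one from the question) whose username includes "-12-":

```python
import re

# Hypothetical file name: the username part contains a "-12-" segment.
name = "PC-12-LAB-2020-04-10-18-09-52-563945.txt"

any_digits = re.compile(r"^.*?(?=-\d+-)")      # stops at the first -digits-
four_digits = re.compile(r"^.*?(?=-\d{4}-)")   # stops only at a 4-digit group

print(any_digits.search(name).group())   # PC
print(four_digits.search(name).group())  # PC-12-LAB
```

If usernames can contain a four-digit segment as well, neither variant is enough on its own and the lookahead would need to describe more of the timestamp.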