Python regex match string of 8 characters that contain both alphabets and numbers

I am trying to match a string of length 8 containing both digits and letters (it cannot be only digits or only letters) using re.findall. The string can start with either a letter or a digit, followed by any combination.

e.g.-

Input String: The reference number is 896av6uf and not 87987647 or ahduhsjs or hn0.

Output: ['896av6uf','a96bv6u0']

I came up with the regex r'([a-z]+[\d]+[\w]*|[\d]+[a-z]+[\w]*)', but it also returns strings with fewer than 8 characters. I need to modify the regex so that it returns only strings of exactly 8 characters that contain both letters and digits.

Answer

A more compact solution than others have suggested is this:

((?![A-Za-z]{8}|[0-9]{8})[0-9A-Za-z]{8})

This guarantees that every match is exactly 8 characters long and is neither all digits nor all letters.

Breakdown:

  • (?![A-Za-z]{8}|[0-9]{8}) = A negative lookahead: a match cannot begin at a position where the next 8 characters are all letters or all digits.
  • [0-9A-Za-z]{8} = Matches exactly 8 alphanumeric characters.
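
To see the two parts of the breakdown working together, here is a minimal sketch that checks individual candidate strings with re.fullmatch instead of searching a sentence. The helper name is_mixed_8 is hypothetical and only used for illustration.

```python
import re

# Same pattern as in the answer, without the outer capturing group.
PATTERN = re.compile(r'(?![A-Za-z]{8}|[0-9]{8})[0-9A-Za-z]{8}')

def is_mixed_8(candidate):
    """Return True if candidate is exactly 8 alphanumeric characters
    and contains at least one letter and at least one digit."""
    # fullmatch requires the whole string to match, so length must be 8.
    return PATTERN.fullmatch(candidate) is not None

# Hypothetical candidates: one mixed, one all digits, one all letters,
# one too short, one too long.
for s in ['896av6uf', '87987647', 'ahduhsjs', 'hn0', '896av6uf9']:
    print(s, is_mixed_8(s))
```

Only the first candidate passes: the lookahead rejects the all-digit and all-letter strings, and the {8} quantifier combined with fullmatch rejects the ones with the wrong length.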

Test Case:

Input: 12345678 abcdefgh i8D0jT5Yu6Ms1GNmrmaUjicc1s9D93aQBj3WWWjww54gkiKqOd7Ytkl0MliJy9xadAgcev8b2UKdfGRDOpxRPm30dw9GeEz3WPRO 1234567890987654321 qwertyuiopasdfghjklzxcvbnm

import re

pattern = re.compile(r'((?![A-Za-z]{8}|\d{8})[A-Za-z\d]{8})')

test = input()
match = pattern.findall(test)
print(match)

Output: ['i8D0jT5Y', 'u6Ms1GNm', 'maUjicc1', 's9D93aQB', 'j3WWWjww', '54gkiKqO', 'd7Ytkl0M', 'liJy9xad', 'Agcev8b2', 'DOpxRPm3', '0dw9GeEz']
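
Note that, as the output above shows, findall with this pattern also returns 8-character slices cut out of longer alphanumeric runs. If only standalone 8-character tokens are wanted, as in the question's example sentence, one possible variant is to add \b word boundaries around the pattern. This is an assumption about the intended behavior and not part of the original answer; the name token_pattern is illustrative.

```python
import re

# Variant with \b word boundaries so that only whole 8-character tokens match.
# The boundaries are an addition for illustration, not part of the original answer.
token_pattern = re.compile(r'\b(?![A-Za-z]{8}|\d{8})[A-Za-z\d]{8}\b')

text = "The reference number is 896av6uf and not 87987647 or ahduhsjs or hn0."
print(token_pattern.findall(text))  # ['896av6uf']
```

Keep in mind that \b treats the underscore as a word character, so a token directly adjacent to an underscore would not be matched by this variant.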

User contributions licensed under: CC BY-SA