How can I make a lookahead that ignores a subset?

I’m trying to split fashion sizes in a string into two individual data points.

I.e. if I get XXS/XS, I want to isolate “XXS” with one regex and then get “XS” with a second regex.

For that I have just used group 1 of

([A-Za-z0-9]+)

for the first size and

(?:[^A-Za-z0-9()]+([A-Za-z0-9]+))

to grab the second size.

However, I have run into some new, odd combinations arriving as a single value:

S1 – XXS/XS-S/M

How would I go about ensuring that my regex in this case grabs XXS/XS as the first value and S/M as the second value, while still getting XXS as the first value and XS as the second value if the string is simply “XXS/XS”, or 40 and 42 for “40-42”?

I simply can’t figure out the best way to do this, and hope some bright minds know a great way to do so! System-wise I can “only” support one regex for getting the first value and a second regex for getting the second value. My system only captures group 1.

My code is based on Python, so I usually use pythex.org to validate that a regex works in Python.

Answer

I think you can use

(?i)([a-z\d]+(?:[/-][a-z\d]+)?)[/-]([a-z\d]+(?:[/-][a-z\d]+)?)

See the regex demo.

Note that the regex above contains two capturing groups that hold the values you need.

If you always need the value in Group 1, simply remove the unnecessary pair of parentheses from the regex above so that each regex keeps just a single capturing group: around the first size in the first regex, and around the second size in the second regex.
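As a rough sketch of that single-group idea, the two regexes could look like the ones below. This assumes the character class in the pattern is meant to be [a-z\d] (letters and digits); the names FIRST_SIZE, SECOND_SIZE and extract are made up for illustration and are not part of any system described in the question.

```python
import re

# Hypothetical single-group variants of the two-group pattern.
# FIRST_SIZE keeps the parentheses only around the first size.
FIRST_SIZE = re.compile(
    r"(?i)([a-z\d]+(?:[/-][a-z\d]+)?)[/-][a-z\d]+(?:[/-][a-z\d]+)?"
)
# SECOND_SIZE keeps the parentheses only around the second size.
SECOND_SIZE = re.compile(
    r"(?i)[a-z\d]+(?:[/-][a-z\d]+)?[/-]([a-z\d]+(?:[/-][a-z\d]+)?)"
)


def extract(pattern, text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


for value in ["S1 – XXS/XS-S/M", "XXS/XS", "40-42"]:
    print(value, "->", extract(FIRST_SIZE, value), "|", extract(SECOND_SIZE, value))
# S1 – XXS/XS-S/M -> XXS/XS | S/M
# XXS/XS -> XXS | XS
# 40-42 -> 40 | 42
```

Both variants match the same text; only the position of the capturing group differs, so group 1 holds the first size in one regex and the second size in the other.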

Details

  • (?i) – case insensitive inline modifier option
  • ([a-z\d]+(?:[/-][a-z\d]+)?) – Group 1: one or more alphanumeric chars, and then an optional occurrence of / or - followed by one or more alphanumeric chars
  • [/-] – a / or -
  • ([a-z\d]+(?:[/-][a-z\d]+)?) – Group 2: same as the Group 1 pattern.
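The breakdown above can be checked with a short Python sketch, again assuming the character class is [a-z\d]. The name SIZES and the helper split_sizes are hypothetical. For “XXS/XS” the optional part of Group 1 first consumes “/XS”, then gives it back through backtracking so that the separator and Group 2 can still match.

```python
import re

# Two-group pattern from the answer, with the assumed \d escapes.
SIZES = re.compile(r"(?i)([a-z\d]+(?:[/-][a-z\d]+)?)[/-]([a-z\d]+(?:[/-][a-z\d]+)?)")


def split_sizes(text):
    """Return (first, second) from the first match, or None if no match."""
    match = SIZES.search(text)
    return match.groups() if match else None


print(split_sizes("S1 – XXS/XS-S/M"))  # ('XXS/XS', 'S/M')
print(split_sizes("XXS/XS"))           # ('XXS', 'XS')
print(split_sizes("40-42"))            # ('40', '42')
print(split_sizes("M"))                # None: no / or - separator present
```

Note that a value with no / or - separator, such as a lone “M”, does not match at all, so such inputs would need separate handling.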