Regex to match strings in which every odd or every even character is a given character

I want to check whether a string is composed such that every second character is e (or a space), starting at either index 0 or index 1.

I am not familiar enough with regex to come up with even a starting point.

Accepted strings:

Aebece
Aebec
eAebec
eAebece
e
eAe
Aeb


Rejected strings:

A
Ab
Aeeb
Aeee
ee
eee
eeAebec
eA

Aaebbecce
Aaebbecc
(And so on: each of a, b, c has to be a single character, and can NOT be a word of arbitrary length that does not contain `e`.)

I believe this defines it, but surely someone will run into an edge case that didn’t occur to me.

I used ‘e’ instead of space for visual ease.
In plain English, this is supposed to catch “space-separated words” such as “W o W”, “S W E E T ” and so on, but not sentences such as “H O L Y  S M O K E S” (notice the double space).

I hope this is clear.

Answer

A simpler way of doing this would be to extract the even-indexed and the odd-indexed characters into two separate sequences, collapse each one to its distinct characters, and check the length of the result:

''.join(set("H O L Y S M O K E S"[1::2])) (without the double space)

returns " ", a single space.

''.join(set("H O L Y  S M O K E S"[1::2])) (with the double space)

returns the characters of " EKMOS" (in arbitrary order, since a set is unordered).
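The slicing check can be wrapped in a small function. This is only a minimal sketch, assuming Python 3; the name is_spaced_out and the separators parameter are made up for illustration. It looks at the separator positions only, so it does not insist that the other characters differ from the separator (a string such as "ee" passes), and it requires one and the same separator throughout.

```python
def is_spaced_out(text, separators=(" ", "e")):
    """Return True if all even-indexed or all odd-indexed characters
    of text are one and the same separator character."""
    allowed = set(separators)
    for start in (0, 1):
        # Collapse every second character, starting at index 0 or 1,
        # into the set of distinct characters found there.
        collapsed = set(text[start::2])
        # Exactly one distinct character, and it is an allowed separator.
        if len(collapsed) == 1 and collapsed <= allowed:
            return True
    return False


print(is_spaced_out("H O L Y S M O K E S"))   # single spaces
print(is_spaced_out("H O L Y  S M O K E S"))  # double space in the middle
```

The first call prints True and the second prints False, because the double space shifts the separators from the odd to the even positions halfway through.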

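The problem will lie in multibyte characters, for which the “[1::2]” trick will not work on byte strings (I don’t think it will work with regex either, because re.findall(r'(.)', "Cioè") yields ['C', 'i', 'o', '\xc3', '\xa8'] instead of ['C', 'i', 'o', 'è']). This applies to encoded byte strings, as in Python 2; a Python 3 str holds Unicode code points, so both the slice and the regex operate per character there.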
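To see the difference between text and encoded bytes, here is a small sketch, assuming Python 3 and UTF-8 encoding. The variable names word and encoded are illustrative only.

```python
import re

word = "Cio\u00e8"              # "Cioè", with è as a single precomposed code point
encoded = word.encode("utf-8")  # the same text as UTF-8 bytes

# On a str, "." matches one code point at a time.
chars = re.findall(r"(.)", word)

# On bytes, "." matches one byte at a time, so è is split in two.
raw = re.findall(rb"(.)", encoded)

print(len(chars), chars)
print(len(raw), raw)

# The slice behaves the same way: per code point on str, per byte on bytes.
print(word[1::2])
print(encoded[1::2])
```

The str version yields four items and the bytes version five, so the odd/even positions only line up with visible characters when the check runs on decoded text.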

Regex

If you need a regex, then:

^(?:(?:[ e].)*[ e]?|(?:.[ e])*.?)$

This means that between the beginning and the end of the string there must be either (?:[ e].)*[ e]? (a “space/e plus anything” pair, possibly repeated, optionally followed by one space/e), or (?:.[ e])*.? (an “anything plus space/e” pair, possibly repeated, optionally followed by one more character).
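A quick way to try the pattern is to run it over a few sample strings. This is a sketch assuming Python 3 and the standard re module; the names LOOSE and samples are just illustrative.

```python
import re

# The pattern from the answer: the separator (space or e) sits either at the
# even positions (first alternative) or at the odd positions (second one).
LOOSE = re.compile(r"^(?:(?:[ e].)*[ e]?|(?:.[ e])*.?)$")

samples = [
    "Aebece",
    "eAebec",
    "W o W",
    "HeOeL Y",               # mixed separators
    "Aaebbecce",             # two-character "words"
    "H O L Y  S M O K E S",  # double space
]

for s in samples:
    print(repr(s), bool(LOOSE.match(s)))
```

The first four strings match and the last two do not. Note that, as written, the pattern also matches the empty string and a lone character such as "A", because both repeated groups may occur zero times.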

This is not exactly equivalent to your request, because it will accept a word separated by a mix of spaces and e’s: “HeOeL Y S M O K EeS” is accepted. To allow either all spaces or all e’s, you need

^(?:(?: .)* ?|(?:. )*.?|(?:.e)*.?|(?:e.)*e?)$

to cover the four cases (space-separated beginning at 0, at 1, e-separated at 0, and e-separated at 1).
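The four cases can be checked with a short sketch, again assuming Python 3. The pattern is written out here with one alternative per case inside a single group; the name STRICT is illustrative only.

```python
import re

# One alternative per case: space at even positions, space at odd positions,
# e at odd positions, e at even positions.
STRICT = re.compile(r"^(?:(?: .)* ?|(?:. )*.?|(?:.e)*.?|(?:e.)*e?)$")

samples = [
    "W o W",                # spaces only
    "Aebece",               # e only, separator at odd positions
    "eAebec",               # e only, separator at even positions
    "HeOeL Y S M O K EeS",  # mixed separators
]

for s in samples:
    print(repr(s), bool(STRICT.match(s)))
```

The first three strings match; the mixed-separator string does not, since no single alternative can consume both the e’s and the spaces.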