How to count the syllables in a word using a regular expression

I have a story made up of words, and I need to construct a regular expression that counts the number of syllables in each word of the story.

I am trying to construct a regular expression that satisfies the following rules:

IF word ends with character 'e'
AND word also contains at least one other vowel character 'a'|'e'|'i'|'o'|'u'|'y'
THEN do not match 'e' at the end of word
BUT match all the other vowels in word
IF word contains only a lone 'e' at the end of a word
AND word does not contain other vowel characters
THEN match the lone 'e'

Expected output:

Counting the matches found for each word should result in:

3 syllables for aerospace

1 syllable for she

A total of 4 syllables.

So far I have constructed (?(?=([a-zA-Z]+e))(?=([aeiouy]))), but I need help getting this done in a single expression, if that is possible.

Answer

After reading a lot about regex and regex conditionals, I found that conditionals are not supported by the Java regex package (java.util.regex). (Found the answer here: Conditional Regular Expression in Java?)
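A quick way to check this is to try compiling a conditional group and see whether java.util.regex rejects it. The sketch below is only an illustration: the class name ConditionalCheck, the helper compiles, and both sample patterns are made up for this example and are not part of the question. The second pattern shows the usual workaround shape, an alternation whose branches are guarded by a positive and a negative lookahead.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ConditionalCheck {
    // Returns true if java.util.regex accepts the expression, false if it rejects it.
    // (Hypothetical helper, named for this example only.)
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A conditional group of the form (?(condition)yes|no), as found in PCRE or .NET.
        System.out.println(compiles("(?(?=[a-z]+e)[aiouy]|e)"));          // prints false
        // The same idea written as lookahead-guarded alternatives, which Java accepts.
        System.out.println(compiles("(?=[a-z]+e)[aiouy]|(?![a-z]+e)e"));  // prints true
    }
}
```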

So I finally constructed a regex without an if-then-else conditional:

([aeiouyAEIOUY]+[^e.\s])|([aiouyAEIOUY]+\b)|(\b[^aeiouy0-9.']+e\b)

(https://regex101.com/r/gPO6mP/17)
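For reference, here is a minimal sketch of how the expression could be used from Java to count syllables. It assumes the b and s tokens in the pattern are the escapes \b (word boundary) and \s (whitespace), which have to be written with doubled backslashes in a Java string literal. The class name SyllableCounter and the method countSyllables are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SyllableCounter {
    // The pattern from this answer, assuming \b and \s escapes;
    // backslashes are doubled because this is a Java string literal.
    private static final Pattern SYLLABLE = Pattern.compile(
        "([aeiouyAEIOUY]+[^e.\\s])|([aiouyAEIOUY]+\\b)|(\\b[^aeiouy0-9.']+e\\b)");

    // Counts the non-overlapping matches of the pattern in the given text.
    public static int countSyllables(String text) {
        Matcher matcher = SYLLABLE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSyllables("aerospace"));      // prints 3
        System.out.println(countSyllables("she"));            // prints 1
        System.out.println(countSyllables("aerospace she"));  // prints 4
    }
}
```

Each match counts as one syllable, so the two sample words from the question give the expected total of 4. Other words may still be miscounted, since the pattern only approximates syllable boundaries.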

Improvements are welcome.

Thanks.
