python regex: looking for correct pattern with negative lookbehind

I am using a Python tool that checks git log commit messages to find out whether a feature with a given ID was introduced or reverted. I cannot change the tool's code; I can only provide suitable regexes as input. The input looks like this:

input_regexes = {
    "add_pattern": r".*\[\s*(ID\d{3})\s*\](.*)",
    "revert_pattern": r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)"
}

The first capture group yields the feature ID and the second yields the feature description. The problem is that when a string containing [Rr]evert appears, both patterns match. What I would like to achieve is:

  • revert_pattern matches only commit messages that contain an ID in brackets preceded by [Rr]evert
  • add_pattern matches only commit messages that contain an ID in brackets not preceded by [Rr]evert

In the following example, revert_pattern should match only revert_feature_message, and add_pattern should match only the strings in add_feature_messages:

revert_feature_message='Revert "[ID123] some cool feature."'
add_feature_messages=[
  '[ID123] some cool feature.',
  'some prefix [ID123] some cool feature'
]

I tried using:

(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)

as add_pattern, but it did not work. Could you help me correct it?

Answer

The revert pattern [Rr]evert.*\[\s*(ID\d{3})\s*\](.*) already matches only revert_feature_message.

To match only the strings in add_feature_messages, you can use a negative lookahead anchored at the start of the string to assert that it does not contain revert or Revert:

^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)
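
As a quick check, here is a minimal sketch that runs the patterns against the example messages from the question, with the regex escapes written out. It assumes the tool applies each pattern with re.search, which the question does not state. It also shows why the lookbehind attempt fails: at position 0 nothing precedes the match, so (?<!Revert) succeeds trivially and .*? then consumes the word Revert itself.

```python
import re

# Patterns from the question and answer, with backslash escapes written out.
attempted = re.compile(r"(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)")
add_pattern = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")
revert_pattern = re.compile(r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'
add_feature_messages = [
    "[ID123] some cool feature.",
    "some prefix [ID123] some cool feature",
]

# Assumption: the tool calls re.search with each pattern.

# The lookbehind attempt still matches the revert message.
lookbehind_hits_revert = attempted.search(revert_feature_message) is not None

# The lookahead version rejects the revert message...
add_hits_revert = add_pattern.search(revert_feature_message) is not None

# ...and still captures (ID, description) from the add messages.
add_results = [add_pattern.search(m).groups() for m in add_feature_messages]

# The revert pattern matches only the revert message.
revert_hits_add = any(revert_pattern.search(m) for m in add_feature_messages)
revert_result = revert_pattern.search(revert_feature_message).groups()

print(lookbehind_hits_revert, add_hits_revert, revert_hits_add)
print(add_results)
print(revert_result)
```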

Or a bit more specific:

^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*\]).*\[\s*(ID\d{3})\s*\](.*)

If Revert is always at the start of the string, you can omit the leading .* in the lookahead.
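
To illustrate the difference between the two add patterns, here is a small sketch. The names broad_add and specific_add and the commit message "[ID123] add revert button" are made up for this example, and re.search is again assumed. The broad pattern rejects any message that mentions revert anywhere, while the more specific one rejects it only when revert, a space, and optional non-bracket characters come directly before the bracketed ID.

```python
import re

# Broad: reject any message that contains "revert"/"Revert" anywhere.
broad_add = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")

# Specific: reject only when "revert " leads up to a bracketed ID.
specific_add = re.compile(
    r"^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*\]).*\[\s*(ID\d{3})\s*\](.*)"
)

# Hypothetical message that mentions "revert" without being a revert commit.
mention = "[ID123] add revert button"
revert = 'Revert "[ID123] some cool feature."'

print(broad_add.search(mention))                # no match: "revert" appears
print(specific_add.search(mention).groups())    # ID and description captured
print(broad_add.search(revert), specific_add.search(revert))  # both reject
```

Which variant fits better depends on whether ordinary commit messages in the repository are likely to mention the word revert.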