Cannot load my regular expression from JSON correctly

Case closed: I was loading it correctly after all, but other, sub-optimal behavior in my code caused me to think that I wasn’t.


I’m loading a regex string from JSON:

"version_scheme": {
    "type": "tuple",
    "re": "data-file=\"kitty_portable-(.*?)\\.exe\""
}

However, re.findall(...) returns [] on a haystack like the following (the text attribute of the response to a GET request made with the requests module):

<a href="https://www.fosshub.com/KiTTY.html?dwl=kitty_portable-0.74.4.10.exe"
                        data-file="kitty_portable-0.74.4.10.exe"
                        aria-label="Download kitty_portable-0.74.4.10.exe Windows portable"

Answer

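When the JSON text is written as a regular (non-raw) Python string literal, the double quotation marks " need to be double- (\\") or triple-escaped (\\\"). In the JSON file itself, a single backslash (\") is enough.

The literal dot . needs to be quadruple-escaped (\\\\.) in such a literal, or double-escaped (\\.) in the JSON file itself.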
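The two layers of escaping can be checked with a minimal sketch like the one below. It assumes the JSON fragment and HTML haystack from the question; the variable names (json_text, json_text_escaped, haystack) are illustrative and not taken from the asker's code.

```python
import json
import re

# What the JSON file itself contains. The raw string keeps every
# backslash exactly as it would appear on disk.
json_text = r'''
{
    "version_scheme": {
        "type": "tuple",
        "re": "data-file=\"kitty_portable-(.*?)\\.exe\""
    }
}
'''

# The same "re" entry written as a regular (non-raw) Python literal:
# every backslash from the file has to be doubled.
json_text_escaped = '{"re": "data-file=\\"kitty_portable-(.*?)\\\\.exe\\""}'

pattern = json.loads(json_text)["version_scheme"]["re"]
assert pattern == json.loads(json_text_escaped)["re"]

# Haystack copied from the HTML fragment in the question.
haystack = '''<a href="https://www.fosshub.com/KiTTY.html?dwl=kitty_portable-0.74.4.10.exe"
                        data-file="kitty_portable-0.74.4.10.exe"
                        aria-label="Download kitty_portable-0.74.4.10.exe Windows portable"'''

versions = re.findall(pattern, haystack)
print(versions)  # ['0.74.4.10']
```

If the pattern loads this way and findall still returns [], the problem is probably elsewhere in the surrounding code rather than in the JSON escaping.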

Working backwards

The raw string you want:

r'data-file="kitty_portable-(.*?)\.exe"'

The string you want:

'data-file="kitty_portable-(.*?)\\.exe"'

The string you want stored as JSON, written as a regular Python string literal:

'data-file=\\"kitty_portable-(.*?)\\\\.exe\\"'
  • Each of the double quotation marks " needed to be escaped with either a double backslash or a triple backslash.
  • Each of the two backslashes preceding the literal dot needed to be escaped with a single backslash each.
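Rather than counting backslashes by hand, the same chain can be produced in the forward direction with json.dumps. This is only a sketch under that assumption; the names raw, plain, and json_value are illustrative. Note that json.dumps also adds the surrounding JSON quotation marks, which the literal shown above leaves out.

```python
import json

# Regex written as a raw string: backslashes are taken literally.
raw = r'data-file="kitty_portable-(.*?)\.exe"'

# The same regex as a regular literal: the backslash is doubled.
plain = 'data-file="kitty_portable-(.*?)\\.exe"'
assert raw == plain

# json.dumps applies the JSON-level escaping (\" and \\).
json_value = json.dumps(raw)
print(json_value)        # "data-file=\"kitty_portable-(.*?)\\.exe\""

# repr shows that JSON text as a regular Python literal,
# where every backslash is doubled once more.
print(repr(json_value))  # '"data-file=\\"kitty_portable-(.*?)\\\\.exe\\""'

# Decoding the JSON value gives back the original regex.
assert json.loads(json_value) == raw
```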