Python regex OR expression

I have a file that is usually named Document.pdf, but sometimes it is called Document-12345678.pdf, where 12345678 is a random number.

I want to check whether the file has finished downloading into a folder. While the download is still in progress, the file shows up as Document.pdf.fkasfmq or Document-12345678.pdf.fkasfmq, where .fkasfmq is a random hash added by the downloader, and I don’t want that to match.

I tried a regex like r'Document(?:[-0-9]+).pdf' and tested it against both Document.pdf and Document-12345678.pdf, but it always returns false.

From my understanding, (?:[-0-9]+) means that a run of hyphens and digits may or may not appear before .pdf. Is that correct? I am very rusty with regex…

Answer

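You should mark the group as optional with the "?" quantifier. (?:...) is only a non-capturing group and does not make its contents optional, and "+" means "one or more", so without the "?" the pattern requires the name to contain the hyphen-and-digits part. The dot before "pdf" should also be escaped so that it matches a literal period rather than any character.

r'Document(?:[-0-9]+)?\.pdf'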

Or as @anubhava pointed out in the comments, it can be simplified to:

r'Document[-0-9]*\.pdf'
  • This way, it will also match e.g. "Document.pdf"
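As a rough illustration of the difference, the sketch below compares the required group from the question with the two optional forms. The dot is escaped here so it only matches a literal period, and the helper name `matching_names` is made up for this example.

```python
import re

# Group is required: at least one hyphen or digit must follow "Document".
required = re.compile(r'Document(?:[-0-9]+)\.pdf')
# Same group made optional with "?".
optional = re.compile(r'Document(?:[-0-9]+)?\.pdf')
# Simplified form: zero or more hyphens or digits.
simplified = re.compile(r'Document[-0-9]*\.pdf')

names = ["Document.pdf", "Document-12345678.pdf"]


def matching_names(pattern, candidates):
    """Return the candidates that the pattern matches from the start (hypothetical helper)."""
    return [name for name in candidates if pattern.match(name)]


print(matching_names(required, names))    # only the name with digits
print(matching_names(optional, names))    # both names
print(matching_names(simplified, names))  # both names
```

Note that none of these patterns is anchored at the end yet, so they would still accept a name with a trailing downloader suffix.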

Also, you should consider adding "^" and "$" to anchor the pattern to the start and end of the string, so that it doesn’t match e.g. "Document.pdf.fkasfmq"

r'^Document(?:[-0-9]+)?\.pdf$'

Or

r'^Document[-0-9]*\.pdf$'
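For completeness, here is a minimal sketch of how the check might look in Python. It uses `re.fullmatch`, which requires the whole string to match and so has the same effect as the "^" and "$" anchors. The helper name `is_finished_download` and the hard-coded list of names are assumptions for illustration; in practice the names would probably come from something like `os.listdir` on the download folder.

```python
import re

# Escaped dot so only a literal "." is accepted before "pdf".
PATTERN = re.compile(r'Document[-0-9]*\.pdf')


def is_finished_download(filename):
    """Return True only if the entire name matches (hypothetical helper)."""
    return PATTERN.fullmatch(filename) is not None


# Example names taken from the question; a real check would list the folder.
candidates = [
    "Document.pdf",
    "Document-12345678.pdf",
    "Document.pdf.fkasfmq",
    "Document-12345678.pdf.fkasfmq",
]

finished = [name for name in candidates if is_finished_download(name)]
print(finished)

# An unanchored search still finds a match inside the partial download name,
# which is why the anchors (or fullmatch) matter.
print(PATTERN.search("Document.pdf.fkasfmq") is not None)
```

Using `fullmatch` keeps the pattern itself free of anchors, so the same compiled regex can be reused for both strict and loose checks.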