Many Persian subtitle files are saved with the wrong encoding. Some video players have options to display these files correctly, but I know of only one Windows program that actually fixes the file and saves it with the correct encoding. I want to do this in Python. I've tried many things but was unable to get it done. Notepad says the file is in ANSI, so I opened it as 'Latin-1' in Python and tried to decode and re-encode it as UTF-8, but that just gives me the original file. The file can be downloaded from
Also, the file fixed with the mentioned software can be downloaded from
How can this be done using Python?


The file is most likely encoded in cp1256, also known as Windows-1256, the Windows code page for Arabic script, which is what Windows uses for Persian and Urdu. To create a UTF-8 version of the file, you just need to read it using this code page and write it back out as UTF-8:

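# Read the original subtitle file, decoding its bytes as Windows-1256.
with open("", "rt", encoding="cp1256") as f:
    data = f.read()

# Write the same text back out as UTF-8 with a BOM ("utf_8_sig").
with open("", "wt", encoding="utf_8_sig") as f:
    f.write(data)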
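
As a rough sketch, the same read-then-write steps can be wrapped in a small helper and checked on a throwaway file. The function name reencode_file, the sample word, and the temporary file names are made up for illustration, and cp1256 is still only the likely source encoding, not a confirmed one.

```python
import tempfile
from pathlib import Path


def reencode_file(src, dst, source_encoding="cp1256"):
    """Decode src with source_encoding and save the text to dst as UTF-8 with a BOM.

    Hypothetical helper for illustration; source_encoding is an assumption.
    """
    # newline="" keeps the original line endings untouched in both directions.
    with open(src, "rt", encoding=source_encoding, newline="") as f:
        text = f.read()
    with open(dst, "wt", encoding="utf_8_sig", newline="") as f:
        f.write(text)
    return text


# Self-contained demo on a temporary file, so no real subtitle is needed.
sample = "سلام"  # a word whose letters all exist in cp1256
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample_cp1256.srt"
    dst = Path(tmp) / "sample_utf8.srt"
    src.write_bytes(sample.encode("cp1256"))  # simulate the "ANSI" file

    reencode_file(src, dst)

    fixed_bytes = dst.read_bytes()
    fixed_text = dst.read_text(encoding="utf_8_sig")
    # The same bytes read as Latin-1 come out as unrelated accented letters.
    misread = src.read_bytes().decode("latin-1")
```

Reading the bytes as Latin-1 never fails, because every byte value maps to some character, which is why that approach runs without errors yet does not produce Persian text. If the output still looks wrong with cp1256, the file probably uses a different code page and the source_encoding argument would need to change.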