I am using NumPy's genfromtxt() function to read large amounts of text data into an array. The data is provided in the following format:
2020-05-20 16:54:01.807645 1033.074 2392.555 256.8516 2700.547 1029.691 2108.094 3256.539 90.94727 1775.043 4.770321 48.875
The log file also contains lines like the following:
2020-05-20 17:05:21.864533 DUT stopped
I want to skip those lines. Is there a way of doing so?
My current approach:
values = np.genfromtxt(fi, dtype="S32,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8", missing_values='', delimiter="\t", invalid_raise=False, filling_values=0)
Thanks for any help
According to the manual, you can also pass genfromtxt() a generator that yields byte strings (lines) as its first argument:
fname : file, str, pathlib.Path, list of str, generator
File, filename, list, or generator to read. If the filename extension is gz or bz2, the file is first decompressed. Note that generators must return byte strings. The strings in a list or produced by a generator are treated as lines.
Something like this might do the trick for you:

import numpy as np


def skip_stopped_lines(fi):
    for line in fi:
        if b"DUT stopped" in line:
            # Don't yield this line, we don't want it
            continue
        yield line


# NB: not using `with` to avoid the file being closed
# while the generators are active
fi = open("somefile.txt", "rb")  # note binary mode
skipper = skip_stopped_lines(fi)
values = np.genfromtxt(
    skipper,
    dtype="S32,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8",
    missing_values="",
    delimiter="\t",
    invalid_raise=False,
    filling_values=0,
)
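If the log can contain status messages other than "DUT stopped", a more general variant of the same generator idea is to keep only the lines that have the expected number of columns. The following is a minimal sketch, not a tested drop-in: the names keep_data_lines and load_log are hypothetical, the sample lines are made up, and it assumes the fields are tab-separated with 12 columns per data row (one timestamp plus 11 values, as in the question).

```python
def keep_data_lines(lines, n_fields=12, delimiter=b"\t"):
    """Yield only the byte-string lines that split into exactly n_fields columns."""
    for line in lines:
        if len(line.rstrip(b"\r\n").split(delimiter)) == n_fields:
            yield line


def load_log(path):
    """Read the log at `path`, dropping non-data lines before NumPy parses it."""
    import numpy as np  # imported here; the filter above needs only the standard library

    with open(path, "rb") as fi:  # binary mode, so the lines are byte strings
        # The genfromtxt call finishes reading inside the with-block,
        # so the file is only closed after all lines have been consumed.
        return np.genfromtxt(
            keep_data_lines(fi),
            dtype="S32," + ",".join(["f8"] * 11),
            delimiter="\t",
        )


# Made-up sample in the question's layout (assumed tab-separated).
sample = [
    b"2020-05-20 16:54:01.807645\t" + b"\t".join([b"1.5"] * 11) + b"\n",
    b"2020-05-20 17:05:21.864533\tDUT stopped\n",
    b"2020-05-20 16:54:02.807645\t" + b"\t".join([b"2.5"] * 11) + b"\n",
]
kept = list(keep_data_lines(sample))
print(len(kept))  # number of lines that would be handed to genfromtxt
```

Filtering on the column count rather than on a fixed message means new kinds of status lines do not require code changes, at the cost of silently dropping any data row that happens to be truncated.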