I’m trying to build a Python script to calculate averages. To do so, I need to build a vector with 10 entries. Each entry comes from a different text file with many lines of text, and I need the number from a specific line that looks like this:
BAR: dG = -23.98 kcal/mol
Each file has a different number on this line. How can I get only the number after the string "BAR: dG = " from each of these text files and use it as an input for a vector like this:
yi = ["number from file 1", "number from file 2" , ... , "number from file 10"]
You can do this easily using regular expressions. The following code does the trick in case your filenames are in a list already.
import re

values = []
files_list = ['f1.txt', 'f2.txt', 'f3.txt', 'f4.txt', 'f5.txt',
              'f6.txt', 'f7.txt', 'f8.txt', 'f9.txt', 'f10.txt']

for f in files_list:
    with open(f, 'r') as fp:
        for line in fp:
            # Try to match the pattern against the current line
            matches = re.match(r'BAR: dG =\s+(-?\d+\.\d+)\skcal/mol', line)
            if matches:
                # If there is a match, store the captured number as a float
                value = matches.group(1)
                print(value)
                values.append(float(value))

# Divide by the number of values actually found rather than a hard-coded 10
average = sum(values) / len(values)
print(average)
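If it helps, the same idea can be wrapped in two small functions so that a file without a matching line is reported instead of silently skipped. This is only a sketch: the names `DG_PATTERN`, `extract_dg` and `collect_dg` are made up for illustration, and the pattern assumes the line always has the form `BAR: dG = <number> kcal/mol`, possibly with an integer value and not necessarily at the start of the line.

```python
import re

# Hypothetical names; the pattern assumes lines like "BAR: dG = -23.98 kcal/mol".
DG_PATTERN = re.compile(r"BAR: dG =\s*(-?\d+(?:\.\d+)?)\s*kcal/mol")


def extract_dg(text):
    """Return the first BAR dG value found in text as a float, or None."""
    match = DG_PATTERN.search(text)
    return float(match.group(1)) if match else None


def collect_dg(paths):
    """Read each file and return one dG value per file, in the given order."""
    values = []
    for path in paths:
        with open(path, "r") as handle:
            value = extract_dg(handle.read())
        if value is None:
            # Assumption: a file with no matching line should be treated as an error.
            raise ValueError(f"no 'BAR: dG' line found in {path}")
        values.append(value)
    return values
```

With this, `yi = collect_dg(files_list)` should give the vector you describe, and `sum(yi) / len(yi)` (or `statistics.mean(yi)`) gives the average.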