Count occurrences of a substring in a file in Java

I need to count how many times a substring occurs in a file. In brief, each file contains a number of articles, and I need to know how many. Each article starts with @ARTICLE{ or with @ARTICLE{ followed by a series of digits.

Useful info:
- I have 10 files to look in
- No files are empty
- This code gives me a StringIndexOutOfBoundsException

Here is the code I have so far:

//To read through all files
for(int i=1; i<=10; i++)
{
    try
    {
        //To look through all the bib files
        reader = new Scanner(new FileInputStream("C:/Assg_3-Needed-Files/Latex"+i+".bib"));
        System.out.println("Reading Latex"+i+".bib->");

        //To read through the whole file
        while(reader.hasNextLine())
        {
            String line = reader.nextLine();
            String articles = line.substring(1, 7);

            if(line.equals("ARTICLE"))
                count+=1;
        }
    }
    catch(FileNotFoundException e)
    {
        System.err.println("Error opening the file Latex"+i+".bib");
    }
}
System.out.print("\n"+count);

Answer

Try just using String#contains on each line:

while(reader.hasNextLine()) {
    String line = reader.nextLine();
    if (line.contains("ARTICLE")) {
        count += 1;
    }
}

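This at least avoids having to take a substring in the first place. The exception comes from line.substring(1, 7): any line with fewer than 7 characters (a blank line or a lone closing brace, for example) throws StringIndexOutOfBoundsException, whereas lines of 7 or more characters do not, whether or not they match. Note also that your code compares line, not the extracted articles substring, against "ARTICLE", so a line beginning with @ARTICLE{ would never be counted even without the exception.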
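To make the failure mode concrete, here is a minimal sketch that probes a few sample lines. The class name SubstringBoundsDemo, both helper methods, and the sample lines are hypothetical and exist only for illustration; startsWith is shown as one length-safe alternative, assuming each article marker sits at the very start of its line.

```java
public class SubstringBoundsDemo {

    // Hypothetical helper: reports whether line.substring(1, 7) throws for this line.
    public static boolean substringThrows(String line) {
        try {
            line.substring(1, 7);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Hypothetical helper: length-safe check for the article marker.
    // startsWith simply returns false for lines that are too short.
    public static boolean startsArticle(String line) {
        return line.startsWith("@ARTICLE{");
    }

    public static void main(String[] args) {
        // Made-up sample lines of the kind a .bib file might contain.
        String[] samples = {"@ARTICLE{", "}", "", "year={2001},"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" throws=" + substringThrows(s)
                    + " startsArticle=" + startsArticle(s));
        }
    }
}
```

The short lines ("}" and the empty string) are the ones that trigger the exception, which is why a length-independent test such as contains or startsWith sidesteps the problem.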

You could also use a regex pattern to make sure that you match ARTICLE as a standalone word:

while(reader.hasNextLine()) {
    String line = reader.nextLine();
    // matches() must match the whole line, hence the leading and trailing .*
    if (line.matches(".*\\bARTICLE\\b.*")) {
        count += 1;
    }
}

This ensures that you don't count a line containing something like ARTICLES, which is not your exact target.
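If the same check runs over many lines, one option is to compile the pattern once and use Matcher#find, which looks for the pattern anywhere in the line and so needs no surrounding .* wildcards. The following is only a sketch: the class name ArticleWordCounter, the method countArticleLines, and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ArticleWordCounter {

    // Compiled once and reused; \b marks a word boundary on each side of ARTICLE.
    private static final Pattern ARTICLE_WORD = Pattern.compile("\\bARTICLE\\b");

    // Hypothetical helper: counts the lines that contain ARTICLE as a standalone word.
    public static int countArticleLines(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            // find() succeeds if the pattern occurs anywhere in the line.
            if (ARTICLE_WORD.matcher(line).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real run would read them from the .bib files.
        List<String> lines = Arrays.asList(
                "@ARTICLE{",
                "@ARTICLE{8249726,",
                "ARTICLES about Java",
                "}",
                "");
        System.out.println(countArticleLines(lines)); // prints 2
    }
}
```

The line with ARTICLES is skipped because the S that follows is a word character, so there is no word boundary after ARTICLE; the @ and { around the real markers are not word characters, so those lines still match.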
