Regular Expression to exclude a certain pattern

I am trying to count the number of occurrences of a substring in a string using

function countOccurences(str,word){
   var regex = new RegExp("\\b"+word+"\\b","gi");
    console.log((str.match(regex)|| []).length);
}


let String=' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
countOccurences(String,asset);

Here, the result I get is 6, which is correct for an exact (whole-word) match. However, I want to exclude the test values inside class and id, so that the result is 4 instead of 6.

Answer

You can form a regex that will match and capture the substrings you would like to skip when counting and match the asset in other contexts only.

The sample regex may look like

/\b((?:id|class)="Test\b)|\bTest\b/gi

This will match

  • \b((?:id|class)="Test\b) – word boundary, then Group 1 capturing id or class followed by ="Test as a whole word
  • | – or
  • \bTest\b – whole word Test

See the JavaScript demo:

function countOccurences(str,exceptions,word){
   // Escape regex metacharacters in each exception so it is matched literally
   const pattern = "\\b((?:" + exceptions.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join("|") + ')="' + word + "\\b)|\\b"+word+"\\b";
   const regex = new RegExp(pattern,"gi");
   let count = 0, m;
   while (m = regex.exec(str)) {
       if (!m[1]) {
           count++;
       }
   }
   console.log(count)
}


let text = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
let exception_arr = ["id", "class"]
countOccurences(text,exception_arr,asset);
// => 4
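
To see why checking Group 1 is enough, here is a minimal sketch that lists every match of the sample regex and flags the ones captured as attribute values. The helper name listMatches and the variable names are made up for illustration, and it assumes an environment that provides String.prototype.matchAll.

```javascript
// Hypothetical helper: list each match and whether Group 1 (the id="..."/class="..." context) captured it.
function listMatches(str) {
  const regex = /\b((?:id|class)="Test\b)|\bTest\b/gi;
  return Array.from(str.matchAll(regex), m => ({
    match: m[0],
    skipped: m[1] !== undefined, // Group 1 is set only for the unwanted contexts
  }));
}

const sampleText = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';
const found = listMatches(sampleText);
console.log(found);
// Only the entries with skipped === false should be counted.
console.log(found.filter(x => !x.skipped).length); // => 4
```

The attribute values are still consumed by the regex, which is what prevents the second alternative from matching the word inside them.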

Another solution is based on a negative lookbehind (not supported in all JavaScript environments yet):

function countOccurences(str,exceptions,word){
   const pattern = "\\b(?<!\\b(?:" + exceptions.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join("|") + ')=")'+word+"\\b";
   const regex = new RegExp(pattern,"gi");
   // Fall back to an empty array so that no match yields 0, not 1
   console.log((str.match(regex) || []).length);
}

let text = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
let exception_arr = ["id", "class"]
countOccurences(text,exception_arr,asset);

Here, the /\b(?<!\b(?:id|class)=")Test\b/gi regex matches any Test as a whole word that is not immediately preceded by id or class as a whole word followed by the =" substring.
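
Since lookbehind is not available everywhere, one option is to detect support at runtime and fall back to the capturing-group approach. The sketch below is only an illustration under that assumption: supportsLookbehind, escapeRegex and countWholeWord are hypothetical names, the function returns the count rather than logging it, and it also escapes the searched word, which the demos above do not.

```javascript
// Hypothetical helper: true if this engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b");
    return true;
  } catch (e) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

// Escape regex metacharacters so the input is matched literally.
function escapeRegex(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Count whole-word matches of `word` that do not follow <exception>=" .
function countWholeWord(str, exceptions, word) {
  const skip = exceptions.map(escapeRegex).join("|");
  const w = escapeRegex(word);
  if (supportsLookbehind()) {
    const regex = new RegExp("\\b(?<!\\b(?:" + skip + ')=")' + w + "\\b", "gi");
    return (str.match(regex) || []).length;
  }
  // Fallback: capture the unwanted contexts in Group 1 and skip them.
  const regex = new RegExp("\\b((?:" + skip + ')="' + w + "\\b)|\\b" + w + "\\b", "gi");
  let count = 0, m;
  while ((m = regex.exec(str)) !== null) {
    if (m[1] === undefined) count++;
  }
  return count;
}

const demoText = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';
console.log(countWholeWord(demoText, ["id", "class"], "Test")); // => 4
```

Both branches build their pattern from the same escaped pieces, so they should agree on inputs like the sample string.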