Find Total Number of Repetitions of numbers in a file

I have a file in which strings of the form Global=x, where x is a number, appear in between lines of text. I want to count how many distinct values of ‘x’ extracted from “Global=x” occur more than once. I don’t want the number of occurrences of each ‘x’ printed.

For example, if the input file is like

Global=33333
Global=33333
Global=33334
Global=33335
Global=33336
Global=33337
Global=33337
Global=33337

the output should be 2, as two numbers ‘33333’ and ‘33337’ are repeated (it does not matter how many times they are repeated).

I tried

grep -Po '(Global)=\K\d+' file.dat | sort | uniq -c

but I get the frequency of occurrence of each number, which I don’t need:

2 33333
1 33334
1 33335
1 33336
3 33337

Any help will be appreciated; grep, awk and sed solutions are acceptable.

Answer

You could change uniq -c to uniq -d:

$ grep -Po '(Global)=\K\d+' file.dat | sort | uniq -d
33333
33337

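-d prints only the duplicated lines, one copy of each. Piping that output to wc -l then counts those lines, which gives 2 for the example above. Also note that both the -P and -o options to grep are non-standard (POSIX does not specify them), so they will not be available in every version of grep.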
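
If your grep lacks -P, something along these lines might work instead. It is only a sketch: the function names count_repeated and count_repeated_awk are made up for illustration, and both versions assume at most one Global=x per line with x made up of digits only. The first swaps grep -Po for sed and adds the wc -l step; the second does the whole count in a single awk process.

```shell
# Hypothetical helper: count distinct Global=<digits> values that occur more than once.
# Assumes at most one Global=<digits> per line.
count_repeated() {
    # sed -n '...p' prints only the lines where the substitution matched,
    # leaving just the number; uniq -d keeps one copy of each duplicated value;
    # tr strips the padding that some wc implementations add.
    sed -n 's/.*Global=\([0-9][0-9]*\).*/\1/p' "$1" | sort | uniq -d | wc -l | tr -d ' '
}

# Hypothetical awk-only variant: no sort needed, counts a value the second time it is seen.
count_repeated_awk() {
    awk 'match($0, /Global=[0-9]+/) {
             v = substr($0, RSTART + 7, RLENGTH - 7)   # skip the 7 characters of "Global="
             if (++seen[v] == 2) n++
         }
         END { print n + 0 }' "$1"
}

# Sample data from the question, written to a temporary file
tmp=$(mktemp)
printf 'Global=%s\n' 33333 33333 33334 33335 33336 33337 33337 33337 > "$tmp"
count_repeated "$tmp"
count_repeated_awk "$tmp"
rm -f "$tmp"
```

Both calls should print 2 for the sample input. The awk version avoids sorting, which may matter on large files, at the cost of holding every distinct value in memory.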
