How to know whether text is in Arabic or Urdu


The question is published on by Tutorial Guruji team.

I want to know whether a text contains any Urdu or Arabic letter. I am using the condition below, which produces false results when special characters appear. What is the right way to do this? Is there a library for it, or what is the right regex?

    // Treat the caption as non-Urdu only if, ignoring whitespace,
    // it consists solely of ASCII letters and digits.
    if (cap.replaceAll("\\s+", "").matches("[A-Za-z]+")
            || cap.replaceAll("\\s+", "").matches("[A-Za-z0-9]+")) {
        Log.d("isUrdu", "false");
        caption.setTypeface(Typeface.DEFAULT);
        caption.setTextSize(16);
    } else {
        Log.d("isUrdu", "True");
        /* if (Build.VERSION.SDK_INT > Build.VERSION_CODES.JELLY_BEAN_MR1) { */
        caption.setTypeface(typeface);
        caption.setTextSize(20);
        /* } */
    }

Answer

According to the Wikipedia article on the Urdu alphabet, Urdu text uses characters from the following Unicode ranges:

U+0600 to U+06FF
U+0750 to U+077F
U+FB50 to U+FDFF
U+FE70 to U+FEFF

To match a character from the Arabic Unicode block (U+0600 to U+06FF), you may use the \p{InArabic} Unicode block class.

So, you may use

if (cap.matches("(?s).*[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF].*"))
{
    /* There is an Urdu character */
}
else if (cap.matches("(?s).*\\p{InArabic}.*"))
{
    /* The string contains an Arabic character */
}
else { /* Neither Arabic nor Urdu characters detected */ }
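One point worth checking before relying on the two branches above: in Java, \p{InArabic} refers to the Unicode block named "Arabic", which is exactly U+0600 to U+06FF, the first range of the Urdu class. Any string that matches the second pattern therefore already matches the first, so the else-if branch does not appear to be reachable, and code point ranges alone cannot separate Urdu from Arabic because both use the same script. The sketch below only illustrates the block membership; the class and method names are hypothetical and not part of the original answer.

```java
// Hypothetical helper (names are illustrative, not from the original answer).
final class ArabicBlockCheck {

    // True when the code point lies in the Unicode block named "Arabic",
    // which is what \p{InArabic} tests in a Java regex.
    static boolean inArabicBlock(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.ARABIC;
    }

    public static void main(String[] args) {
        System.out.println(inArabicBlock(0x0627)); // ARABIC LETTER ALEF
        System.out.println(inArabicBlock(0x0679)); // ARABIC LETTER TTEH, used in Urdu
        System.out.println(inArabicBlock(0xFB56)); // a presentation form, in a different block
    }
}
```

Letters that are typical of Urdu, such as U+0679, sit in the same block as the basic Arabic letters, while the presentation forms belong to separate blocks that \p{InArabic} does not cover.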

Note that (?s) enables the DOTALL modifier so that . can match line break characters, too.
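To see why the modifier matters with matches(), which must consume the whole input, here is a small sketch that runs the same pattern with and without (?s) on a two-line string. The class and method names are made up for this illustration.

```java
// Hypothetical demo class (names are illustrative only).
final class DotAllDemo {

    // Without (?s), "." stops at line terminators, so a multi-line
    // input cannot be consumed in full by ".*X.*".
    static boolean matchesWithoutDotAll(String text) {
        return text.matches(".*\\p{InArabic}.*");
    }

    // With (?s), "." also matches line terminators.
    static boolean matchesWithDotAll(String text) {
        return text.matches("(?s).*\\p{InArabic}.*");
    }

    public static void main(String[] args) {
        // An ASCII line followed by a line of Arabic-script letters.
        String caption = "line one\n\u0633\u0644\u0627\u0645";
        System.out.println(matchesWithoutDotAll(caption));
        System.out.println(matchesWithDotAll(caption));
    }
}
```

For a single-line caption both variants behave the same; the difference only shows up once the text contains a line break.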

For better performance with matches(), you may use negated character classes instead of the leading .*: "(?s)[^\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]*[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF].*" and "(?s)\\P{InArabic}*\\p{InArabic}.*" respectively.

Note that you may also use the shorter patterns "[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]" and "\\p{InArabic}" with Matcher#find(), which looks for a match anywhere in the input and so needs no surrounding .*.
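A minimal sketch of the Matcher#find() variant, assuming the pattern is compiled once and reused for every caption. The class name ArabicScriptDetector and the method containsArabicScript are hypothetical, and the null check is an added assumption rather than something the answer prescribes.

```java
import java.util.regex.Pattern;

// Hypothetical helper (names are illustrative, not from the original answer).
final class ArabicScriptDetector {

    // The four ranges listed in the answer, compiled once and reused.
    private static final Pattern ARABIC_SCRIPT = Pattern.compile(
            "[\\u0600-\\u06FF\\u0750-\\u077F\\uFB50-\\uFDFF\\uFE70-\\uFEFF]");

    // find() succeeds as soon as one character from the class occurs
    // anywhere in the text, so no ".*" or (?s) is needed.
    static boolean containsArabicScript(String text) {
        return text != null && ARABIC_SCRIPT.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsArabicScript("Hello, world! 123"));
        System.out.println(containsArabicScript("caption: \u0627\u0631\u062F\u0648"));
    }
}
```

In the question's code, a check like this could replace the whitespace-stripping condition, with the custom typeface applied when the method returns true.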
