Use Regex instead of Case When?

It seems like I have this query that takes CPU usage. I wonder if it might be faster to just use regex instead:

GROUP_CONCAT(
    CASE
        WHEN b.column_1 LIKE '%Principal%' THEN CONCAT(a.start_date,  ' ~ ' , a.end_date, ' ', REPLACE(b.column_1, 'School Principal of', 'Principal') , ';', CHAR(10 using utf8) )
        ELSE ''
    END
ORDER BY a.start_date ASC SEPARATOR '') AS roles

The idea is to have an output similar to “2021-01-01 ~ 2022-01-01 Principal” wherein column_1 is like “School Principal of” or “Company Principal of”.

Regex of column_1 is like “^.(principal).$”. Yeah, basically just grabbing “Principal” if it exists and if not then completely remove the whole string.

Any ideas? Thanks!

Answer

Since you know the fixed format of the output you want in the case of a match, I don’t see the need to even do the replacement on column_1. Use this version:

GROUP_CONCAT(
    CASE WHEN b.column_1 LIKE '%Principal%'
         THEN CONCAT(a.start_date,  ' ~ ' , a.end_date, ' Principal')
         ELSE '' END
    ORDER BY a.start_date ASC SEPARATOR '') AS roles

Note that you could use REGEXP here to match Principal, but unless you are really worried about word boundaries, I might stick with the LIKE approach you are currently using.