Regex – “The” searching

Say you have a list of movie titles, and you want to either sort them, or search through them, and some of them have “The ” at the start, for example:

  • The Simpsons
  • Simpson Street

When doing a MySQL search:

SELECT * FROM movies WHERE title LIKE "The Simp%";

Would only return the first row. But if you are working in a company where there is no standard set, the movie title could be formatted as “Simpsons, The” – and then, it won’t be found.

To solve this, you could replace the “The ” letters with blank, and then sort out the field contents during the query:

$str_query = preg_replace("/(title like "(the )(.*)%")/i",
    "REPLACE(LOWER(title), "the ", "") LIKE ("$3%")",    
    $str_query);

This will change :

"(title LIKE "The Simpsons")"

to,

"(title LIKE "Simpsons")"

But, the (the) in line 2 tells PHP to only replace it if starts with “The ” (case insenstive).

However, what if you want to search for “the”  (not sure why you would…)

You need to do a negative lookahead, to tell the expression to only carry on, if the search phrase is not exactly “the”

if (preg_match("/(title like "(?!the)(.*)%")/i", $str_query)) {

The (?!the) is the readahead.

(.*) matches any string but it is greedy and you have to be carfeul that it doesn’t just accept everything to the end of $str_query. (but its okay in our case, as we are looking for % (the LIKE wildcard))

After all this, we can run:

SELECT * FROM movies WHERE $str_query;

But what about sorting? All the titles beginning with “The” will appear in the T section. Whereas really, we want the Simpsons to appear in the S section.

Add an easy ORDER BY clause here:

SELECT * FROM movies WHERE $str_query ORDER BY (REPLACE(title, "the ", "") ASC;

Sorted!

10 comments on “Regex – “The” searching

  1. admin June 8, 2010 9:43 am

    There are many. It depends on what level you want. Do you want a free account? Have you got your own web server? Are you confortable with installing apps on a server?

    You stated a CMS, this is more than a blog, this is more of a collabatory website.

    If you just want a free blog, on a free server, wordpress is very popular, then there is google blogger (http://www.blogger.com) – alternatively, search for free blogs on google!

    Also, check to see what sort of domain name you’d get. Would http://www.domain.com/yourname be acceptable to you? if not, you might have to find a paid solution.

    HTH, Chris.

  2. Kelli Jones January 20, 2011 6:40 am

    Oh God, even just finding a simple movie title would give you problems.I’ve tried it out and I couldn’t find it.So frustrating.as is so confusing.how about if looking for a movie that really starts with only “the”? Is it possible to find the right partner of it? will it give us a list of the movie title that starts with “the” only and you don’t know the other remaining words?

  3. miltary February 28, 2011 7:43 am

    correct key word can be possible just try!!!

  4. college February 28, 2011 8:06 am

    i need some details and information in this site.

  5. benefits March 1, 2011 12:42 am

    share me more of your information in this site.

  6. familytr March 1, 2011 1:35 am

    why is it came like this? it’s actually very frustrating to me that i fond this out very hard.

  7. financib March 1, 2011 1:47 am

    i extremely yes,your post but i want more details

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>