Learned: The mystery/misery of MySQL Full-Text Search

December 04, 2007 · 2 comments

I have been relying on MySQL’s Full-Text search index for sometime.

Recently a user ran into a black hole in that feature.

He wanted to find records containing a simple phrase, “thank you”.

But, the search resulted in no records found.

You see, “thank” and “you” are stop words in the default full-text index. Meaning they are discarded or given no relevance in the search results.

After some digging, I discovered that others have run into this same limitation and opted to remove the default stop word list.

Quick fix? Not quite.

Removing the stop word list, forces you to rebuild the index. Which on my modestly sized database meant 30 minutes of downtime on that table.

Fair enough. I performed that over the weekend.

Fast forward a couple days. Guess what happens now?

Other users are getting less than stellar search results - of course. Seemed a fair trade off at the time, between the two extremes.

I could keep tuning the stop words and narrowing the gap between the two extremes, but tuning mysql full-text index takes far too long because of the table locking required to rebuild the index.

So I am moving on to external search indexers, like Sphinx. Which can rebuild the entire index in a few minutes without requiring the downtime for the users.

2 responses so far ↓

  • 1 Warren Smith // Apr 03, 2008 at 10:34 PM

    These are not limits of mySQL's full text searching, they are limits on the implementation in your application. You should be filtering the search words anyway before handing them to mySQL anyway, or else what if I am searching for some sort of hyphenated phrase? mySQL will assume the phrase "some-word" is searching for "some" but not "word", there are countless more examples. It is a tool you lean on, not a black box searching solution, or else we would all be linking mySQL databases and all have a mini Google. Apply basic filtering/manipulations to the search terms, sort by relevance and don't forget that a lot of this behaviour can be easily altered with mySQL based configuration changes.
  • 2 Lon // May 19, 2008 at 08:19 PM

    Thanks warren. I agree, but also feel that moving search out of MySQL into a more specialized system has some benefits. That said, I have been playing with the idea of a using a dedicated MySQL server that is optimized for fulltext search as the data set grows. A replicated read-only slave specifically tuned for searching.

Leave a Comment