Posted on 31-08-2009
Filed Under (mysql, php) by Elfin Ruler

Getting Started With MySQL’s Full-Text Search Capabilities – What is Full-Text Searching?
(Page 2 of 6 )

Imagine having a database that contained 10,000 tables. In each of these tables there are 1,000 rows with 100 fields. How would you effectively search this sort of information structure without killing your web server? The answer is MySQL’s full-text search capabilities.

A full-text search makes use of indexes, which you can define against a table either when it is created or by using MySQL’s ALTER TABLE command. These indexes are setup on specific fields of a table and change the way that MySQL stores the records for that particular table.

For example, let’s say that you have a table to store the title, price, description, availability and picture of some computer products. When users perform a search, they will most commonly enter part of the products title or description.

If we setup an index on the title and description fields in our MySQL database, then we can make MySQL store the records for the products table in an indexed format, meaning that when a user performs a search, MySQL can retrieve related records quicker, because it has already indexed (or ordered) them.

Full-text searches are faster than other search methods such as wildcard or character based searches, which are commonly performed using MySQL’s LIKE command.

Besides being significantly quicker than normal character based searches, why would you want to use full-text searching? Here’s a quick list of points to wet your appetite:

  • Full-text searching is ideal for extremely large databases that contain thousands or even millions of rows. Computations are performed faster and rows can be ranked based on search relevance, which is returned as a decimal number by MySQL.
  • Noise words and any words that are 3 characters or less in length such as the, and, etc are removed from the search query. This means that more accurate results are returned. If you searched for “as the people”, then the noise words “the” and “as” will automatically be removed from your query.
  • In addition to simple searches, full-text searches can also be performed in Boolean mode. Boolean mode allows searches based on and/or criteria, such as “+person +Mitchell”, which would only return all records that contained the words person AND Mitchell. We will look at Boolean searches later in this article.
  • The query is case-insensitive, meaning that “cat” is ranked the same as “Cat”, “CAT” and “cAT”.

As you can see, full-text searching is fast, powerful and smart. It eliminates the need for us to write complicated search and Boolean algorithms, and you can be up and running with MySQL’s full-text searching and indexing capabilities in under 5 minutes.

Your First Full-Text Search
Let’s jump right in and see full-text searching in action. First off we need to create our database, so fire up the MySQL console window and create a new database called testDB, like this:

create database testDB:
use testDB;

Next, we need to create a table using MySQL’s FULLTEXT command, specifying which fields we want to index for searching:

create table testTable
(
pk_tId int auto_increment not null,
firstName varchar(20),
lastName varchar(20),
age int,
details text,
primary key(pk_tId),
unique id(pk_tId),
fulltext(firstName, lastName, details)
);

What we’ve done here is create a basic table that will store the details of some people. The first field is a uniquely identified primary key and the rest are simple varchar, int and text fields.

If you already have a table setup and want to index existing fields, use the ALTER TABLE command like this: ALTER TABLE myTable ADD FULLTEXT(field1, field2);

Take a look at the highlighted line in our CREATE TABLE command, shown above:

fulltext(firstName, lastName, details)

This line tells MySQL to set an index on the firstName, lastName and details fields of our table. Indexes can only be created on fields of type VARCHAR and TEXT. Because these fields contain indexes, we can now use the power of a MySQL full-text search to find records in this table based on the values in these 3 fields.

Before we can perform a full-text search however, we need to add some records to our table with the following MySQL commands:

insert into testTable values(0, ‘Mitchell’, ‘Harper’, 20, ‘Mitchell is the founder and manager of devArticles and various other sites across the SiteCubed network’);

insert into testTable values(0, ‘Ben’, ‘Rowe’, 19, ‘Ben is our Flash guru. He is currently writing a series of Flash articles that show beginners how to use Flash to create dynamic content with PHP’);

insert into testTable values(0, ‘Havard’, ‘Lindset’, 19, ‘Havard is a PHP and MySQL power user’);

insert into testTable values(0, ‘Michael’, ‘Manatissian’, 25, ‘Michael is the senior devArticles editor and is a capable Linux/Apache administrator’);

insert into testTable values(0, ‘Sandra’, ‘Lee’, 23, ‘Sandra is our finance and accounting wizard, having 4 years of experience under her belt’);

Because full-text searching was designed for larger databases, it is possible for MySQL to return incorrect results when it’s used on tables containing smaller amount records, that’s why I’ve created 5 “chunky” records for our test table.

Now that we have records in our table, how do we search them? Simple -– we create a query to invoke MySQL’s full-text search feature, like this:

select firstName from testTable where match(firstName, lastName, details) against(‘guru’);

Getting Started With MySQL’s Full-Text Search Capabilities – What is Full-Text Searching (contd.)

(Page 3 of 6 )

This query returns the following result:

Let’s analyze our query. First up, we have the SELECT and FROM parts of our query:

select firstName from testTable

There’s nothing unusual about this part of the query: it simply tells MySQL to retrieve the firstName field from the table called testTable. Next up, we have the WHERE clause:

where match(firstName, lastName, details) against(‘guru’);

This is where the power of full-text searching starts. In the first part of the query we call the MATCH command, which tells MySQL to match against the values of the firstName, lastName and details fields when it performs a full-text, natural language search.

When the MATCH command is used as part of the SELECT clause it returns a relevance ranking, which is a positive decimal number. The closer to 0 this number is, the less relevant the record is. This relevance value is determined based on the search expression, the number of words in the indexed fields, as well as the total number of records being searched.

Lastly, we have the AGAINST command. AGAINST is simple enough and accepts just one parameter, which is the string that we’re searching for. Later in this article we will see how to perform boolean searches using the AGAINST command in combination with the IN BOOLEAN MODE keywords.

So far so good, right? But how does a full-text search differ from using the LIKE command in a query like this:

select firstName from testTable where details like ‘%guru%’;

… they both return the same result don’t they? Well yes and no. Let’s now see how we can determine a relevance ranking for each of the records returned from a MySQL full-text search.

Determining A Relevance Ranking
Still working with our test table, take the following query into consideration:

select concat(firstName, ‘ ‘, lastName) as name, match(firstName, lastName, details) against(‘devArticles’) as relevance from testTable where match(firstName, lastName, details) against(‘devArticles’);

This query produces the following results:

In this query I’ve included the MATCH command in the SELECT clause to return a relevance ranking for each record. The query performs a full-text search against the firstName, lastName and details fields for the string “DevArticles”:

where match(firstName, lastName, details) against(‘devArticles’);

The query has returned two records, which contained the string “devArticles” in either their firstName, lastName or details fields. Looking back at our records, we can see that the records for Mitchell and Michael contained the string “DevArticles” in their details fields:

‘Mitchell is the founder and manager of devArticles and various other sites across the SiteCubed network’

‘Michael is the senior devArticles editor and is a capable Linux/Apache administrator’

The relevance field returned from our query was generated with the following expression:

match(firstName, lastName, details) against(‘devArticles’) as relevance

Looking carefully at the query, we can see that this expression has been used twice: once in the SELECT clause and once in the WHERE clause. MySQL picks up on this and only performs one full-text search on the table and not two, meaning that there is no additional overhead produced for a query like this –- a huge bonus and time saver if a similar query was performed on a table with 5 million rows.

When the MATCH command is used in the WHERE clause, MySQL automatically sorts the rows from highest to lowest relevance. In our previous example query we only returned records that actually had a match against the string “devArticles” when a full-text search was conducted. Here’s a query that returns the relevance ranking for every record in our table:

select concat(firstName, ‘ ‘, lastName) as name, match(firstName, lastName, details) against(‘devArticles’) as relevance from testTable;

We’ve left out the WHERE clause, so the resultant records are unordered, as we can see below:

(0) Comments    Read More   
BACK-LINKS and PAGE-RANK WEB DIRECTORY

Free targeted website traffic - www.ad-traffic.net