So I was having a little trouble getting full text search to work with the GUI in SQL Server Express with Advanced Services (formerly SQL Server 2005 Express SP1), so I had to do things manually. It was probably a permissions or setup issue with SQL Server Expres or the tools. In addition to setting up FTS, I wanted a search query to weight columns differently in the search rankings — something that SQL Server FTS doesn’t really support.
Setting up Full Text Search
First I had to download and install SQL Server Express with Advanced Services. It’s big, but comes with the goodies I wanted.
Then I connected to my SQL Server Express database using SQL Server Management Studio so I could type in some queries. If your SQL Server Express database is in your Visual Studio Project’s App_Data folder, you may be out of luck — I wasn’t able to get full text search to work on those, although maybe adjusting permissions would do it.
Once connected to the database, I created a full text catalog
CREATE FULLTEXT CATALOG MyFTCatalog
Next I needed to get the name of a unique index for my table. You can only create full-text indexes on tables with a single-key unique index (e.g. an autonumber primary key index). Remember that your unique index doesn’t have to be on the columns that you want to perform full text searches on.
I had a table called Listing a primary key of IdListing and three varchar fields I wanted to search on: Address, Realtor, and Notes. My table already had a unique index called PK_Listing_IdListing, so it was time to create a full-text index on the three columns I wanted to be able to search on:
CREATE FULLTEXT INDEX ON Listing (Address, Realtor, Notes)
KEY INDEX PK_Listing_IdListing
ON MyFTCatalog
WITH CHANGE_TRACKING AUTO
What the above query did is create a full-text index on those three Listing table columns and store it in the full-text catalog named MyFTCatalog. I indicated PK_Listing_IdListing as the index to help uniquely identify rows on the Listing table, and I told the Full Text Search engine to automatically update the full-text catalog if values in the table change.
Lastly I did a quick check to confirm the catalog existed and wasn’t still building
SELECT FULLTEXTCATALOGPROPERTY(‘MyFTCatalog’, ‘Populatestatus’)
And we’re set up. Now it was time to query. And man is it hot in here. I guess overclocking your PC makes for a sweaty summer. Anyhow…moving on.
Performing Weighted Queries
There are plenty of pages about performing full-text queries in SQL Server. Here’s a place to start.
So my first query looked like this
SELECT IdListing, Address, Realtor, Notes
FROM Listing
WHERE FREETEXT(*,‘some keywords’)
The * tells FTS to perform the search on all columns in the full-text index. But the query wasn’t going to work for me, since it doesn’t give more weight to one column over the other. Plus, in order to sort results by ranking, I needed to use the *TABLE full-text queries. I’m partial to FREETEXTTABLE because it already does all the stemming/etc for me.
Then I did a UNION query like this
SELECT TOP 100 Rank, Address, Realtor, Notes
FROM
(
SELECT f.Rank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
FREETEXTTABLE(listing, Address, ‘some keywords’) as f
ON l.idListing = f.[KEY] UNION
SELECT f.Rank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
FREETEXTTABLE(listing, Realtor, ‘some keywords’) as f
ON l.idListing = f.[KEY] UNION
SELECT f.Rank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
FREETEXTTABLE(listing, Notes, ‘some keywords’) as f
ON l.idListing = f.[KEY]) as myTable
ORDER BY Rank DESC
which I quickly rewrote to
SELECT TOP 100 f.Rank, l.Address, l.Realtor, l.Notes
FROM Listing l INNER JOIN
(
SELECT Rank, [KEY] from FREETEXTTABLE(listing, Address, ‘some keywords’)
UNION
select Rank, [KEY] from FREETEXTTABLE(listing, Realtor, ‘some keywords’)
UNION
select Rank, [KEY] from FREETEXTTABLE(listing, Notes, ‘some keywords’)
) as f
ON l.IdListing = f.[KEY]ORDER BY f.Rank DESC
and then added some weights to the rankings, like so.
SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
(
SELECT Rank * 5.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Address, ‘some keywords’)
UNION
select Rank * 3.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Realtor, ‘some keywords’)
UNION
select Rank * 1.0 as WeightedRank, [KEY] from FREETEXTTABLE(listing, Notes, ‘some keywords’)
) as f
ON l.idListing = f.[KEY]ORDER BY f.WeightedRank DESC
Pretty good. You have the column weighting, and you could wrap it up in a nice little stored procedure and be good to go.
However, there was one last thing I needed. I really wanted a query that would to combine column rankings, so that if there were hits in multiple columns, the rank would be higher than a hit in a single column. So this is what I came up with.
SELECT TOP 100 f.WeightedRank, l.Address, l.Realtor, l.Notes
FROM listing l INNER JOIN
(
SELECT [KEY], SUM(Rank) AS WeightedRank
FROM
(
SELECT Rank * 5.0 as Rank, [KEY] from FREETEXTTABLE(listing, Address, ‘some keywords’)
UNION
select Rank * 3.0 as Rank, [KEY] from FREETEXTTABLE(listing, Realtor, ‘some keywords’)
UNION
select Rank * 1.0 as Rank, [KEY] from FREETEXTTABLE(listing, Notes, ‘some keywords’)
) as x
GROUP BY [KEY]) as f
ON l.idListing = f.[KEY]ORDER BY f.WeightedRank DESC
Notice how I’m grouping the inner UNION query by [KEY] (in this case, Listing.IdListing) and SUMming the weighted ranks. That allows us to push results with hits in multiple columns higher up in the search rankings. Obviously it’s not going to perform as well as a simpler query, but the ranking was important for this project.
Conclusion
So, there ya go. Installing SQL Server Express isn’t too bad, although it’s a big download. Setting up Full Text Search seemed to work best for me from the command line. And, now you have a way to rank matches with different columns having different weights.
Update: An Alternate Approach
Hilary Cotter (SQL MVP & FTS guru) provided an alternate query. I did a few tests & both seemed comparable in performance, although I didn’t test using very large data sets. I made a slight change to his query and added a WHERE clause so that only matches are returned.
select TOP 100
idListing, Address, Realtor, Notes,
RankTotal=isnull(RankAddress,0)+isnull(RankRealtor,0)+isnull(RankNotes,0)
from listing
left join (SELECT Rank * 5.0 as RankAddress, [KEY] from
FREETEXTTABLE(listing, Address, ‘Street’)) as k
on k.[key]=Listing.idListing
left join (select Rank * 3.0 as RankRealtor, [KEY] from
FREETEXTTABLE(listing, Realtor, ‘Street’)) as l
on l.[key]=Listing.idListing
left join (select Rank * 1.0 as RankNotes, [KEY] from
FREETEXTTABLE(listing, Notes, ‘Street’)) as m
on m.[key]=Listing.idListing
WHERE RankAddress IS NOT NULL OR RankRealtor IS NOT NULL OR RankNotes IS NOT NULL
ORDER BY RankTotal DESC
Hilary also provided a script (run it in Query Analyzer or in a Query Tab in SQL Mgmt Studio) to set up a test database so you can try the query out yourself. I modified it to seed the test database with a bunch of records (since with only a few records, even LIKE is faster that FTS):
create database realtor
go
use realtor
GO
sp_fulltext_database ‘enable’
GO
Create fulltext catalog realtor as default
GO
create table Listing(
idListing int not null identity constraint ListingPK primary key,
Address varchar(200), Realtor varchar(200), Notes varchar(200))
GO
— add initial seed records
insert into Listing(Address, Realtor, Notes)
values(‘123 Any Street’,‘John Street’,‘the word on the street is good’)
insert into Listing(Address, Realtor, Notes)
values(‘123 Any Road’,‘John Street’,‘the word of mouth is good’)
insert into Listing(Address, Realtor, Notes)
values(‘123 Any Road’,‘John Smith’,‘the word on the street is good’)
insert into Listing(Address, Realtor, Notes)
values(‘123 Any Street’,‘John Smith’,‘the word of mouth is good’)
GO
— multiply seed records, get up over 1M rows
— might take a while
PRINT ‘Please wait a few minutes while the database is seeded’
DECLARE @i int
SET @i = 0
WHILE (@i < 18)
BEGIN
insert into Listing(Address, Realtor, Notes)
select TOP 10 Address, Realtor, Notes from Listing
SET @i = @i + 1
PRINT convert(varchar,@i)
END
PRINT ‘Database has been seeded’
GO
PRINT ‘Please wait a few minutes while the fulltext index is built’
GO
create fulltext index on listing(Address, Realtor, Notes)
key index ListingPK
GO
— check the below query. When it returns zero, the FT index is done building.
SELECT FULLTEXTCATALOGPROPERTY(‘realtor’, ‘Populatestatus’)
GO