SQL Server: Full-Text Searching Across Multiple Tables

Sometimes a full-text search needs to be performed on multiple tables. For example, if we have two tables, Post and Comment, we may want to search for posts that have keywords both in the post text and in the comments.

SQL Server doesn’t allow to include multiple tables in the CONTAINS and FREETEXT predicates. We can of course use several predicates in our SQL query, but this can get pretty complex if we need to search in many columns.

Luckily there’s a workaround. Starting from version 2005, SQL Server allows to add FTS indices to views. Knowing this, we can create a view that combines text columns from both tables and then search on it.

There are a few caveats:

  1. The view must be created with the SCHEMABINDING option.
  2. Tables in the view must be referenced by their schema name, e.g. dbo.Post
  3. Before adding an FTS index, the view must have a unique clustered index. This itself limits the kinds of views that can be indexed. Among other things, the SELECT statement of the view cannot contain: UNION operator, sub-queries, derived tables, OUTER JOINs, some aggregate functions, rowset functions, etc.
  4. The FTS index cannot be added on text, ntext, image and xml columns. Thankfully, it can still be added on varchar(max), nvarchar(max) and varbinary(max) columns, introduced in version 2005.

Here is how our fictitious example can be implemented.

The tables:

CREATE TABLE Post
(
    ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    Text nvarchar(max) NOT NULL
)
CREATE TABLE Comment
(
    ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    PostID int NOT NULL FOREIGN KEY REFERENCES Post(ID),
    Text nvarchar(max) NOT NULL
)

The view and its unique index:

CREATE VIEW ViewSearch WITH SCHEMABINDING AS
SELECT c.ID, p.Text AS PostText, c.Text AS CommentText
FROM dbo.Comment AS c
JOIN dbo.Post AS p ON p.ID=c.PostID
GO
CREATE UNIQUE CLUSTERED INDEX IX_ViewSearch ON ViewSearch (ID)

After the full-text catalog is created and the index is added on the view, we can use this query:

SELECT DISTINCT p.ID
FROM Post AS p
JOIN Comment AS c ON c.PostID=p.ID
JOIN ViewSearch AS s ON s.ID=c.ID
WHERE CONTAINS(s.*, ‘foo’) AND CONTAINS(s.*, ‘bar’)

Unfortunately a single CONTAINS(s.*, ‘"foo" AND "bar"’) won’t work, SQL Server always uses OR when CONTAINS is applied to multiple columns.

If you have a keen eye you may have noticed a problem in the solution above. It doesn’t work for posts that have no comments at all. Ideally we would need to use a LEFT JOIN in our FTS view, but the SQL Server wouldn’t let us to create a unique index on it.

A possible workaround is to add a full-text index on Post.Text and to rewrite the query like this:

SELECT p.ID
FROM Post AS p
JOIN Comment AS c ON c.PostID=p.ID
JOIN ViewSearch AS s ON s.ID=c.ID
WHERE CONTAINS(s.*, ‘foo’) AND CONTAINS(s.*, ‘bar’)
UNION
SELECT p.ID
FROM Post AS p
WHERE CONTAINS(p.Text, ‘"foo" AND "bar"’)

4 Responses to “SQL Server: Full-Text Searching Across Multiple Tables”

  1. dotnetnoob Says:

    Wow…I didn’t know you could do this, and it really simplifies querying FTcats with multiple tables…thanks it works great!

  2. Alexander Kojevnikov Says:

    You are welcome!

  3. dotnetnoob Says:

    Only problem now is, I can’t update fields in the tables that are used in the view…I get

    “UPDATE failed because the following SET options have incorrect settings: ‘ARITHABORT’. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or query notifications and/or xml data type methods. ”

    I tried adding “SET ARITHABORT ON; UPDATE…” but I still get an error.

  4. dotnetnoob Says:

    FWIW, I found out what the problem was…the dB was still in sql2000 compatibility level (80), and when I raised it to 90 everything is working again!

Leave a Reply