Constructing a full text index

The code to create a new full text index instance looks like this:

var index = new FullTextIndex<MyType>(m => m.Text); 
var updatableIndex = new UpdatableFullTextIndex<MyType>(m => m.Text); 
From this you can see that FullTextIndex and UpdatableFullTextIndex are generic types whose type parameter is the type of item that the index will contain. The constructor takes a delegate that describes how the index should retrieve the text for each item that gets indexed.

There are two approaches that you can take to indexing items: indexing keys and indexing .NET objects.

Indexing keys

The first approach is to index a unique key for each of your items. This means that the index itself only stores an id for the indexed item, not the item. The key might be an integer, a Guid or another primitive type.
The constructor for this approach looks like this (note that the generic type argument of FullTextIndex is int):

var index = new FullTextIndex<int>(i => LoadTextForItem(i));
This approach has two main benefits:
  • The index is much easier to serialize and de-serialize, because it only contains keys
  • Once an item has been indexed the associated text doesn’t need to be retained in memory
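The second benefit can be illustrated with a rough sketch. ToyIndex<T> below is a hypothetical stand-in written for this example, not the library's implementation, and the dictionary behind LoadTextForItem is likewise assumed (in practice it might be a database or file lookup). The point it tries to show is that the text is read once when an item is added, and only the int keys are kept afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy index for illustration only; NOT the library's implementation.
public class ToyIndex<T>
{
    private readonly Func<T, string> readText;
    private readonly Dictionary<string, HashSet<T>> postings =
        new Dictionary<string, HashSet<T>>(StringComparer.OrdinalIgnoreCase);

    public ToyIndex(Func<T, string> readText)
    {
        this.readText = readText;
    }

    // Reads the item's text once, then stores only the item (here, the key) per word.
    public void Add(T item)
    {
        var words = readText(item).Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            HashSet<T> items;
            if (!postings.TryGetValue(word, out items))
            {
                items = new HashSet<T>();
                postings[word] = items;
            }
            items.Add(item);
        }
    }

    public IEnumerable<T> Find(string word)
    {
        HashSet<T> items;
        return postings.TryGetValue(word, out items) ? items : Enumerable.Empty<T>();
    }
}

public static class KeyIndexingDemo
{
    // Assumed document store standing in for a database or file system.
    private static readonly Dictionary<int, string> Store = new Dictionary<int, string>
    {
        { 1, "the quick brown fox" },
        { 2, "the lazy dog" }
    };

    public static string LoadTextForItem(int id)
    {
        return Store[id];
    }

    public static int[] Run()
    {
        var index = new ToyIndex<int>(i => LoadTextForItem(i));
        index.Add(1);
        index.Add(2);
        // The result contains ids only; the caller reloads the item if it needs the text.
        return index.Find("lazy").ToArray();
    }
}
```

Because the toy index holds nothing but words and ints, persisting it would only involve primitive values, which is the same reasoning behind the first benefit.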

Indexing .NET objects

If your items are always loaded into memory and they contain all the text that you want to index, you can index them directly:

var index = new FullTextIndex<Customer>(c => c.CustomerDetails);
Here the generic type argument of FullTextIndex is Customer.
The drawback of this approach is that every indexed instance is retained in memory, along with all of its properties, including the text.
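To make the memory trade-off concrete, here is a minimal sketch that does not use the library at all. It builds a plain word-to-customer lookup with LINQ, assuming a hypothetical Customer class with Id, Name and CustomerDetails properties (only CustomerDetails appears in the example above). Each entry in the lookup is a reference to the Customer instance itself, so the whole object stays reachable for as long as the lookup does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only CustomerDetails is taken from the example above.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string CustomerDetails { get; set; }
}

public static class ObjectIndexingDemo
{
    // Builds a word -> customers lookup. Every entry references the Customer object,
    // so the object and all of its properties remain in memory.
    public static ILookup<string, Customer> Build(
        IEnumerable<Customer> customers, Func<Customer, string> readText)
    {
        return customers
            .SelectMany(c => readText(c)
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Distinct(StringComparer.OrdinalIgnoreCase)
                .Select(word => new { Word = word, Customer = c }))
            .ToLookup(x => x.Word, x => x.Customer, StringComparer.OrdinalIgnoreCase);
    }

    public static bool Run()
    {
        var alice = new Customer { Id = 1, Name = "Alice", CustomerDetails = "prefers email contact" };
        var bob = new Customer { Id = 2, Name = "Bob", CustomerDetails = "prefers phone contact" };

        var lookup = Build(new[] { alice, bob }, c => c.CustomerDetails);

        // A match is the very same instance that was indexed, not a copy or a key.
        Customer match = lookup["phone"].Single();
        return ReferenceEquals(match, bob);
    }
}
```

The upside of holding references is that a search result can be used straight away without reloading anything; the cost is the retained memory described above.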

Last edited Feb 1, 2011 at 12:26 PM by MikeGoatly, version 3
