Recently I have to implement prefix searching for a project. The searching need to be scalable and the result are to be returned as quickly as possible. The front end for this search was a Ajax suggestion like search. The screen shot below shows the prefix searching in action.

Ok this pretty simple Ajax stuff, nothing fancy and nothing new. What I am going to discuss is the backend not the frontend for such a prefix searching component.
One of the most common implementation used for prefix searching is:
select * from some_table where some_field like ‘prefix%’
In other word put all the words to be searched in a table and use the like clause. This works and has reasonable performance too for small number of queries. But if you have continuous deletion and insertion of new search words then this solution does not scale well for large number of users.
So let put these words in memory for faster response time. How do we search prefixes if we have all the words in the memory?
Put them in a list and do a sequential search? Ok you guess that too slow….. Put them in a hash table? What key we are going to search in the hash table we have only part of the key (prefix)!!
You must agree prefix searching is a specialized problem which requires a special data structure that is geared toward prefix searching. You have read the title so you know the name of that data structure. Yes Radix tree is one of such data structure that is suitable for prefix searching. A radix tree, Patricia trie/tree, or crit bit tree is a specialized set data structure based on the trie that is used to store a set of strings.
Trie is a multi-way tree structure useful for storing strings over an alphabet. The idea is that all strings sharing a common stem or prefix hang off a common node. The elements in a string can be recovered in a scan from the root to the leaf that ends a string. All strings in the trie can be recovered by a depth-first scan of the tree.
Below is a trie for keys “to”, “tea”, “ten”, “i”, “in”, and “inn”.

You can consider a Radix tree a condense form of Trie, in which intermediate nodes are merged together to save space. Below is shown the insertion process in a Radix tree.

I wrote a simple Java implementation for radix tree for the prefix search application mentioned above. You can find the code at http://code.google.com/p/radixtree/
I like tree structures because of their recursive nature, looks really natural to me
Most of the operation on Radix tree like insert, delete, find, etc can be easily implemented a recursive manner. Apart from the insert method another interesting method in implementation is visit() method it is there to help with other operation like delete, find and contains
/**
* recursively visit the tree based on the supplied "key". calls the Visitor
* for the node those key metches the given prefix
*
* @param prefix
* The key o prefix to search in the tree
* @param visitor
* The Visitor that will be called if a node with "key" as its
* key is found
* @param node
* The Node from where onward to search
*/
private void visit(String prefix, Visitor visitor,
RadixTreeNode parent, RadixTreeNode node) {
int i = 0;
int keylen = prefix.length();
int nodelen = node.getKey().length();
// match the prefix with node key
while (i < keylen && i < nodelen) {
if (prefix.charAt(i) != node.getKey().charAt(i)) {
break;
}
i++;
}
// if the node key and prefix match, we found a match!
if (i == keylen && i == nodelen) {
visitor.visit(prefix, parent, node);
} else if (node.getKey().equals("") == true // either we are at the
// root
|| (i = nodelen)) { // OR we need to
// traverse the childern
String newText = prefix.substring(i, keylen);
for (RadixTreeNode child : node.getChildern()) {
// recursively search the child nodes
if (child.getKey().startsWith(newText.charAt(0) + "")) {
visit(newText, visitor, node, child);
break;
}
}
}
}
Off course you can have better look at the implementation by visiting the online code repository at http://radixtree.googlecode.com/svn/trunk/RadixTree/
Update: I have remove the link to the old download distribution, Please download the latest code from http://code.google.com/p/radixtree/ As there are some bug fixes and improvements.
Tags: code, Data Structure, Java, Tree
January 29, 2008 at 5:46 pm |
What about using LINQ into objects ?
January 29, 2008 at 6:57 pm |
Yes, LINQ can be good M$ .Net world alternative to SQL “like” clause for objects in memory.
Radix tree are type of Tire which specialized tree for variety of purposes, One example i can think of quickly is “spell suggestions”
March 24, 2008 at 8:53 am |
thats it, man
April 28, 2008 at 8:47 pm |
I tried to create a tree of strings but failed.
can someone provide any info on how to go about it?
thanks.
April 29, 2008 at 12:24 pm |
Jack,
Please download latest code from http://code.google.com/p/radixtree/
There are bugs fixes and improvements. Let me know if you are still having problem. Even better post an issue here http://code.google.com/p/radixtree/issues/list
April 30, 2008 at 12:51 am |
bootstrapping, thanks for the link.
i am already using the google code.
my prob is i think i m not doin it right.
i created a main in the radixtreeimpl class and created a tree as:
RadTreeImpl tree = new RadTreeImpl();
tree.insert(“a”, “apple”);
tree.insert(“a”, “ape”);
and so on
but i cant display the tree on the console.
please help.
April 30, 2008 at 6:58 am |
You are adding duplicate keys. The first parameter to insert() method is the key and it should be unique (In the future it will support duplicate keys). So you need to do something like this.
RadixTreeImpl tree = new RadixTreeImpl();
tree.insert("apple", "apple");
tree.insert("ape", "ape");
tree.display();
Remember the display() function is for testing purpose only and should not be used otherwise.
Thanks,
TH
December 1, 2008 at 10:03 am |
Thanks for the code, is exactly what I was searching about
February 20, 2009 at 8:35 pm |
Your animated insertion graphic has an error. When apartheid is added, the apartments node (artments) should change to (mentsheid). Otherwise, great tutorial!
February 20, 2009 at 10:35 pm |
Sorry, forgot to do my HTML escaping (and there is no preview). Again:
When “apartheid” is added, the apartments tree starting at node “artments” should change to “ments”<-”art”->”heid”.
March 3, 2009 at 1:44 pm |
Hello webmaster
I would like to share with you a link to your site
write me here preonrelt@mail.ru
April 15, 2009 at 12:11 pm |
My fellow on Facebook shared this link with me and I’m not dissapointed at all that I came here.
June 14, 2009 at 1:00 am |
I’d like to congratulate for the good Java implementation that you created for RadixTrees.
I’m really using your code, but an approach that we still can’t reach with it, it’s to use incremental search. I though in some ideas, like a searchPrefix return a RadixTreeNode or a RadixTree, thus it could be make a searchPrefix over a searchPrefix. Therefore, associating with a high level implantation we could have: once the user tips a key, this implementation would give the data associate with that key, after the user tips another key, a search would be performed over the RadixTree returned in the last search.
I don’t know if I made my point clear enough.
But maybe I’d like to implement it on next weeks.
Let’s keep contact.
Thanks.