Thursday, April 30, 2009

Search Engine Death Match - Solr Wins

It has now been over a month since we rolled out our Lucene/Solr-powered replacement for the Verity search engine in our application, and let me sum it up this way: zero maintenance.

Even though the OEM Verity license states a 250K-record maximum, we ran into a ceiling on a few of our machines once the number of Verity collections reached 82-88. We were constantly using rcadmin to drop the thread count for each collection from 3 to 2, but on each restart, or at random points in time, the thread counts would return to 3. Instability and random collection switches were the main reasons we moved off the platform, and we'll never look back.

Solr took some configuration, but it was more than flexible enough for our needs, and its delta updates and deletes brought the maintenance down to near zero.
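For anyone who has not seen what a delta update or delete looks like on the wire, here is a minimal sketch in Python (not the ColdFusion code from our application) that builds the two XML messages Solr's update handler accepts: an `<add>` holding only the changed documents and a `<delete>` listing the removed IDs. The function names and the `id` / `title` fields are assumptions for illustration; your schema will differ.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr <add> message from a list of dicts (field name -> value).

    Only documents that changed since the last run need to be included;
    Solr replaces any existing document that has the same unique key.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_delete_message(ids):
    """Build a Solr <delete> message that removes documents by unique key."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


# Hypothetical sample data: one changed record and two removed records.
add_xml = build_add_message([{"id": 42, "title": "Updated record"}])
delete_xml = build_delete_message([7, 9])
```

Each message would be POSTed to the Solr update handler and followed by a `<commit/>` so the changes become searchable. Because only the changed and removed records are sent, there is no need to rebuild a whole collection.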

As for parsing the resulting XML, that was done efficiently with ColdFusion's XmlParse and XmlSearch, walking the nodes and returning a Verity-like recordset to the application. With ColdSpring we were able to swap out implementations easily, and everything just worked. Great planning and execution resulted in a seamless rollout, with Solr running on Red Hat on a dedicated machine.
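A minimal sketch of the parsing idea, again in Python rather than CFML and not the production code: parse the Solr XML response, select the `<doc>` nodes with an XPath-style query, and flatten each one into a row that uses Verity-style column names (`key`, `title`, `score`). The sample response, the function name, and the mapping from the Solr `id` field to `key` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a Solr XML query response.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">101</str>
      <str name="title">First hit</str>
      <float name="score">1.5</float>
    </doc>
    <doc>
      <str name="id">102</str>
      <str name="title">Second hit</str>
      <float name="score">0.75</float>
    </doc>
  </result>
</response>"""


def solr_to_recordset(xml_text):
    """Flatten a Solr XML response into Verity-like rows."""
    root = ET.fromstring(xml_text)
    result = root.find("./result[@name='response']")
    if result is None:
        return {"found": 0, "rows": []}

    rows = []
    for doc in result.findall("doc"):
        # Each child element carries the field name in its "name" attribute.
        fields = {el.get("name"): el.text for el in doc}
        rows.append({
            "key": fields.get("id"),          # assumed mapping: Solr id -> key
            "title": fields.get("title"),
            "score": float(fields.get("score", 0)),
        })
    return {"found": int(result.get("numFound", 0)), "rows": rows}


recordset = solr_to_recordset(SAMPLE_RESPONSE)
```

Because the rows come back with the same column names the application already expected from Verity, the calling code does not need to know which engine produced them, which is what makes the implementation swap painless.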

I will most likely do a presentation at an upcoming CFUG (ColdFusion User Group) meeting in Minneapolis/St. Paul in the coming months about the transition, and talk about integration points before ColdFusion 9 comes out and makes it all available with a single tag ;)

Long Live Open Source software

3 comments:

Aaron & Nicole Longnion said...

We'd be very interested in more detailed blog posts on how you accomplished this. We are feeling the pains of too many collections in Verity, too, and even with 2 Enterprise CF8 licenses, it won't scale much further.

TIA
http://twitter.com/aqlong

philduba said...

I would second Aaron's suggestion, as I know I've looked into it but didn't get very far.

Code Fusion, LLC said...

I'll post a more detailed topic next. Thanks for the feedback (and stay tuned for more presentation-type material).