Power up Drupal search with Jetty & Apache Solr in Ubuntu 10.04

If you run a big site with loads of visitors, or a normal site on very minimal resources, Drupal's core search can slow your site down significantly. It is also not a very efficient way to search all of your site's content. Don't worry, help is here. Apache Solr takes over your site's search and makes it blazingly fast! In this first part of our Apache Solr walkthrough we are going to install and configure Apache Solr and Jetty for Drupal.

From http://lucene.apache.org/solr/

"Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites."

You can run Solr inside a servlet container such as Tomcat or Jetty to power up your Drupal site. In many cases, however, a Tomcat server is too heavy a solution. In this article we cover installing Apache Solr on Jetty. Jetty is a pure Java HTTP server and servlet container and runs with a smaller footprint than Tomcat. We are installing Jetty and Apache Solr on an Ubuntu 10.04 server.

First, we need to install Java so we can run Jetty and Apache Solr:

sudo apt-get install openjdk-6-jdk
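
Before moving on, it can be worth confirming that a Java runtime really is on the PATH. A minimal sketch; the function name check_java is only illustrative and not part of any package:

```shell
# check_java is a hypothetical helper: reports whether a java binary is on the PATH.
check_java() {
    if command -v java >/dev/null 2>&1; then
        echo "java found"
    else
        echo "java missing"
        return 1
    fi
}

check_java || echo "install openjdk-6-jdk first"
```

If it reports that java is missing, repeat the apt-get step before continuing.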

Next, we need to fetch the latest version of Apache Solr. Ubuntu 10.04 has a solr-jetty package in its repository, but it is version 1.4.0 and the latest release is 1.4.1. So we are going to fetch the original gzipped tarball and extract it:

cd /tmp
wget ftp://ftp.funet.fi/pub/mirrors/apache.org//lucene/solr/1.4.1/apache-solr-1.4.1.tgz
tar -zxvf apache-solr-1.4.1.tgz
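
An interrupted download leaves a truncated archive that only fails halfway through extraction. One way to catch that is to list the archive contents without unpacking them. This is a sketch, and archive_ok is a made-up helper name:

```shell
# archive_ok is a hypothetical helper: succeeds only if the file is a readable gzipped tar.
archive_ok() {
    tar -tzf "$1" >/dev/null 2>&1
}

if archive_ok /tmp/apache-solr-1.4.1.tgz; then
    echo "archive looks complete"
else
    echo "archive missing or corrupt, download it again"
fi
```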

Let's move Solr to a more sane location:

sudo mv apache-solr-1.4.1 /usr/local/share/solr

Make a backup of the solrconfig.xml and schema.xml files, because we are going to replace them:

cd /usr/local/share/solr/example/solr/conf
sudo mv solrconfig.xml solrconfig.bak && sudo mv schema.xml schema.bak

Install the Apache Solr (apachesolr) Drupal module:

cd /var/www/drupal/sites/all/modules
drush dl apachesolr

Install the required SolrPhpClient library into sites/all/modules/apachesolr. We need revision 22 to get Solr working with Drupal:

cd /var/www/drupal/sites/all/modules/apachesolr
svn checkout -r22 http://solr-php-client.googlecode.com/svn/trunk/ SolrPhpClient

Copy solrconfig.xml and schema.xml from the apachesolr module folder to /usr/local/share/solr/example/solr/conf:

cd /var/www/drupal/sites/all/modules/apachesolr
sudo cp solrconfig.xml schema.xml /usr/local/share/solr/example/solr/conf/.
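
If Solr later starts with the stock example schema instead of the Drupal one, the copy probably did not land where expected. A small sketch for checking that both files match the module's copies; same_file is a made-up helper, and the two directory variables simply restate the paths used in this article:

```shell
# same_file is a hypothetical helper: succeeds when both files exist and are byte-identical.
same_file() {
    cmp -s "$1" "$2" 2>/dev/null
}

MODULE_DIR=/var/www/drupal/sites/all/modules/apachesolr
CONF_DIR=/usr/local/share/solr/example/solr/conf

for f in solrconfig.xml schema.xml; do
    if same_file "$MODULE_DIR/$f" "$CONF_DIR/$f"; then
        echo "$f: up to date"
    else
        echo "$f: differs or missing"
    fi
done
```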

Start the Jetty servlet container that ships in Solr's example directory:

cd /usr/local/share/solr/example
sudo java -jar start.jar

Test the installation. Solr should be reachable at http://localhost:8983/solr/admin/
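
On a server without a browser the same check can be done from the command line. The sketch below assumes curl is installed and that the configuration in use exposes a ping handler under admin/ping; solr_up is a made-up helper name:

```shell
# solr_up is a hypothetical helper: succeeds if the ping handler answers within 5 seconds.
# Assumes curl is available and that admin/ping is enabled in solrconfig.xml.
solr_up() {
    curl --silent --fail --max-time 5 "$1/admin/ping" >/dev/null 2>&1
}

if solr_up "http://localhost:8983/solr"; then
    echo "Solr is answering"
else
    echo "Solr is not reachable"
fi
```

Jetty needs a few seconds to come up, so a failure right after starting it does not necessarily mean something is wrong.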

Make Solr start automatically at boot time. Put the following into /etc/init.d/solr:

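#!/bin/sh -e
# Starts, stops, and restarts solr

# SOLR_DIR and JAVA are used below; the values here assume the paths from
# this article and the default OpenJDK install. Adjust them if yours differ.
SOLR_DIR="/usr/local/share/solr/example"
JAVA="/usr/bin/java"
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=stopkey -jar start.jar"

case $1 in
    start)
        echo "Starting Solr"
        cd "$SOLR_DIR"
        $JAVA $JAVA_OPTIONS &
        ;;
    stop)
        echo "Stopping Solr"
        cd "$SOLR_DIR"
        $JAVA $JAVA_OPTIONS --stop
        ;;
    restart)
        $0 stop
        sleep 1
        $0 start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}" >&2
        exit 1
        ;;
esac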

Make the file executable and add it to the default runlevels:

sudo chmod a+x /etc/init.d/solr
sudo update-rc.d solr defaults
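
A typo in an init script usually only shows up at the next reboot, so it can help to let the shell parse the file first without running it. A minimal sketch using sh -n; script_ok is a made-up helper name:

```shell
# script_ok is a hypothetical helper: parses a shell script with "sh -n"
# (syntax check only, nothing is executed) and succeeds if it is well-formed.
script_ok() {
    sh -n "$1" 2>/dev/null
}

if script_ok /etc/init.d/solr; then
    echo "init script parses cleanly"
else
    echo "init script has a syntax error or is missing"
fi
```

This only checks syntax. Running sudo /etc/init.d/solr start and stop by hand is still the real test.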

Now you just need to enable the module in Drupal and check that the apachesolr module can connect to your Solr installation.

You can find more documentation and instructions on the Apache Solr module's project page. You should also check out useful modules such as Apache Solr Autocomplete and Apache Solr Multilingual, which extend the functionality of the Apache Solr module. In the second part of this Apache Solr walkthrough we are going to cover MultiCore installation and configuration. So stay tuned!

