Sourcefabric Manuals

Newscoop 4.2 Cookbook

Configuring Solr for sysadmins

Solr is the open source enterprise search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document handling, and geospatial search. Using Solr on your Newscoop server takes three steps: install Solr, configure it, and edit your templates.

Solr installation

First, you must install a Java environment. On Debian or Ubuntu GNU/Linux you can do this with the command: 

sudo apt-get install openjdk-6-jre

The installation requirements are listed at http://wiki.apache.org/solr/SolrInstall, and http://wiki.apache.org/solr/SolrJetty gives more background on running Solr with Jetty. In general, the following steps should be enough to get Solr running.

1. Download Solr from http://lucene.apache.org/solr/downloads.html

2. Unpack it in the directory you want to run it from. We will use /var/www/ as the base directory in our example.

$ cp solr-4.1.0.tgz /var/www/
$ cd /var/www/
$ tar -xvzf solr-4.1.0.tgz

3. Copy the Newscoop Solr configuration into the Solr directory.

$ cp -a /var/www/newscoop/example/solr/* /var/www/solr-4.1.0/example/solr/

4. To index languages other than the default, English, edit the file /var/www/solr-4.1.0/example/solr/solr.xml and add a <core> entry for each language you use. The name of each core must be the ISO two-letter language code. Then make a copy of the en folder for each additional core, naming each copy after its core. The entry for the English core looks like this:

<cores adminPath="/admin/cores" defaultCoreName="en" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    <core name="en" instanceDir="en" />
</cores>

5. Run Solr. Change to the example directory (/var/www/solr-4.1.0/example/ in our example) and run:

$ java -jar start.jar
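Step 4 above can be scripted when you need several languages. The following is a minimal sketch, not part of Newscoop: the function names core_entry and add_cores are hypothetical, and SOLR_HOME assumes the /var/www/ layout used in this chapter. It copies the en folder once per language code and prints the matching <core> lines, which you still have to paste into solr.xml yourself.

```shell
#!/bin/sh
# Sketch only: SOLR_HOME assumes the example layout from this chapter.
SOLR_HOME="${SOLR_HOME:-/var/www/solr-4.1.0/example/solr}"

# Print the <core> line for one ISO two-letter language code.
core_entry() {
    printf '    <core name="%s" instanceDir="%s" />\n' "$1" "$1"
}

# For each language code given, copy the "en" folder to a folder named
# after the code (unless it already exists) and print its <core> line.
add_cores() {
    for lang in "$@"; do
        if [ ! -d "$SOLR_HOME/$lang" ]; then
            cp -a "$SOLR_HOME/en" "$SOLR_HOME/$lang"
        fi
        core_entry "$lang"
    done
}
```

For example, add_cores de fr would create the de and fr folders and print two <core> lines. Run it before the English core has been indexed, so that the copies do not carry over any English index data.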

Newscoop setup

All you need to do in Newscoop is to enable Solr in the custom application config file. Open (or create) newscoop/application/configs/parameters/custom_parameters.yml with your editor of choice and add this code:

services:
    search_indexer:
        class:      Newscoop\Search\ArticleIndexer
        arguments:  ["@em", "@search.index"]
        tags:
            -  { name: kernel.event_listener, event: article.delete, method: update }

By default, Newscoop looks for Solr at http://localhost:8983/solr. If your Solr is running on a different address or port, you can override this default by setting the value of solr_server in the file newscoop/application/configs/parameters/custom_parameters.yml:

parameters:
    search:
        solr_server: "http://<host>:<port>/solr"

Settings in this custom_parameters.yml file override the configuration of every environment.
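Before pointing Newscoop at a Solr address, it can help to confirm that the address answers at all. The sketch below is an illustration rather than a Newscoop tool: solr_url and solr_reachable are hypothetical helper names, and the check assumes curl is installed. It only tests that the URL responds without an HTTP error, not that the cores are configured correctly.

```shell
#!/bin/sh
# Build the Solr base URL; the defaults match Newscoop's default address.
solr_url() {
    printf 'http://%s:%s/solr' "${1:-localhost}" "${2:-8983}"
}

# Succeed if the URL answers without an HTTP error status.
# -f: fail on HTTP errors, -s: no progress output, -S: still show errors,
# -L: follow redirects, --max-time: give up after 5 seconds.
solr_reachable() {
    curl -fsSL -o /dev/null --max-time 5 "$1/"
}
```

A typical use would be: solr_reachable "$(solr_url)" && echo "Solr is answering". Pass a host and port to solr_url to test the same address you put in solr_server.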

Next, Newscoop content must be stored in Solr before search can work. In other words, the Solr index needs to be populated. You can run the following command manually to get some data for testing:

$ cd /var/www/newscoop
$ php application/console index:update --env=prod 100

(where 100 is the number of articles to be indexed in this run; set it to the number of articles you want indexed).

In production environments, set up a cron job to run the same command periodically, so that the Solr index picks up new articles and changes to existing articles in Newscoop.
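As an illustration of such a cron job, the sketch below prints a crontab line built from the indexing command shown above. The helper name crontab_line is hypothetical, and the five-minute interval is only an example value, not a Newscoop recommendation; choose an interval and batch size that suit your publication.

```shell
#!/bin/sh
# Print a crontab line that runs the indexer every N minutes.
# $1 = interval in minutes, $2 = Newscoop directory, $3 = articles per run.
crontab_line() {
    printf '*/%s * * * * cd %s && php application/console index:update --env=prod %s\n' "$1" "$2" "$3"
}
```

Running crontab_line 5 /var/www/newscoop 100 prints a line you can add with crontab -e, as a user that is allowed to run the Newscoop console.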

That's it - you have Solr running!
