Sunday, July 19, 2015

Apache Solr: Installation and Creating a Core

In this post, I will briefly show how to install Apache Solr.  I use Solr 4.10.2 for this post and assume Java is already installed on your machine.

Note: I first tried Solr 5.2.1 and found that the solrconfig.xml it creates uses a dynamic schema along with a mechanism for handling unknown fields.  To leverage my existing knowledge, I decided to use version 4.10.2 instead.

I. Installation.
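1.  Download Apache Solr and unzip it.  The unzipped directory is referred to as $SOLR_HOME in the rest of this post.  [Image: initial directory structure of the unzipped Solr distribution]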

2. Create a core for this address search.

     $ cd example/solr
     $ mkdir addressDB
     $ mkdir addressDB/conf

Now, create a core.properties file under the addressDB directory.  The file contains the single line shown below.  'addressDB' is the name of the core we will use.

    name=addressDB
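
Step 2 can also be done in one go. The sketch below is only one way to do it; it assumes SOLR_HOME points at the unzipped Solr directory and falls back to the current directory when the variable is not set.

```shell
# Assumption: SOLR_HOME is the unzipped Solr 4.10.2 directory.
# If it is not set, the current directory is used instead.
SOLR_HOME="${SOLR_HOME:-$PWD}"
CORE_DIR="$SOLR_HOME/example/solr/addressDB"

# Create the core directory and its conf subdirectory in one step.
mkdir -p "$CORE_DIR/conf"

# Write the single line that names the core.
printf 'name=addressDB\n' > "$CORE_DIR/core.properties"

# Print the file for a quick visual check.
cat "$CORE_DIR/core.properties"
```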

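3.  Create the necessary files under the addressDB/conf directory.  At this point, solrconfig.xml is a required file, and it can be copied from $SOLR_HOME/example/solr/collection1/conf/solrconfig.xml

The copied solrconfig.xml contains a configuration for the Query Elevation Component, which refers to an elevate.xml file that we have not created.  For now, comment out the following elements in the file.  XML comments cannot be nested, so remove the two inner comments or comment out the elements in separate blocks.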

  <searchComponent name="elevator" class="solr.QueryElevationComponent" >
    <!-- pick a fieldType to analyze queries -->
    <str name="queryFieldType">string</str>
    <str name="config-file">elevate.xml</str>
  </searchComponent>

  <!-- A request handler for demonstrating the elevator component -->
  <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="df">text</str>
    </lst>
    <arr name="last-components">
      <str>elevator</str>
    </arr>
  </requestHandler>

Also, copy all 'admin-*.html' files from $SOLR_HOME/example/solr/collection1/conf to $SOLR_HOME/example/solr/addressDB/conf
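
The two copy steps can be wrapped in a small helper. This is a sketch, not part of Solr: copy_core_conf is a hypothetical function name, and it only assumes the source conf directory contains solrconfig.xml.

```shell
# copy_core_conf SRC_CONF DST_CONF  (hypothetical helper, not a Solr command)
# Copies solrconfig.xml and any admin-*.html files from SRC_CONF to DST_CONF.
copy_core_conf() {
    src="$1"
    dst="$2"
    if [ ! -f "$src/solrconfig.xml" ]; then
        echo "solrconfig.xml not found in $src" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    cp "$src/solrconfig.xml" "$dst/" || return 1
    # If the glob matches nothing, the loop sees the literal pattern,
    # so check that each entry is a real file before copying it.
    for f in "$src"/admin-*.html; do
        if [ -f "$f" ]; then
            cp "$f" "$dst/" || return 1
        fi
    done
    return 0
}

# Example call, using the paths from the steps above:
# copy_core_conf "$SOLR_HOME/example/solr/collection1/conf" "$SOLR_HOME/example/solr/addressDB/conf"
```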
     
After these steps, the addressDB directory contains core.properties and a conf directory that holds solrconfig.xml and the copied admin-*.html files.

4.  From $SOLR_HOME, run the following command to start Solr.  The default port number is 8983.
    $ bin/solr start

You should be able to open the Solr admin web UI at localhost:8983/solr

The Solr server starts correctly, but the addressDB core reports an initialization error because there is no schema.xml file yet.  This file will be created in the next post.
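
The same information can be fetched from the command line through the CoreAdmin STATUS action. The sketch below assumes curl is installed; solr_cores_status_url and solr_cores_status are hypothetical helper names, and the host and port default to the values used in this post. With the server running, the JSON response should list the addressDB failure under initFailures.

```shell
# solr_cores_status_url [HOST] [PORT]  (hypothetical helper)
# Builds the CoreAdmin STATUS URL; defaults to localhost:8983.
solr_cores_status_url() {
    printf 'http://%s:%s/solr/admin/cores?action=STATUS&wt=json\n' "${1:-localhost}" "${2:-8983}"
}

# solr_cores_status [HOST] [PORT]  (hypothetical helper)
# Fetches the status of all cores as JSON; requires a running Solr and curl.
solr_cores_status() {
    curl -s "$(solr_cores_status_url "$@")"
}

# Example call, once 'bin/solr start' has been run:
# solr_cores_status
```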

5. Run the following command to stop the server.
    $ bin/solr stop -all

6. By default, the command 'bin/solr start' uses a maximum heap size of 512 MB.  To increase it, specify the size with the -m option.  ex:  $ bin/solr start -m 1024M

7.  There is an example directory named 'collection1' under the $SOLR_HOME/example/solr directory.  At this point, rename the core.properties file inside 'collection1' to 'core.propertiesTemp' so that this core is no longer auto-detected.  This is not required, but otherwise the configuration in my later posts may conflict with collection1.
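
The rename can be scripted as follows. This is a minimal sketch; disable_core is a hypothetical helper name, and the 'Temp' suffix simply follows the naming used in this post.

```shell
# disable_core CORE_DIR  (hypothetical helper, not a Solr command)
# Renames CORE_DIR/core.properties to core.propertiesTemp so that
# Solr no longer auto-detects the directory as a core.
disable_core() {
    props="$1/core.properties"
    if [ -f "$props" ]; then
        mv "$props" "${props}Temp"
    else
        echo "no core.properties in $1" >&2
        return 1
    fi
}

# Example call, using the path from the step above:
# disable_core "$SOLR_HOME/example/solr/collection1"
```

To bring the core back later, rename the file to core.properties again and restart Solr.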


II.  Files You Should Know

These files are located at this repository.
1.  solr.xml
This file is located in the $SOLR_HOME/example/solr directory.  It defines properties related to the host, logging, sharding, and SolrCloud.  It is worth opening this file to see what it contains.  We will not change it at this point.

2.  solrconfig.xml
This file is located in the $SOLR_HOME/example/solr/addressDB/conf directory.

This file contains a lot of important configuration data.  You should at least open it and look through its contents.  The example file has a well documented description for each configuration.  The file name can be changed in core.properties, but we will keep the default name, solrconfig.xml

3.  core.properties
This file is located in the $SOLR_HOME/example/solr/addressDB directory.  It can have several properties, but the only one we set is the 'name' property.  In our case, the file has the single line shown below.

     name=addressDB

'addressDB' is the name of the core you will use.

If this file exists but has no name value, the core name defaults to the name of the directory that contains the file.  If the file doesn't exist, the core will not be auto-detected.
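
The naming rule above can be sketched as a small shell function. It only mirrors the rule as described in this post and is not how Solr itself is implemented; core_name is a hypothetical helper name.

```shell
# core_name CORE_DIR  (hypothetical helper that mirrors the rule above)
# Prints the name a core directory would get, or fails when the
# directory has no core.properties and so would not be auto-detected.
core_name() {
    dir="${1%/}"                      # drop a trailing slash, if any
    props="$dir/core.properties"
    [ -f "$props" ] || return 1
    # Take the value of the first "name=" line, if there is one.
    name=$(sed -n 's/^name=//p' "$props" | head -n 1)
    if [ -n "$name" ]; then
        printf '%s\n' "$name"
    else
        # No name value: fall back to the directory name.
        basename "$dir"
    fi
}

# Example call, using the path from this post:
# core_name "$SOLR_HOME/example/solr/addressDB"
```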

4.  schema.xml
This file defines the structure of your index, including the fields and their field types.  It should also be located in the $SOLR_HOME/example/solr/addressDB/conf directory, but it has NOT been created by any of the steps above.  I will talk about this file more when I create and run a Solr index.

