Sunday, February 14, 2016

Solr: autoCommit, openSearcher and data persistence

In my previous post, I talked about data import and its performance.  At the end of that post, I briefly mentioned autoCommit and autoSoftCommit.  In this post, let's look at them in more detail.

1.  commit vs. softCommit
A softCommit makes documents searchable, but it doesn't persist them to disk.  A commit (a "hard" commit) persists documents to disk, and it is more expensive than a softCommit.

2.  searcher
In Solr, queries are processed by a searcher, and there is only one "active" searcher at a time.  When you add new documents, they will not be searchable until a new searcher, one that points to the index data containing the new documents, becomes the "active" searcher.  Opening a new searcher involves several expensive steps, depending on your configuration.
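To make the distinction between "persisted" and "searchable" concrete, here is a minimal toy model in Python.  It is only a sketch of the behavior described above, not how Solr is implemented; the class ToyIndex and all of its method names are made up for this illustration.

```python
class ToyIndex:
    """Toy model of commit visibility. Hypothetical, not Solr's implementation."""

    def __init__(self):
        self.pending = []   # documents added but not yet hard-committed
        self.on_disk = []   # documents persisted by a hard commit
        self.visible = []   # documents the "active" searcher can see

    def add(self, doc):
        self.pending.append(doc)

    def hard_commit(self, open_searcher):
        # Persist pending documents to disk.
        self.on_disk.extend(self.pending)
        self.pending.clear()
        # Only a new searcher makes the persisted documents searchable.
        if open_searcher:
            self.visible = list(self.on_disk)

    def soft_commit(self):
        # Make everything searchable without persisting the pending documents.
        self.visible = self.on_disk + self.pending

    def search_all(self):
        return list(self.visible)
```

In this model, adding a document and calling hard_commit(open_searcher=False) leaves the document in on_disk while search_all() still returns nothing.  Calling soft_commit() after adding another document makes both searchable, but only the first one is in on_disk.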


In solrconfig.xml, we see the following configuration.

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>

When you start Solr with the command "java -jar start.jar" from the example directory of the Solr installation, the properties solr.autoCommit.maxTime and solr.autoSoftCommit.maxTime are not defined, so the default values after the colon are used.  Therefore, the autoCommit runs every 15 seconds (15000 ms) while documents are being added, and the autoSoftCommit never runs (-1).
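The ${name:default} placeholders above fall back to the value after the colon when the property is not set.  The following Python sketch mimics that substitution rule so you can see which values end up in effect.  The function resolve is a hypothetical helper written for this post, not part of Solr; as far as I know, Solr itself takes these values from Java system properties (for example, -Dsolr.autoCommit.maxTime=60000 on the command line).

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, props):
    """Replace each ${name:default} in text using props, else the default."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)  # no value and no default

    return _PLACEHOLDER.sub(substitute, text)
```

With no properties defined, resolve("${solr.autoCommit.maxTime:15000}", {}) gives "15000" and resolve("${solr.autoSoftCommit.maxTime:-1}", {}) gives "-1".  Passing {"solr.autoCommit.maxTime": "60000"} as props overrides the first default.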

Let's confirm these commit times.  This is the console output after I start Solr.

3.  openSearcher
In solrconfig.xml, the default value of the openSearcher element is false.  Let's delete the index data directory and run the full import process.  You will see the following output every 15 seconds on your console.

When the autoCommit runs, it starts with "openSearcher=false", as you can see on the console.  If you check the size of the index data directory, you will notice that it keeps growing.  Nevertheless, a search query such as http://localhost:8983/solr/addressDB/select?q=*%3A* returns no documents, because the current "active" searcher cannot see the new documents.
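The visibility check described above can be scripted instead of run from the browser.  This is a minimal sketch that assumes Solr is listening on localhost:8983 with the addressDB core from this post.  The helper names are my own, and the rows=0 and wt=json parameters are additions so that the response is a small JSON document containing only the count.

```python
import json
import urllib.parse
import urllib.request


def select_url(base, core, query="*:*"):
    """Build a select URL that asks only for the match count."""
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"{base}/solr/{core}/select?{params}"


def num_found(response):
    """Extract the number of matching documents from a parsed JSON response."""
    return response["response"]["numFound"]


def count_visible(base="http://localhost:8983", core="addressDB"):
    """Ask the running Solr server how many documents are searchable now."""
    with urllib.request.urlopen(select_url(base, core)) as reply:
        return num_found(json.load(reply))
```

Calling count_visible() repeatedly during the import should keep returning the same count while the autoCommit runs with openSearcher=false, even though the index directory is growing.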

Now terminate the Solr server before the full import is completed, and then start it again.  If you run the same query with q=*:*, you will see the documents that the autoCommit persisted to disk, because the restarted server opens a new searcher on the persisted index.
