Showing posts with label Solr. Show all posts

Thursday, May 30, 2013

How Does Solr Sort Documents When Their Score Is the Same

We have had cases where the same keyword search returns results in a seemingly random order.

On digging deeper, it seems that when documents have the same score, Solr orders them by their internal DocId, that is, by the order in which the documents were indexed.

To demonstrate this, I used a very simple schema:

I added a few documents like this:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


When doing a simple query, 


I get the following documents, in the order in which they were added:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


This indicates that, for equal scores, the result order follows the order in which the documents were added to Solr.
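The behaviour can be mimicked outside Solr with a small Python sketch. It assumes that every document received the same score (the value 1.0 is made up) and that the position in the list stands in for the internal DocId; neither value comes from a real Solr response.

```python
# Documents in the order they were indexed; the list position stands in
# for Solr's internal DocId (an assumption made for this illustration).
docs = [
    {"id": "201", "rank_id": 20111, "name": "Perl Harbor"},
    {"id": "2022", "rank_id": 20111, "name": "Perl Harbor"},
    {"id": "1", "rank_id": 20, "name": "Perl Harbor"},
    {"id": "2", "rank_id": 21, "name": "Perl Harbor"},
]

# Hypothetical identical score for every match: (score, doc_id, document).
scored = [(1.0, doc_id, doc) for doc_id, doc in enumerate(docs)]


def rank(hits):
    """Order hits by score (highest first), then by internal DocId."""
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))


result_ids = [doc["id"] for _, _, doc in rank(scored)]
print(result_ids)  # ['201', '2022', '1', '2']
```

With all scores tied, only the second sort key matters, so the output matches the order in which the documents were added.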

You can see the same question being asked here
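If a predictable order matters more than index order, one option is to give Solr an explicit tie-breaker through its `sort` request parameter. The sketch below only builds the request URL and does not contact a server. The host, port, core name (`collection1`) and the query `name:Harbor` are assumptions, as is the idea that `id` is the unique key and can be sorted on.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, tie_breaker):
    """Build a /select URL that sorts by score, then by a tie-breaker field."""
    params = {
        "q": query,
        # The sort parameter accepts several comma-separated clauses;
        # later clauses only apply when the earlier ones are tied.
        "sort": f"score desc, {tie_breaker} asc",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical base URL and query, chosen only for illustration.
url = build_select_url("http://localhost:8983/solr/collection1", "name:Harbor", "id")
print(url)
```

Because `id` is a string field in the example documents, the tie-breaker would order the values lexicographically rather than numerically.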

Saturday, October 27, 2012

Solr Schema Changes - Breaking Vs Non-Breaking



There are times when a Solr-based application needs to be extended by adding new fields, updating existing field definitions, or deleting existing fields.

Whenever we run into these scenarios, one of the most important questions to answer is: does the change require the existing index to be deleted and recreated, or is it as simple as updating the schema without any deletion and re-indexing?

If a change requires the index to be deleted and all docs to be re-indexed, I call it a breaking change. One such case is when a field's omitNorms setting is switched between true and false. This affects all documents, and unless all docs are deleted and then re-indexed, the index will still hold the older information.

Changes like adding a new field or deleting a field are easy to deal with. They do not require all docs to be deleted; Solr handles them nicely, and all newly added docs will follow the new schema. This is what I refer to as a non-breaking change.
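For a breaking change, the delete-and-reindex cycle might look roughly like the sketch below. It only prepares Solr XML update messages and does not send them anywhere; the helper names and the sample document are illustrative assumptions.

```python
from xml.sax.saxutils import escape, quoteattr


def delete_all_message():
    """XML update message that deletes every document in the index."""
    return "<delete><query>*:*</query></delete>"


def add_message(doc):
    """XML update message that (re-)adds one document given as a dict."""
    fields = "".join(
        f"<field name={quoteattr(str(name))}>{escape(str(value))}</field>"
        for name, value in doc.items()
    )
    return f"<add><doc>{fields}</doc></add>"


# After the new schema has been loaded, the messages would be POSTed in
# this order to the update handler: wipe the index, re-add, then commit.
messages = [
    delete_all_message(),
    add_message({"id": "1", "name": "Perl Harbor"}),  # hypothetical sample doc
    "<commit/>",
]
for message in messages:
    print(message)
```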
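The rule of thumb above can be written down as a small check. This is a simplified sketch, not something Solr provides: the function name, the dictionary layout and the type names are made up, and it assumes that any change to an existing field's attributes counts as breaking.

```python
def classify_schema_change(old_fields, new_fields):
    """Return 'breaking' if an existing field's definition changed.

    Both arguments map a field name to a dict of its attributes.
    Fields that were only added or removed are treated as non-breaking.
    """
    shared = old_fields.keys() & new_fields.keys()
    for name in shared:
        if old_fields[name] != new_fields[name]:
            return "breaking"
    return "non-breaking"


# Hypothetical field definitions for illustration.
old = {"name": {"type": "text", "omitNorms": False}}
added = {**old, "rank_id": {"type": "int"}}
changed = {"name": {"type": "text", "omitNorms": True}}

print(classify_schema_change(old, added))    # non-breaking
print(classify_schema_change(old, changed))  # breaking
```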

I hope this clarifies some of the questions people may have about the impact of making changes to the Solr schema.