Thursday, May 30, 2013

How Does Solr Sort Documents When Their Scores Are the Same

We have seen cases where the same keyword search returns results in a seemingly random order.

On digging deeper, it seems that when documents have the same score, Solr orders them by their internal doc ID, which in practice reflects the order in which the documents were indexed.
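The tie-breaking rule described above can be sketched in a few lines of Python. This is only a simulation of the idea, not Solr's actual code: the `Hit` type, the `rank` function and the score value 1.0 are made up for illustration, and the internal doc ID is assumed to simply follow insertion order.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    internal_doc_id: int  # assumed to follow the order of indexing
    score: float          # relevance score (made-up value below)
    unique_id: str        # the "id" field of the document


def rank(hits):
    # Higher score first; equal scores fall back to the lower internal doc id.
    return sorted(hits, key=lambda h: (-h.score, h.internal_doc_id))


# The four ids used in this post, in the order they were added.
added = ["201", "2022", "1", "2"]
hits = [Hit(i, 1.0, uid) for i, uid in enumerate(added)]

# Feed the hits in reverse to show the result does not depend on input order.
ordered = rank(list(reversed(hits)))
result_ids = [h.unique_id for h in ordered]
print(result_ids)  # ['201', '2022', '1', '2']
```

Because every score ties, the only thing left to decide the order is the internal doc ID, so the ids come back in insertion order.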

To demonstrate this, I used a very simple schema:

I added a few documents like this:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


When I run a simple query, I get the following documents, in the order in which they were added:

<doc>
<str name="id">201</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2022</str>
<int name="rank_id">20111</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">1</str>
<int name="rank_id">20</int>
<str name="name">Perl Harbor</str>
</doc>

<doc>
<str name="id">2</str>
<int name="rank_id">21</int>
<str name="name">Perl Harbor</str>
</doc>


This indicates that, when scores tie, the order is based on the order in which the documents were added to Solr.
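If a predictable order matters, one option is to name an explicit tie-breaker in Solr's `sort` parameter instead of relying on index order. The sketch below only builds the query string. The helper name `build_select_params` and the query `name:"Perl Harbor"` are assumptions for illustration, since the exact query used in this post is not shown, and it assumes `id` is a sortable single-valued field.

```python
from urllib.parse import urlencode


def build_select_params(query, tie_breaker_field="id"):
    # Sort by score first, then by a stable field so that ties
    # no longer depend on the order of indexing.
    return urlencode({
        "q": query,
        "sort": f"score desc,{tie_breaker_field} asc",
    })


# Assumed query; append the result to the core's /select URL.
params = build_select_params('name:"Perl Harbor"')
print(params)
```

Note that `id` is a string field in the documents above, so `id asc` would sort lexicographically (1, 2, 201, 2022) rather than numerically.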

You can see the same question being asked here
