Knowledge: Sysero Search Lucene relevance scoring
Title: Sysero Search Lucene relevance scoring
Manual: Administration
Manual Level Two: Data Rooms
Manual Level Three: Search
Created: 27/03/2018
Searching options
Please be aware that some commands may be applied automatically, depending on your system configuration.
""
Use quotes to match the exact search term.
AND
Matches all words, for example "one AND two" will only match documents that contain both "one" and "two". "&&" performs the same function.
NOT
Forces the following term to be excluded, for example "Sysero NOT legal" will search for the word "Sysero" but exclude anything with the term "legal". "!" and "-" perform the same function.
*
Multiple-character wildcard. Matches any number of characters at that position, for example "automat*" will return results for "automate", "automation" and "automations".
~
Proximity search. Searches for two terms within a certain word count of one another, for example "Sysero legal"~5 specifies that "Sysero" and "legal" must be within 5 words of one another.
^
Boosts a term, so that matches on that term count for more in the relevance score. For example "Sysero^5 legal" will specify that "Sysero" is 5 times more relevant than "legal" in the results.
()/Grouping
Use brackets to group query terms, for example "(Sysero OR legal) AND ship".
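As an illustration of how the operators above combine, here is a minimal Python sketch that only assembles query text. The helper names (phrase, proximity, boost, exclude, group_any) are hypothetical and are not part of Sysero or Lucene. The resulting strings would still need to be typed or passed into the search box as usual.

```python
def phrase(text: str) -> str:
    """Wrap text in double quotes so it is matched exactly."""
    return f'"{text}"'


def proximity(first: str, second: str, distance: int) -> str:
    """Require two words to appear within `distance` words of each other."""
    return f'"{first} {second}"~{distance}'


def boost(term: str, factor: int) -> str:
    """Mark a term as `factor` times more relevant than unboosted terms."""
    return f"{term}^{factor}"


def exclude(term: str, unwanted: str) -> str:
    """Search for `term` but exclude anything containing `unwanted`."""
    return f"{term} NOT {unwanted}"


def group_any(*terms: str) -> str:
    """Group alternatives in brackets, joined with OR."""
    return "(" + " OR ".join(terms) + ")"


if __name__ == "__main__":
    # Each line prints one of the example queries from the list above.
    print(proximity("Sysero", "legal", 5))
    print(boost("Sysero", 5) + " legal")
    print(exclude("Sysero", "legal"))
    print(group_any("Sysero", "legal") + " AND ship")
```

Keeping each operator in its own small function makes it easier to see which part of a longer query is responsible for which behaviour.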
Relevance scoring is calculated from several different factors:
- How often a term appears in a document (note that here a “term” is a single word, unless words are joined with Boolean operators such as “AND” or quotes are used to join multiple words into a single term).
- How often the term appears across the index as a whole. This is an inverse value: the more frequently a word occurs across the entire index, the less likely it is to be helpful or produce relevant results.
- The number of terms in the query that were found in the document. Take the example where you provided 8 terms. A document containing all 8 terms will be scored higher than a document containing only 6 of them, all other factors being equal.
- The number of terms in the document itself.
The summary is:
- Documents containing all the search terms are better
- Matches on rare words are better than common words
- Long documents are less relevant than shorter ones
- Documents which mention the search terms many times score higher
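The four factors above can be sketched as a small TF-IDF style calculation. This is a simplified illustration under stated assumptions, not the formula your Sysero system necessarily uses: the square-root term frequency, the logarithmic inverse document frequency, the coordination fraction and the length normalisation are one common textbook formulation, and the function name score and the sample documents are made up for this example.

```python
import math


def score(query_terms, doc_tokens, corpus):
    """Illustrative relevance score for one document.

    query_terms: list of single-word terms in the query
    doc_tokens:  list of words in the document being scored
    corpus:      list of documents (each a list of words), i.e. the index
    """
    if not query_terms or not doc_tokens:
        return 0.0

    num_docs = len(corpus)
    # Factor 4: longer documents are normalised down.
    length_norm = 1.0 / math.sqrt(len(doc_tokens))

    matched = 0
    total = 0.0
    for term in query_terms:
        freq = doc_tokens.count(term)
        if freq == 0:
            continue
        matched += 1
        # Factor 1: how often the term appears in this document.
        tf = math.sqrt(freq)
        # Factor 2: inverse of how common the term is across the index.
        doc_freq = sum(1 for doc in corpus if term in doc)
        idf = 1.0 + math.log(num_docs / (doc_freq + 1))
        total += tf * idf * idf * length_norm

    # Factor 3: fraction of the query terms found in the document.
    coord = matched / len(query_terms)
    return coord * total


if __name__ == "__main__":
    # Made-up sample index for demonstration only.
    index = [
        ["sysero", "legal"],
        ["sysero", "ship"],
        ["sysero", "sysero"],
        ["sysero", "legal", "ship", "data", "room", "search"],
    ]
    for doc in index:
        print(doc, round(score(["sysero", "legal"], doc, index), 3))
```

Running the sketch against the sample index ranks the short document containing both query terms first, which matches the summary points above.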