Knowledge: Sysero Search Lucene relevance scoring
    Title: Sysero Search Lucene relevance scoring
    Manual: Administration
    Manual Level Two: Data Rooms
    Manual Level Three: Search
    Created: 27/03/2018
    Detail

    Searching options

    Please be aware that some of these commands may be applied automatically, depending on your system configuration.
    ""
    Use quotes to match the exact search term.
    AND
    Requires every word to be present. For example, "one AND two" matches only documents that contain both "one" and "two". "&&" performs the same function.
    NOT
    Excludes the term that follows. For example, "Sysero NOT legal" searches for the word "Sysero" but excludes any document containing the term "legal". "!" and "-" perform the same function.
    *
    Multiple-character wildcard: stands for any number of characters at that position. For example, "automat*" will return results for "automate", "automation" and "automations".
    ~
    Proximity search. Searches for two terms within a certain word count of one another. For example, "Sysero legal"~5 specifies that "Sysero" and "legal" must be within 5 words of one another.
    ^
    Boosts a term, so that matches on it count for more in the relevance score. For example, "Sysero^5 legal" gives "Sysero" a boost factor of 5, making it more relevant than "legal" in the results.
    ()/Grouping
    Use brackets to group query terms, for example "(Sysero OR legal) AND ship".
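
    The behaviour of the operators above can be illustrated with a small Python sketch. It is not how Sysero or Lucene evaluates queries; it only mimics the described semantics on plain text. All function names are hypothetical, and the proximity check is an approximation that compares word positions.

```python
import fnmatch
import re


def tokens(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_all(text, *words):
    """AND: every listed word must be present."""
    found = set(tokens(text))
    return all(w.lower() in found for w in words)


def has_without(text, wanted, excluded):
    """NOT: 'wanted' must be present and 'excluded' must be absent."""
    found = set(tokens(text))
    return wanted.lower() in found and excluded.lower() not in found


def has_wildcard(text, pattern):
    """*: at least one word matches the pattern, e.g. 'automat*'."""
    return any(fnmatch.fnmatchcase(t, pattern.lower()) for t in tokens(text))


def within(text, first, second, distance):
    """~: the two words occur at most 'distance' word positions apart."""
    words = tokens(text)
    a = [i for i, t in enumerate(words) if t == first.lower()]
    b = [i for i, t in enumerate(words) if t == second.lower()]
    return any(abs(i - j) <= distance for i in a for j in b)


doc = "Sysero provides legal document automation"
print(has_all(doc, "Sysero", "legal"))      # AND
print(has_without(doc, "Sysero", "legal"))  # NOT
print(has_wildcard(doc, "automat*"))        # *
print(within(doc, "Sysero", "legal", 5))    # ~
```

    Brackets then correspond to ordinary Python grouping, for example (has_all(doc, "Sysero") or has_all(doc, "legal")) and has_all(doc, "ship").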

    The relevance score is calculated from several different factors:

    • How often a term appears in a document. Note that here a “term” is a single word, unless words are joined with Boolean operators such as “AND”, or quotes are used to join multiple words into a single term.
    • How often the term appears across the index as a whole. This is an inverse value: the more frequently a word occurs across the entire index, the less likely it is to be helpful or produce relevant results.
    • The number of terms in the query that were found in the document. For example, suppose you provide 8 terms. A document containing all 8 will be scored higher than a document containing only 6 of them, all other factors being equal.
    • The number of terms in the document itself.
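
    The four factors above can be combined as in the following Python sketch. It is a simplified model based on Lucene's classic TF-IDF similarity and is an assumption about the general approach, not the exact formula used by any particular Sysero or Lucene version (query normalisation and boosts are left out, and newer Lucene releases default to a different similarity). The function name score and the sample corpus are hypothetical.

```python
import math


def score(query_terms, doc_tokens, all_docs):
    """Simplified TF-IDF style relevance score for one document.

    query_terms: list of words in the query
    doc_tokens:  list of words in the document being scored
    all_docs:    list of token lists, one per document in the index
    """
    if not query_terms or not doc_tokens:
        return 0.0
    n_docs = len(all_docs)
    matched = 0
    total = 0.0
    for term in query_terms:
        freq = doc_tokens.count(term)
        if freq == 0:
            continue
        matched += 1
        # Factor 1: how often the term appears in this document.
        tf = math.sqrt(freq)
        # Factor 2: inverse of how common the term is across the index.
        doc_freq = sum(1 for d in all_docs if term in d)
        idf = 1.0 + math.log(n_docs / (doc_freq + 1))
        total += tf * idf * idf
    # Factor 3: share of the query terms that were found in the document.
    coord = matched / len(query_terms)
    # Factor 4: longer documents are scaled down.
    length_norm = 1.0 / math.sqrt(len(doc_tokens))
    return coord * length_norm * total


corpus = [
    ["sysero", "legal"],
    ["sysero", "search"],
    ["lucene", "search"],
    ["sysero", "legal", "extra", "words"],
]
for doc in corpus:
    print(doc, round(score(["sysero", "legal"], doc, corpus), 3))
```

    In this model a document matching both query terms outranks one matching a single term, and the shorter of two otherwise identical matches ranks first.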

    The summary is:

    • Documents containing all the search terms score higher
    • Matches on rare words count for more than matches on common words
    • Long documents are less relevant than shorter ones
    • Documents which mention the search terms many times score higher