Knowledge: Configuring OCR for indexing scanned PDF file text
    Manual Level Two: Search
    OCR text extraction settings can be managed under Admin -> System -> OCR settings.

    Please note that OCR processing is not enabled by default and must be switched on. It also does not run at the moment a document is uploaded: uploaded documents that require OCR are added to the Index queue instead. Index queue processing can be started manually on the System page under "Search settings", or run automatically at regular intervals using the job scheduler.
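    The behaviour described above (OCR disabled by default, uploads only queueing work, processing triggered later) can be pictured with a minimal Python sketch. This is an illustration only, not the product's implementation: the names Document, IndexQueue, upload and process are hypothetical, and the OCR step itself is replaced by a stand-in flag.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    needs_ocr: bool        # e.g. a scanned PDF with no text layer
    indexed: bool = False


@dataclass
class IndexQueue:
    """Hypothetical model of the Index queue; not a real product API."""
    ocr_enabled: bool = False                      # OCR is off until enabled
    pending: deque = field(default_factory=deque)

    def upload(self, doc: Document) -> None:
        # Uploading never runs OCR; it only queues documents that need it.
        # Documents that do not need OCR are outside the scope of this sketch.
        if doc.needs_ocr:
            self.pending.append(doc)

    def process(self) -> int:
        # Stands in for a manual run or a scheduled job.
        # Returns the number of documents processed in this run.
        if not self.ocr_enabled:
            return 0
        processed = 0
        while self.pending:
            doc = self.pending.popleft()
            doc.indexed = True                     # stand-in for OCR + indexing
            processed += 1
        return processed
```

    In this sketch, calling process() while ocr_enabled is False leaves the queue untouched, which mirrors why newly uploaded scans are not searchable until OCR is enabled and the queue has been run.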

    Once OCR is enabled and running, several additional options on the System page mentioned above can improve recognition accuracy. If you find that document text recognition isn't sufficiently accurate, try enabling some of the options below:

    • Page count - by default the process will run on the first 10 pages of each document but this can be increased as required at the expense of processing time.
    • Thread count - increasing this can improve performance, but it also increases the hardware resources needed to achieve that result.
    • Auto denoise - this will add significant processing time but is useful for documents with markings or clutter obscuring the text.
    • Detect areas - this will add significant processing time as well as increase hardware requirements for the process to run, but will dramatically increase accuracy.
    • Auto contrast - this is useful when the text and background are similar in colour, making the text harder to make out.
    • Auto skew - this is useful when documents have been scanned at an angle.
    • Upscale font - this is useful when documents use a very small font size.

    All of the above settings increase processing time or the resources required to some degree, so it is wise not to enable everything unless absolutely required.
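    One way to follow this advice is to start from the defaults and enable a single option at a time. The Python sketch below models the options listed above as a settings object. It is illustrative only: the class and field names are hypothetical, the default of 10 pages comes from this article, and the other defaults (one thread, every optional feature off) are assumptions.

```python
from dataclasses import dataclass, replace

# Optional accuracy features named in this article (hypothetical field names).
OPTIONAL_FEATURES = (
    "auto_denoise",
    "detect_areas",
    "auto_contrast",
    "auto_skew",
    "upscale_font",
)


@dataclass(frozen=True)
class OcrSettings:
    page_count: int = 10        # first 10 pages by default (from the article)
    thread_count: int = 1       # assumed default
    auto_denoise: bool = False  # assumed off by default, as are the rest
    detect_areas: bool = False
    auto_contrast: bool = False
    auto_skew: bool = False
    upscale_font: bool = False

    def pages_to_process(self, total_pages: int) -> range:
        # Only the first page_count pages of a document are processed.
        return range(1, min(total_pages, self.page_count) + 1)

    def enabled_features(self) -> list[str]:
        return [name for name in OPTIONAL_FEATURES if getattr(self, name)]


def with_feature(settings: OcrSettings, name: str) -> OcrSettings:
    # Return a copy with exactly one extra feature switched on.
    if name not in OPTIONAL_FEATURES:
        raise ValueError(f"unknown feature: {name}")
    return replace(settings, **{name: True})
```

    For example, with_feature(OcrSettings(), "auto_skew") gives a configuration that differs from the defaults in one option only, so any change in accuracy or processing time can be attributed to that option.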