Why are there mistakes in the article text?

When we digitise newspapers, we use computers to read the words on the page and convert them into text. This process is called Optical Character Recognition, or OCR. Machines are not as good as humans at reading text, but it would cost far too much and take far too long for people to read and retype all of the newspaper content.

The layout and quality of the newspaper image also affect how accurate the generated text is. When the image is very clear and the type is large, our results are generally very good. If the image is hard for a human to read, a machine will have similar difficulty interpreting the characters correctly.

We stay up to date with the latest advances in OCR technologies and are constantly trying to improve the accuracy of the text that we generate. We also encourage our customers to help others find articles by correcting the text of the newspaper articles that they come across, especially the names of people and places. You can do this directly from the image viewer when you are viewing a page.
