Make downloaded PDFs searchable
The OCR, warts and all, enables pages to be found, but when the PDF is downloaded it is just an image with no underlying OCR text layer. Can the OCR text not be included in the PDF?
We’ll find out how hard this is to do.
-
J commented
Absolutely. It seems ludicrous that the download gives NO IDEA of where you were looking. As it is, it would be better as an image that I could run through PROPER OCR and get a better transcription.
It would need to include the corrections that kind-hearted people have made. The cheap scanner I have lets me save a scan as a searchable PDF. How hard can it be?
How can it take 6 years to review how hard this improvement might be?
-
Josh Renaud commented
I agree, I was very disappointed that my downloaded PDF didn't include the OCRed text. Worse, the JPEG compression in the PDF was bad enough that I couldn't get any of my own OCR software to recognize the text.
Very disappointed in the PDFs.