Downloaded PDFs searchable
The OCR, warts and all, lets pages be found by searching, but the downloaded PDF is just an image with no underlying OCR text layer. Can the OCR text not be included in the PDF?
We’ll find out how hard this is to do.
-
J commented
Absolutely. It seems ludicrous that the download gives NO IDEA of where you were looking. As it is, it would be better as an image that I could run through PROPER OCR and get a better transcription.
It would need to include the corrections that kind-hearted people have made. The cheap scanner I have lets me save a scan as a searchable PDF. How hard can it be?
How can it take 6 years to review how hard this improvement might be?
-
Josh Renaud commented
I agree, I was very disappointed that my downloaded PDF didn't include the OCRed text. Worse, the JPEG compression in the PDF was bad enough that I couldn't get any of my own OCR software to recognize the text.
Very disappointed in the PDFs.