Website performance and OCR
Firstly, I think you are doing a great thing with this archive; good job!
However, my experience of trying to contribute to your project in a small way, by correcting the plethora of errors that your very basic OCR generates in the articles I am researching, has been very frustrating. Reviewing and correcting a few thousand words is taking hours when it should take perhaps only an hour if the website were quicker and the editing interface better designed. In Microsoft Word or similar software I can see all the errors and easily correct them, but the tortuous line-by-line interface on your website, and the fact that each time you save you are taken back to the beginning of the article, make editing a real pain. Worse, during a long editing session the performance of the website drops to a snail's pace, with each character taking a long time to enter.

Personally I think you should spend some money on this because, however extensive the content, if you can't easily read and edit it, the whole experience is frustrating. You should also look at the kind of AI that has been available for a while, to avoid your OCR making repeated errors like "tbe" instead of "the". It just requires some context-sensitive logic, and I'm certain there are solutions available.

Very best of luck with it and I will persevere :)

Simon