Installing/Building Tesseract for Windows 8
Installing the latest release of Tesseract (3.02.02) on Windows 8 is pretty simple, but you'll have more work to do if you want to get the latest "beta" version (3.03) working on Windows. Don't be daunted, however: we've found some easy-to-follow instructions to help you out.
The Tesseract Windows Installer works painlessly as long as you want to use v3.02.02, the latest official release.
- Download the latest released version of the Windows installer for Tesseract
- Run the executable file to install. It will install to C:\Program Files (x86)\Tesseract OCR
- Make sure your TESSDATA_PREFIX environment variable is set correctly:
  - Go to Control Panel -> System -> Advanced System Settings -> Advanced tab -> Environment Variables... button
  - In the System variables window, scroll down to TESSDATA_PREFIX. If it's not right, select it and click Edit...
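If you'd rather check the variable from a script than click through the Control Panel, here is a minimal Python sketch. It assumes (this isn't spelled out above) that TESSDATA_PREFIX should name the folder that contains the tessdata directory, which is how Tesseract 3.x reads it. The function name check_tessdata_prefix is made up for this example.

```python
import os


def check_tessdata_prefix(environ=None):
    """Return (ok, message) describing the TESSDATA_PREFIX setting.

    Assumption: the variable should point at the folder that *contains*
    the "tessdata" directory (the Tesseract 3.x convention).
    """
    environ = os.environ if environ is None else environ
    prefix = environ.get("TESSDATA_PREFIX")
    if not prefix:
        return False, "TESSDATA_PREFIX is not set"
    tessdata = os.path.join(prefix, "tessdata")
    if not os.path.isdir(tessdata):
        return False, "no tessdata folder found under " + prefix
    return True, "TESSDATA_PREFIX looks fine: " + prefix


if __name__ == "__main__":
    ok, message = check_tessdata_prefix()
    print(message)
```

Open a new command prompt before running it, since a prompt that was already open won't see a value you've just edited.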
That's it. Easy, right? However, if you want to try the newest "beta" release of Tesseract, you'll have to build it.
Building Tesseract from the source code on your computer is a lot more involved, and it means downloading and installing more software (assuming you don't already have it) to complete the various steps. It is also the only way (sort of, see Cheating a little below) to get the latest beta release, v3.03, for Windows. After trying several different methods, we finally found a website with some excellent instructions that (mostly) work. A big eMOP shout-out to Paul Vorbach and his blog vorba.ch.
You'll need to have the following installed on Windows to do this:
- TortoiseSVN or some other SVN program.
- Visual Studio 2013 for Windows Desktop (Express is enough). If you have a .edu email address then you should be able to get a free version from Microsoft DreamSpark.
Once those are installed, build Tesseract:
- Follow these excellent directions for building Tesseract 3.03 on Windows 8 from vorba.ch: http://vorba.ch/2014/tesseract-3.03-vs2013.html
- From the Solution Configurations dropdown (at top), select the build type (LIB_Release is best for regular use of Tesseract)
- Right-click on libtesseract303 (right menu) and select Build (files will be put in C:\Tesseract-Build\tesseract-ocr\vs2013\bin\Win32\LIB_Release)
- Right-click on tesseract (right menu) and select Build (files will be put in C:\Tesseract-Build\tesseract-ocr\vs2013\bin\Win32\LIB_Release)
Cheating a little
If all that seems a bit daunting, however, there is another option: you can use the Tesseract Windows Installer to get v3.02.02 and then download the tesseract.exe file that we built following the above directions. Just use it to replace the tesseract.exe file in your current install. I can't guarantee that it will work, but it should.
- Rename your current tesseract.exe file (in C:\Program Files (x86)\Tesseract OCR\) to something like tesseract-3.02.exe
- Download the tesseract.exe file we built for Tesseract 3.03 and copy it to your Tesseract OCR folder.
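The two steps above can also be scripted. This is only a sketch: the function name swap_tesseract_exe and the download path in the usage comment are made up for illustration, and because the install folder lives under Program Files you will most likely need to run it from an elevated (administrator) prompt.

```python
import os
import shutil


def swap_tesseract_exe(install_dir, new_exe, backup_name="tesseract-3.02.exe"):
    """Back up install_dir/tesseract.exe, then copy new_exe into its place.

    Returns the path of the backup. Refuses to run if a backup with the
    same name already exists, so an earlier backup is never overwritten.
    """
    current = os.path.join(install_dir, "tesseract.exe")
    backup = os.path.join(install_dir, backup_name)
    if not os.path.isfile(current):
        raise FileNotFoundError("no tesseract.exe in " + install_dir)
    if not os.path.isfile(new_exe):
        raise FileNotFoundError("new executable not found: " + new_exe)
    if os.path.exists(backup):
        raise FileExistsError("backup already exists: " + backup)
    os.rename(current, backup)      # step 1: keep the 3.02 binary around
    shutil.copy2(new_exe, current)  # step 2: drop in the 3.03 build
    return backup


# Example call (the Downloads path is hypothetical):
# swap_tesseract_exe(r"C:\Program Files (x86)\Tesseract OCR",
#                    r"C:\Users\you\Downloads\tesseract.exe")
```

To go back to 3.02, delete the new tesseract.exe and rename the backup to tesseract.exe again.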
Now you should be using Tesseract v3.03.
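To confirm which version you actually ended up with, you can ask Tesseract itself. The sketch below runs tesseract -v and pulls the version number out of the output. It assumes tesseract is on your PATH (otherwise pass the full path to the .exe) and that the first line of output looks like "tesseract 3.03"; the helper names are ours, not part of Tesseract.

```python
import re
import subprocess


def parse_tesseract_version(text):
    """Extract a version such as '3.03' from 'tesseract 3.03' style output."""
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)+)", text, re.IGNORECASE)
    return match.group(1) if match else None


def installed_tesseract_version(exe="tesseract"):
    """Run `<exe> -v` and return the reported version, or None on failure."""
    try:
        result = subprocess.run(
            [exe, "-v"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some builds print the version to stderr
            universal_newlines=True,
        )
    except OSError:
        # The executable was not found or could not be started.
        return None
    return parse_tesseract_version(result.stdout)


if __name__ == "__main__":
    print(installed_tesseract_version())
```

If this still reports 3.02.02 after the swap, the old executable is probably the one being picked up from your PATH.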