Should the need ever arise to compile Tesseract from SVN (v3.01 at the time of writing), please note the following.
To fetch the source, issue:
bash-3.2$ svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr-read-only
You have to install Leptonica beforehand, either by hand or via MacPorts as I did:
bash-3.2$ sudo port install leptonica
If you want to use autotools and libtool from MacPorts (again, like me), you’ll have to hack the runautoconf script in the Tesseract source directory (‘tesseract-ocr-read-only/’) before running it, so that it calls glibtoolize instead of libtoolize. Rumor has it that the MacPorts maintainers renamed libtoolize to glibtoolize to avoid eclipsing Apple’s /usr/bin/libtoolize (which, conveniently enough, is not compatible with its GNU counterpart). The modified lines in runautoconf follow:
. . .
echo "Running libtoolize"
glibtoolize
. . .
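The same edit can be scripted instead of done by hand. The following is only a sketch: patch_libtoolize is a hypothetical helper name, and it assumes that runautoconf calls libtoolize at the start of a line, which may not match every revision of the script.

```shell
#!/bin/sh
# patch_libtoolize is a hypothetical helper, not part of Tesseract.
# It rewrites bare "libtoolize" invocations to "glibtoolize".
patch_libtoolize() {
    script="$1"
    # Keep a backup so the original script can be restored.
    cp "$script" "$script.orig" || return 1
    # Only lines that start with the command are touched; the
    # echo "Running libtoolize" message is left alone.
    sed 's/^\([[:space:]]*\)libtoolize/\1glibtoolize/' "$script.orig" > "$script"
}

# Usage, from inside the source directory:
#   patch_libtoolize runautoconf
```

Writing the result back over the existing file keeps its executable bit, so ./runautoconf can still be run directly afterwards.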
The next step is to run the modified runautoconf:
bash-3.2$ ./runautoconf
Next, you’ll have to hack the tesseract ./configure script to include the directory where MacPorts installs Leptonica (which is /opt/local/include/leptonica):
. . .
have_lept=no
if test "$LIBLEPT_HEADERSDIR" = "" ; then
  LIBLEPT_HEADERSDIR="/usr/local/include /usr/include /opt/local/include/leptonica"
fi
. . .
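Since the configure snippet only fills in LIBLEPT_HEADERSDIR when it is empty, presetting the variable in the environment might make the edit unnecessary. I have not verified this, so treat it as an assumption. The sketch below uses a hypothetical helper, find_lept_headers, to locate the directory that holds Leptonica's allheaders.h.

```shell
#!/bin/sh
# find_lept_headers is a hypothetical helper: print the first directory
# from the argument list that contains Leptonica's allheaders.h.
find_lept_headers() {
    for dir in "$@"; do
        if [ -f "$dir/allheaders.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (assumption: configure keeps a preset LIBLEPT_HEADERSDIR):
#   LIBLEPT_HEADERSDIR="$(find_lept_headers /opt/local/include/leptonica /usr/local/include /usr/include)" ./configure
```

If the helper prints nothing, Leptonica's headers are in none of the listed directories and configure would most likely fail in the same way.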
If you skip or mess up the previous step, you’ll see the following error when running ./configure:
bash-3.2$ ./configure
checking build system type... i686-apple-darwin9.8.0
.
.
checking for Leffler libtiff library...
checking linking with -ltiff... ok
setting LIBTIFF_CFLAGS=
setting LIBTIFF_LIBS=-ltiff
checking for leptonica... configure: error: leptonica not found
It’s easy to forget to run the runautoconf script before running ./configure.
And, of course, call 'make'.
Note that it is also important to call
sudo make install
in order for the language files to be copied to the location Tesseract expects to find them at.
All looked fine (sort of) until I did the final make install and got this:
$ sudo make install
Making install in ccstruct
/bin/sh ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.. -I../ccutil -I../cutil -I../image -I../viewer -I/opt/local/include -I/opt/local/include/leptonica/. -g -O2 -MT blobbox.lo -MD -MP -MF .deps/blobbox.Tpo -c -o blobbox.lo blobbox.cpp
mv -f .deps/blobbox.Tpo .deps/blobbox.Plo
mv: rename .deps/blobbox.Tpo to .deps/blobbox.Plo: No such file or directory
make[2]: *** [blobbox.lo] Error 1
make[1]: *** [install-recursive] Error 1
make: *** [install-recursive] Error 1
Not sure what blobbox.Plo is, but I do have such a file in my tesseract source directory.
cd temp/ocr/tesseract-ocr-read-only
computer:tesseract-ocr-read-only mcradle$ find ./ -iname "blobbox.Plo"
//ccstruct/.deps/blobbox.Plo
This is the problem: the script wants to rename the Tpo file to the Plo file but can’t find the Tpo. After copying the same Plo to Tpo, it compiles for quite some time.
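The workaround described above could be scripted roughly as follows. restore_tpo is a hypothetical helper name, and the sketch only mirrors the copy step from this comment; it does not explain why the Tpo file went missing in the first place.

```shell
#!/bin/sh
# restore_tpo is a hypothetical helper: if <name>.Tpo is missing from a
# .deps directory but <name>.Plo exists, copy the .Plo back to .Tpo so
# the "mv -f" step in the make rule has something to rename.
restore_tpo() {
    deps_dir="$1"
    name="$2"
    if [ ! -f "$deps_dir/$name.Tpo" ] && [ -f "$deps_dir/$name.Plo" ]; then
        cp "$deps_dir/$name.Plo" "$deps_dir/$name.Tpo"
    fi
}

# Usage, from the top of the source tree:
#   restore_tpo ccstruct/.deps blobbox
```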
saved me some time here, thanks!
Just wondering why I can’t compile in my optware device, saved me! Thx
It seems that in the current SVN checkout of 3.0.2 (revision 725+) runautoconf has been replaced by autogen.sh, and instead of configure there is only configure.ac! The glibtoolize change seems to be reflected in the script!
It seems that in the current SVN checkout (rev 725 of version 3.0.2) the files mentioned above are not present. Instead of configure there is a file named configure.ac (the Autoconf input from which configure is generated), and runautoconf seems to have been replaced by autogen.sh, which contains the ‘echo "Running libtoolize"’ line.
In the lines that follow it, “glibtoolize” seems to be recognized!
If I succeed, I will come back later to document this!
Update (tested on Mac OS X 10.5.8)
As expected: first run ./autogen.sh, as described in the INSTALL docs, to create the “configure” file. There is no runautoconf anymore! It now seems to be enough to hack the tesseract ./configure script to include the location where MacPorts installs Leptonica, by adding “/opt/local/include/leptonica” to LIBLEPT_HEADERSDIR. Then continue as described.
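Because older checkouts ship runautoconf and newer ones ship autogen.sh, a build script may want to detect which one is present. This is a minimal sketch with a hypothetical helper name, pick_bootstrap, and it assumes that exactly these two file names are the ones in play.

```shell
#!/bin/sh
# pick_bootstrap is a hypothetical helper: print the bootstrap script
# present in a source directory, preferring the newer autogen.sh.
pick_bootstrap() {
    src="$1"
    if [ -f "$src/autogen.sh" ]; then
        echo "autogen.sh"
    elif [ -f "$src/runautoconf" ]; then
        echo "runautoconf"
    else
        echo "no bootstrap script found in $src" >&2
        return 1
    fi
}

# Usage, from inside the source directory:
#   "./$(pick_bootstrap .)" && ./configure
```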
Do not forget to set the environment variable for tessdata:
e.g.
export TESSDATA_PREFIX="/usr/local/share/"
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your “tessdata” directory.
and copy the necessary language data from
../tesseract-ocr-read-only/tessdata/[mylanguage].traineddata
to
/usr/local/share/tessdata/
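The copy step can be wrapped in a small function so the tessdata directory is created if it does not exist yet. install_traineddata is a hypothetical helper name, and "eng" in the usage line is only an example language code.

```shell
#!/bin/sh
# install_traineddata is a hypothetical helper: copy one language file
# into <prefix>/tessdata, creating the directory if needed.
install_traineddata() {
    src_tessdata="$1"   # the tessdata directory of the checkout
    lang="$2"           # language code used in the file name
    prefix="$3"         # parent directory of tessdata, e.g. /usr/local/share/
    mkdir -p "$prefix/tessdata" || return 1
    cp "$src_tessdata/$lang.traineddata" "$prefix/tessdata/"
}

# Usage (writing under /usr/local/share may need sudo):
#   export TESSDATA_PREFIX="/usr/local/share/"
#   install_traineddata tesseract-ocr-read-only/tessdata eng "$TESSDATA_PREFIX"
```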
Thanks, I’m also running 10.5.8, and when I go back to working on tesseract your comments will come in handy!
Thanks a lot, you saved my day.