Nonprofit Organization Science Accessibility Net

InftyReader: OCR software for mathematical documents



InftyReader Ver.2.9 series and Ver.3 beta

Main Features

OCR software to recognize scientific documents including mathematical formulae

Various output formats: XML, LaTeX, MathML, HTML, HRTeX and MS Word 2007.
The ABBYY FineReader OCR engine is usable in InftyReader (option).

InftyReader is OCR software that recognizes scientific documents including mathematical formulae and outputs the recognition results in various file formats: LaTeX, MathML, XHTML, HRTeX, IML. It is developed in the laboratory of M. Suzuki, Faculty of Mathematics, Kyushu University, in collaboration with several partners.

* InftyReader Ver.3.0 beta2 (April 30, 2013)

InftyReaderE30Beta2.zip (English Edition, about 48MB) -------- April 30, 2013
Usable until June 30 (Free use is limited to 15 days in total.)
What's new?
How to use it?

InftyReader Ver.2.9.5.2 (April 30, 2013) --- Product version

This is a full setup package of InftyReader.
The installer includes the necessary OCR dictionaries, so users need not install InftyReaderDicKitA or InftyReaderDicKitB.

InftyReaderE2952.zip (English Edition, about 48MB) -------- April 30, 2013
Trial use: Freely usable for 15 days in total.

Below is the one-year license edition:

InftyReaderE2952_Year.zip (English Edition, about 48MB) -------- April 30, 2013
Trial use: Freely usable for 15 days in total.
(The one-year license edition has the same functions as the normal version; only the license period differs.)

FineReader Plug-in for InftyReader (Option)

The FineReader OCR engine is now available in InftyReader Ver.2.9 series for the recognition of ordinary text in documents. To use FineReader in InftyReader, users should purchase a special license of the FineReader engine for InftyReader and install FineReaderDic as follows:

1. Please download FineReaderDic-for-Infty.zip (about 94MB) and follow the instructions in "HowToInatall.txt" included in the package.

2. Please fill in and send the order form, OrderForm-FRDic.txt, and purchase license(s) of the FineReader engine for InftyReader following the instructions written in the order form. The price of a FineReader license is 300 USD per license.

The languages supported by the FineReader plug-in are: Czech, Dutch, English, French, German, Italian, Hungarian, Polish, Romanian, Russian, Slovak, Spanish and Swedish.
FineReader plug-in is not used in the recognition of Japanese language.

[IMPORTANT REMARK]
- Please note that the serial key for the FineReader plug-in is different from the serial number of InftyReader, and is valid on ONLY ONE COMPUTER, while the serial number of InftyReader is valid on two PCs.
- There is NO TRIAL VERSION of FineReader plug-in. We are very sorry that you can use it only after the purchase.
- In InftyReader Ver.2.9.3, Russian recognition is not yet supported.

* Comments about output formats

  1. IML is the default XML file format of the editor "InftyEditor", an authoring tool for math documents developed by InftyProject. InftyEditor provides a very easy user interface to input and edit math expressions together with ordinary text.
    The English edition of InftyEditor is free software. Please see the InftyEditor site.
  2. In XHTML format, mathematical expressions are output using MathML notation.
  3. HR-TeX is a simplified LaTeX-like notation that is easier "to read", designed especially for blind users.

Using InftyEditor, users can correct and edit the recognition results of InftyReader while comparing them with the original images, and convert the results into various formats: LaTeX, PDF, XHTML with MathML, etc.

Please note that InftyReader recognizes only <<Black and White>>, <<Binary>> images carefully scanned at either 600 DPI or 400 DPI. Please be aware that the program fails to run if the input image contains grayscale or color areas, even in part.
Image files have to be prepared in TIFF, GIF, PNG, or BMP format.
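
The requirements above can be sketched as a small pre-flight check run before passing a scan to InftyReader. This is only an illustration and not part of InftyReader: the function name check_scan is hypothetical, the file extensions chosen for each format are an assumption, and the gray levels and DPI are assumed to have been read beforehand with whatever imaging tool you use.

```python
from pathlib import Path

# Formats and resolutions taken from the text above.
# The exact extensions (e.g. both .tif and .tiff) are an assumption.
SUPPORTED_EXTENSIONS = {".tif", ".tiff", ".gif", ".png", ".bmp"}
SUPPORTED_DPI = {400, 600}


def check_scan(filename, gray_levels, dpi):
    """Return a list of problems; an empty list means the scan looks usable.

    gray_levels: the distinct gray levels found in the image
    (0 = black, 255 = white), obtained from an external imaging tool.
    """
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file format")
    if dpi not in SUPPORTED_DPI:
        problems.append("resolution is not 400 or 600 DPI")
    # A binary image contains pure black and pure white only.
    if not set(gray_levels) <= {0, 255}:
        problems.append("image is not binary")
    return problems


print(check_scan("page1.tif", [0, 255], 600))       # []
print(check_scan("page1.jpg", [0, 128, 255], 300))  # three problems reported
```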

* Features

Here are some features of InftyReader Ver.2.8:

  1. It uses the OCR engines of Toshiba Corporation, "ExpressReaderPro", and of MediaDrive Corporation, "WinReader", simultaneously to improve the recognition results of characters in ordinary text areas. (As for the characters and math symbols in formulae, it uses Infty's OCR).
  2. It can recognize tables including math expressions in the cells (provided that the ruled lines are not broken).
  3. It can convert PDF files into LaTeX or XHTML (MathML) including mathematical expressions, except for PDF files containing color or gray images. (Note that InftyReader can process only black and white binary images.)
    It recognizes the page images of PDF files by referring to the text information embedded in the PDF.

    Attention: The original PDF should be of high resolution, equivalent to images scanned at 600dpi. PDF files found on the Web are sometimes of low resolution, at the level of 200dpi images, in order to reduce file size. In such cases, the recognition results will be of very low quality, almost useless!
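
To see why 200dpi is so much worse than 600dpi, the following rough sketch computes how many pixels a page and a single character receive at each resolution. It is plain arithmetic and not something InftyReader does; the US Letter page size and the 12-point type size are assumptions chosen only to give round numbers.

```python
def pixels_for_page(width_in, height_in, dpi):
    """Pixel dimensions of a page of the given size (in inches) at a given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def type_size_px(point_size, dpi):
    """Nominal type size in pixels; 1 point is 1/72 inch."""
    return point_size * dpi / 72


# Assumed example: a US Letter page (8.5 x 11 inches) set in 12-point type.
print(pixels_for_page(8.5, 11, 600))  # (5100, 6600)
print(pixels_for_page(8.5, 11, 200))  # (1700, 2200)
print(type_size_px(12, 600))          # 100.0
```

At 200dpi the same 12-point character is only about 33 pixels tall, and the page as a whole has one ninth of the pixels, so small subscripts and superscripts in formulae are left with very little detail.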

* Caution ---- Important!

  1. Source documents have to be clearly printed.
  2. They should be scanned as "binary" images at 600dpi (or 400dpi).
  3. InftyReader erases small noises, segments page images into picture areas, table areas and text areas automatically, and then recognizes text/table areas including mathematical expressions.
    However, to get better recognition results, users are <<recommended>> to erase noises and pictures before the recognition.
  4. In scanning, it is important to adjust the binarization threshold of the scanner so that the number of touching or broken characters is less than 1% of the total number of characters in each scanned page image.
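
The 1% rule in item 4 can be checked with simple arithmetic after counting the touching or broken characters on a sample page by hand. The sketch below is only an illustration; the function name and the sample counts are hypothetical.

```python
def threshold_is_acceptable(touching_or_broken, total_characters):
    """True if touching or broken characters are less than 1% of the page."""
    if total_characters <= 0:
        raise ValueError("total_characters must be positive")
    # Integer comparison equivalent to touching_or_broken / total < 0.01.
    return touching_or_broken * 100 < total_characters


# Hypothetical counts for a page of 2000 characters.
print(threshold_is_acceptable(12, 2000))  # True  (0.6%)
print(threshold_is_acceptable(30, 2000))  # False (1.5%)
```

If the check fails, rescan the page with a different binarization threshold and count again.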

* Operating Environment

InftyReader runs on Windows 7, Vista, and XP, on a PC with 500MB or more of free memory.

Note that it does not run on Windows 98, Me, or 2000.

* How to use InftyReader?

  1. Select file(s) or folder.
  2. Input/select the output document name.
  3. Press the "Start" button.

Then, the recognition results of the selected image files are saved into the file you specified as the "output document name". When you select a folder instead of files, all the image files of the specified file type (TIF/GIF/PNG/BMP/PDF) in the folder are recognized, and the results are output into files named after the folders.

If you check the "Search Sub Folders" item under the "Option" menu, InftyReader recognizes all the image files in the subfolders of the selected folder. For example, if you select the folder "foldertop" having the subfolder structure below,

    foldertop
    |-- subfolder1
    |        |-- a.tif
    |        |-- b.tif
    |
    |-- subfolder2
             |-- c.tif
             |-- d.tif

and if you select "IML" as the output file type, then you will get the files "subfolder1.iml" and "subfolder2.iml" in the folder "foldertop". The recognition results of a.tif and b.tif are saved in the file subfolder1.iml, and those of c.tif and d.tif in subfolder2.iml.

If you select LaTeX as the output file type, you will get "subfolder1.tex" and "subfolder2.tex", and similarly for the other file types, HR-TeX and XHTML.
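
The naming rule in this example can be sketched as follows. This is not InftyReader's own code, only a hypothetical helper (expected_outputs) that predicts which output files to expect for images one level below the selected folder. Only the ".iml" and ".tex" extensions are taken from the text above; other folder layouts are not described here and are skipped.

```python
from pathlib import PurePosixPath

# Input types listed in the text; output extensions confirmed by the example.
IMAGE_EXTENSIONS = {".tif", ".gif", ".png", ".bmp", ".pdf"}
OUTPUT_EXTENSIONS = {"IML": ".iml", "LaTeX": ".tex"}


def expected_outputs(top_folder, image_paths, output_type):
    """Map each expected output file to the images whose results it collects."""
    top = PurePosixPath(top_folder)
    suffix = OUTPUT_EXTENSIONS[output_type]
    outputs = {}
    for image in sorted(image_paths):
        path = PurePosixPath(image)
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # not a recognized input type
        if path.parent.parent != top:
            continue  # only direct subfolders are covered by the example
        out = str(top / (path.parent.name + suffix))
        outputs.setdefault(out, []).append(path.name)
    return outputs


images = [
    "foldertop/subfolder1/a.tif",
    "foldertop/subfolder1/b.tif",
    "foldertop/subfolder2/c.tif",
    "foldertop/subfolder2/d.tif",
]
print(expected_outputs("foldertop", images, "IML"))
```

Forward slashes are used here only to keep the sketch independent of the platform; on Windows the same rule applies to ordinary folder paths.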

* License

InftyReader Ver.2.7 series is usable free of charge for 15 days in total after installation.
If you use it three days per week, for example, then you can use the software for 5 weeks on trial.

For further use, please get a license key from sAccessNet.

InftyReader is usable under the following license agreement.

(1) You may not modify the software in any manner. You may not reverse engineer, decompile or disassemble the software.
(2) You may not sell the software without making a formal agreement with Science Accessibility Net.
You may distribute the software only free of charge, without modifying the zip-package of the software.
(3) The author shall have no obligation to correct errors and inconveniences of the software.
(4) The author shall not be responsible for any loss or damage caused by the use of the software.

* Report

Any report about the software will be welcome.

--------------------------------------
Non Profit Organization
Science Accessibility Net (sAccessNet)
e-mail: support"at"mail.sciaccess.net (Please replace "at" by @.)
URL: http://www.sciaccess.net/
--------------------------------------

 

