Nonprofit Organization  Science Accessibility Net

OCR software for mathematical documents: InftyReader


InftyReader Ver.3 series

Main Features

OCR software to recognize scientific documents including mathematical formulas.

Various output formats: XML, LaTeX, MathML, HTML, MS Word, etc.
The ABBYY FineReader OCR engine can be used in InftyReader as an option (see "FineReader Plug-in for InftyReader" below).

InftyReader is OCR software that recognizes scientific documents containing mathematical formulae and outputs the recognition results in various file formats: LaTeX, MathML, XHTML, HR-TeX, IML and Microsoft Word documents. It is developed in the laboratory of M. Suzuki, Faculty of Mathematics, Kyushu University, in collaboration with several partners.

*InftyReader Ver.3.0.5 (Sep. 1, 2014)

Personal Use License package:

InftyReaderE305.zip (English Edition, about 48MB) -------- Sep. 1, 2014 new
What's new in Version 3?
For general information about InftyReader, please read "AboutInftyReaderE.txt".

Below is the one-year personal use license package:

InftyReaderE305_Year.zip (English Edition, about 48MB) -------- Sep. 1, 2014 new
(The one-year license edition has no functional limitations compared with the normal version; only the license period differs.)

Enterprise License package:

InftyReaderE305_Enterprise.zip (English Edition, about 48MB) -------- Sep. 1, 2014 new.
What's the difference from the personal use edition?

License Update. If you purchased InftyReader in 2013 or 2014, you can use InftyReader Ver.3 with the serial number (and license key) you already have. If you purchased InftyReader in 2012 or earlier and wish to use the new InftyReader Ver.3 series, you need to get a new serial number for the Ver.3 series.
Users holding a normal license for an old version purchased between 2010 and 2012 can get a new serial number for the InftyReader Ver.3 series at a discount. The price depends on the year in which the license was purchased. For more details, please see here.

Trial Use. To use InftyReader in the Trial Mode, please see: AboutTrialUse.txt

InftyReader Ver.2.9.7.2 (Nov. 22, 2013)

Below is the final version of the Ver.2.9 series. In case you need to (re-)activate InftyReader Ver.2 series, please use this version.

InftyReaderE2972.zip (English Edition, about 48MB) -------- Nov. 22, 2013

Document for blind users.

Below is the Introduction to InftyReader for blind users given by Prof. John Gardner (Oregon State University & ViewPlus Technology) at the ICCHP Summer University 2011.

Introduction to InftyReader by Prof. John Gardner.

FineReader Plug-in for InftyReader (Option)

The FineReader OCR engine is now available in the InftyReader Ver.2.9 series for recognizing the ordinary text in documents. To use FineReader in InftyReader, users need to purchase a special license of the FineReader engine for InftyReader and install FineReaderDic as described below:

1. Please download FineReaderDic-for-Infty.zip (about 94MB) and follow the instructions in "HowToInatall.txt" included in the package.

2. Please fill in and send the order form, OrderForm-FRDic.txt, and purchase license(s) of the FineReader engine for InftyReader following the instructions written in the order form. The price of the FineReader plug-in is 300 USD per license.

The languages supported by the FineReader plug-in are: Czech, Dutch, English, French, German, Italian, Hungarian, Polish, Romanian, Russian, Slovak, Spanish and Swedish. The FineReader plug-in is not used for recognizing Japanese.

[IMPORTANT REMARK]
- Please note that the serial key for the FineReader plug-in is different from the serial number of InftyReader, and is valid on ONLY ONE COMPUTER, while the serial number of InftyReader is valid on two PCs.
- There is NO TRIAL VERSION of the FineReader plug-in. We are very sorry, but you can use it only after purchase.

* Comments about output formats

  1. IML is the default XML file format of the editor "InftyEditor", an authoring tool for math documents developed by InftyProject. InftyEditor provides a very easy user interface for entering and editing math expressions together with ordinary text.
    The English edition of InftyEditor is free software. Please see the InftyEditor site.
  2. In XHTML format, mathematical expressions are output using MathML notation.
  3. HR-TeX is a simplified, LaTeX-like notation that is easier to read, designed specifically for blind users.

Using InftyEditor, users can correct and edit the recognition results of InftyReader while comparing them with the original images, and convert the results into various formats: LaTeX, PDF, XHTML with MathML, etc.

Please note that InftyReader recognizes only <<Black and White>>, <<Binary>> images carefully scanned at either 600 DPI or 400 DPI. Please be aware that the program fails to run if the input image contains grayscale or color areas, even in part.
Image files have to be prepared in TIFF, GIF, PNG, or BMP format.
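
The input requirements above can be checked before running a batch. The following is only a minimal sketch in Python and is not part of InftyReader: the function names are hypothetical, the extension list is an assumed spelling of the four formats named above, and the bilevel check handles PNG headers only. It treats a 1-bit grayscale or 1-bit palette PNG as a binary image, which assumes that a two-color palette means black and white.

```python
import struct
from pathlib import Path

# Assumed file extensions for the formats named on this page (TIFF, GIF, PNG, BMP).
SUPPORTED_EXTENSIONS = {".tif", ".tiff", ".gif", ".png", ".bmp"}

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def has_supported_extension(path):
    """Return True if the file name ends in one of the listed image formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def png_is_bilevel(data):
    """Return True if the leading bytes of a PNG describe a 1-bit image.

    Only the signature and the IHDR chunk (the first 26 bytes) are inspected.
    """
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR layout: width (4 bytes), height (4 bytes), bit depth (1), color type (1), ...
    _width, _height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    # Color type 0 is grayscale, 3 is palette; a bit depth of 1 allows two values only.
    return bit_depth == 1 and color_type in (0, 3)


def check_png_file(path):
    """Hypothetical helper: read the header of a PNG file on disk and test it."""
    with open(path, "rb") as handle:
        return png_is_bilevel(handle.read(26))
```

A file that fails either check would need to be converted and binarized in a scanning or image tool before being passed to InftyReader.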

* Features

Here are some features of InftyReader since Ver.2.8:

  1. It uses the OCR engines of Toshiba Corporation, "ExpressReaderPro", and of MediaDrive Corporation, "WinReader", simultaneously to improve the recognition results for characters in ordinary text areas. (For the characters and math symbols in formulae, it uses Infty's own OCR.)
  2. It can recognize tables that include math expressions in their cells (provided the ruled lines are not broken).
  3. It can convert PDF files, including mathematical expressions, into LaTeX or XHTML (MathML), except for PDF files that include color or gray images. (Note that InftyReader can process only black and white binary images.)
    It recognizes the page images of PDF files while referring to the text information embedded in the PDF.

    Attention: The original PDF should be of high resolution, equivalent to images scanned at 600dpi. PDF files found on the Web are sometimes of low resolution, around the level of 200dpi images, in order to reduce file size. In such cases, the recognition results will be of very low quality, to the point of being almost useless!
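
To judge whether a PDF page image meets the 600dpi guideline above, its effective resolution can be estimated from the pixel width of the page image and the page width. The following Python sketch shows the arithmetic only; it assumes the page width is given in PDF points (72 points per inch), and the function names and sample numbers are illustrative, not taken from InftyReader.

```python
POINTS_PER_INCH = 72  # default PDF unit: 1 point = 1/72 inch


def effective_dpi(pixel_width, page_width_pt):
    """Pixels per inch of a page image that spans the full page width."""
    if page_width_pt <= 0:
        raise ValueError("page width must be positive")
    return pixel_width / (page_width_pt / POINTS_PER_INCH)


def resolution_ok(pixel_width, page_width_pt, minimum_dpi=600):
    """Return True if the page image reaches the recommended resolution."""
    return effective_dpi(pixel_width, page_width_pt) >= minimum_dpi


# Illustrative values: a US Letter page is 612 points (8.5 inches) wide.
print(effective_dpi(5100, 612))   # page image 5100 pixels wide
print(resolution_ok(1700, 612))   # page image 1700 pixels wide
```

With these sample values, a 5100-pixel-wide image on a Letter page works out to 600dpi, while a 1700-pixel-wide image is only about 200dpi, the low-resolution case described in the attention note.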

* Caution ---- Important!

  1. Source documents have to be clearly printed.
  2. They should be scanned at 600dpi (or 400dpi). Usually, binary images are better for recognition than color images.
  3. InftyReader erases small noise, automatically segments page images into picture areas, table areas and text areas, and then recognizes the text/table areas including mathematical expressions.
    However, to get better recognition results, users are <<recommended>> to erase noise and pictures before recognition.
  4. When scanning, it is important to adjust the binarization threshold of the scanner so that the number of touching or broken characters is less than 1% of the total number of characters in each scanned page image.
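
The 1% guideline in item 4 is simple arithmetic on counts taken from a sample page. A minimal Python sketch follows, assuming the touching or broken characters have been counted by hand; the function names and the sample counts are hypothetical.

```python
def defect_rate(touching_or_broken, total_characters):
    """Fraction of characters on a page that are touching or broken."""
    if total_characters <= 0:
        raise ValueError("total_characters must be positive")
    return touching_or_broken / total_characters


def threshold_acceptable(touching_or_broken, total_characters, limit=0.01):
    """Return True if the defect rate is below the 1% guideline."""
    return defect_rate(touching_or_broken, total_characters) < limit


# Hypothetical counts for a page of 2000 characters.
print(threshold_acceptable(12, 2000))  # 12 defects: 0.6% of the page
print(threshold_acceptable(30, 2000))  # 30 defects: 1.5% of the page
```

If a sample page fails the check, the scanner's binarization threshold would be adjusted and the page rescanned.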

* Operating Environment

InftyReader runs on Windows 8, Windows 7, Vista, XP, on a PC equipped with 1GB free memory or more.

Note that it does not run on Windows 98, Me, or 2000.

* How to use InftyReader?

  1. Select file(s) or folder.
  2. Input/select the output document name.
  3. Press the "Start" button.

Then the recognition results of the selected image files are saved into the file you specified as the "output document name". When you select a folder instead of files, all the image files of the specified file type (TIF/GIF/PNG/BMP/PDF) in the folder are recognized, and the results are output into files named after the folders.

If you check the "Search Sub Folders" item under the "Option" menu, InftyReader recognizes all the image files in the subfolders of the selected folder. For example, if you select the folder "foldertop" having the subfolder structure below,

  foldertop
    |-- subfolder1
    |        |-- a.tif
    |        |-- b.tif
    |
    |-- subfolder2
             |-- c.tif
             |-- d.tif

and select "IML" as the output file type, then you will get the files "subfolder1.iml" and "subfolder2.iml" in the folder "foldertop". The recognition results of a.tif and b.tif (resp. c.tif and d.tif) are saved in the file subfolder1.iml (resp. subfolder2.iml).

If you select LaTeX as the output file type, you will get "subfolder1.tex" and "subfolder2.tex"; the other file types, HR-TeX and XHTML, work similarly.
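
The naming rule described above can be sketched in Python to predict which output files a batch run should produce. This only illustrates the rule and is not InftyReader's own code: the function name is hypothetical, only the two extensions stated on this page (.iml and .tex) are included, and one level of subfolders is assumed, as in the example.

```python
from pathlib import PurePath

# Output extensions stated on this page; other output types are left out on purpose.
OUTPUT_SUFFIX = {"IML": ".iml", "LaTeX": ".tex"}


def expected_outputs(subfolders, output_type, image_suffix=".tif"):
    """Map each expected output file name to the image files it collects.

    subfolders: dict of subfolder name -> list of file names in that subfolder.
    Only files of the selected image type are counted, and a subfolder with
    no matching files produces no entry (an assumption of this sketch).
    """
    suffix = OUTPUT_SUFFIX[output_type]
    outputs = {}
    for folder_name, files in subfolders.items():
        images = [
            name for name in files
            if PurePath(name).suffix.lower() == image_suffix
        ]
        if images:
            outputs[folder_name + suffix] = images
    return outputs


# The "foldertop" example from the text above.
tree = {
    "subfolder1": ["a.tif", "b.tif"],
    "subfolder2": ["c.tif", "d.tif"],
}
print(expected_outputs(tree, "IML"))
print(expected_outputs(tree, "LaTeX"))
```

Comparing this prediction with the files that actually appear in "foldertop" is one way to confirm that every subfolder was processed.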

* License

To use InftyReader, please obtain a license key from sAccessNet.

As for the trial use, please see: AboutTrialUse.txt

InftyReader is usable under the following license agreement.

(1) You may not modify the software in any manner. You may not reverse engineer, decompile or disassemble the software.
(2) You may not sell the software without making a formal agreement with Science Accessibility Net.
You may distribute the software only free of charge, without modifying the zip-package of the software.
(3) The author shall have no obligation to correct errors and inconveniences of the software.
(4) The author shall not be responsible for any loss or damage caused by the use of the software.
(5) The license is basically limited to personal use, including the case where it is purchased by an institution for a specified user. Shared use by the members of a small group is also allowed. If an institution uses the software to serve a number of clients or to digitize a huge number of volumes, please use the enterprise version, after reading the page here. For more details, please contact us.

* Report

Any report about the software will be welcome.

--------------------------------------
Non Profit Organization
Science Accessibility Net (sAccessNet)
e-mail: support"at"mail.sciaccess.net (Please replace "at" by @.)
URL: http://www.sciaccess.net/
--------------------------------------

 

