ocr.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Mon, 22 Feb 2016 12:41:52 +0200

.. -*- coding: utf-8; -*-
.. include:: HEADER.rst

======
 OCR.
======
.. contents::

gocr.
=====

Using::

  $ gocr $IN.pnm >$OUT.txt

ocrfeeder.
==========

Document layout analysis and optical character recognition system::

  $ sudo apt-get install ocrfeeder

Using::

  $ ocrfeeder-cli --o $OUTDIR --format HTML --images $IN.pnm

tesseract.
==========

Installing::

  $ sudo apt-get install tesseract-ocr

Using::

  $ tesseract $IN.tif $OUT
  $ cat $OUT.txt
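
A small wrapper can run the same command over several scans. This is only a sketch, not part of the original notes: the function name ``ocr_all`` is made up for illustration, and it relies on tesseract appending ``.txt`` to the output base name, as in the command above::

  # Hypothetical helper: OCR every image passed as an argument.
  # page1.tif becomes page1.txt next to the input file.
  ocr_all() {
      for img in "$@"; do
          # "${img%.*}" drops the extension; tesseract adds ".txt" itself.
          tesseract "$img" "${img%.*}" || return 1
      done
  }

Possible usage, assuming the scans live in the current directory::

  $ ocr_all *.tif
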

ocropus.
========

Using::

  $ ocropus hocr-to-text screen.ppm

ocrad
=====

Optical Character Recognition program::

  $ sudo apt-get install ocrad
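
Using, as a minimal sketch that is not part of the original notes. It assumes ocrad's default behaviour of reading a PNM image and printing the recognized text to standard output; the function name ``ocrad_to_txt`` is made up for illustration::

  # Hypothetical helper: OCR one PNM image and save the text beside it.
  ocrad_to_txt() {
      # ocrad writes recognized text to stdout, so redirect it to a file.
      ocrad "$1" >"${1%.*}.txt"
  }

Possible usage::

  $ ocrad_to_txt $IN.pnm
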

Misc.
=====

unpapper