Python OCR Image Recognition
PyTesser
Official site: http://code.google.com/p/pytesser/
PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string.
PyTesser uses the Tesseract OCR engine, converting images to an accepted format and calling the Tesseract executable as an external script. A Windows executable is provided along with the Python scripts. The scripts should work in other operating systems as well.
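The external-executable approach described above can be sketched roughly as follows. This is not PyTesser's actual source: the helper names `build_tesseract_command` and `image_file_to_text` are hypothetical, and the sketch assumes a `tesseract` binary on the PATH that accepts the usual `tesseract <image> <output base> [-l <lang>]` invocation and writes its result to `<output base>.txt`.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang=None, exe="tesseract"):
    """Return the argument list for one Tesseract command-line run."""
    command = [exe, image_path, output_base]
    if lang:
        command += ["-l", lang]
    return command


def image_file_to_text(image_path, lang=None):
    """Run Tesseract on an image file and return the recognized text."""
    with tempfile.TemporaryDirectory() as workdir:
        output_base = os.path.join(workdir, "ocr_output")
        subprocess.run(
            build_tesseract_command(image_path, output_base, lang),
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Tesseract appends ".txt" to the output base name.
        with open(output_base + ".txt", encoding="utf-8") as handle:
            return handle.read()
```

Calling `image_file_to_text('fnord.tif')` would only work on a machine where Tesseract is installed; the command builder can be inspected on its own without it.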
Dependencies
PIL is required to work with images in memory. PyTesser has been tested with Python 2.4 in Windows XP.
Usage Example
    >>> from pytesser import *
    >>> image = Image.open('fnord.tif')  # Open image object using PIL
    >>> print image_to_string(image)     # Run tesseract.exe on image
    fnord
    >>> print image_file_to_string('fnord.tif')
    fnord
(more examples in README)
Python-tesseract
Official site: http://code.google.com/p/python-tesseract/
(Linux, Mac OS X, and Windows)
Python-tesseract is a wrapper class for Tesseract OCR that allows any conventional image file (JPG, GIF, PNG, TIFF, etc.) to be read and decoded into readable text. No temporary file is created during OCR processing.
Windows versions are now available. Remember to:
1. Set PATH, e.g. PATH=%PATH%;C:\PYTHON27
2. Set c:\python27\python.exe to run in Windows 7 compatibility mode, even if you are already using Windows 7. Otherwise the program might crash at runtime.
3. Download and install python-opencv and numpy.
4. Unzip the sample code and keep your fingers crossed.
5. Run python -u test.py
It is always safer to run Python in unbuffered mode, especially on Windows XP.
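Before running the sample code, it can help to confirm that the required modules are importable. The following is a minimal sketch; the helper name `missing_modules` is hypothetical, and the top-level module names (`numpy`, `cv2`, `tesseract`) are assumed from the examples in this post rather than taken from any installer documentation.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level module names assumed from the examples in this post.
    required = ["numpy", "cv2", "tesseract"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: %s" % ", ".join(missing))
    else:
        print("All required modules were found.")
```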
Example 1:
    import tesseract

    api = tesseract.TessBaseAPI()
    api.SetOutputName("outputName")
    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    api.SetPageSegMode(tesseract.PSM_AUTO)
    mImgFile = "eurotext.jpg"
    # Load the image from disk with Leptonica.
    pixImage = tesseract.pixRead(mImgFile)
    api.SetImage(pixImage)
    outText = api.GetUTF8Text()
    print("OCR output:\n%s" % outText)
    api.End()
Example 2:
    import tesseract

    api = tesseract.TessBaseAPI()
    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    # Restrict recognition to digits and lowercase letters.
    api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
    api.SetPageSegMode(tesseract.PSM_AUTO)
    mImgFile = "eurotext.jpg"
    # Read the whole image file into memory; no temporary file is needed.
    with open(mImgFile, "rb") as f:
        mBuffer = f.read()
    result = tesseract.ProcessPagesBuffer(mBuffer, len(mBuffer), api)
    print("result(ProcessPagesBuffer)=%s" % result)
    api.End()
Example 3:
    import cv2.cv as cv
    import tesseract

    api = tesseract.TessBaseAPI()
    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    api.SetPageSegMode(tesseract.PSM_AUTO)
    # Load the image as grayscale with the legacy OpenCV API.
    image = cv.LoadImage("eurotext.jpg", cv.CV_LOAD_IMAGE_GRAYSCALE)
    tesseract.SetCvImage(image, api)
    text = api.GetUTF8Text()
    conf = api.MeanTextConf()
    print(text)
    api.End()
Example 4:
    import tesseract
    import cv2
    import cv2.cv as cv

    image0 = cv2.imread("p.bmp")
    # You may need to thicken the border so that Tesseract can OCR the image.
    offset = 20
    height, width, channel = image0.shape
    image1 = cv2.copyMakeBorder(image0, offset, offset, offset, offset,
                                cv2.BORDER_CONSTANT, value=(255, 255, 255))
    # cv2.namedWindow("Test")
    # cv2.imshow("Test", image1)
    # cv2.waitKey(0)
    # cv2.destroyWindow("Test")

    api = tesseract.TessBaseAPI()
    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    api.SetPageSegMode(tesseract.PSM_AUTO)
    height1, width1, channel1 = image1.shape
    print(image1.shape)
    print(image1.dtype.itemsize)
    width_step = width * image1.dtype.itemsize
    print(width_step)

    # Method 1: wrap the NumPy data in an IplImage header.
    iplimage = cv.CreateImageHeader((width1, height1), cv.IPL_DEPTH_8U, channel1)
    cv.SetData(iplimage, image1.tostring(), image1.dtype.itemsize * channel1 * width1)
    tesseract.SetCvImage(iplimage, api)
    text = api.GetUTF8Text()
    conf = api.MeanTextConf()
    image = None
    print("...............")
    print("Ocred Text: %s" % text)
    print("Confidence Level: %d %%" % conf)

    # Method 2: convert the NumPy array through a CvMat.
    cvmat_image = cv.fromarray(image1)
    iplimage = cv.GetImage(cvmat_image)
    print(iplimage)
    tesseract.SetCvImage(iplimage, api)
    # api.SetImage(m_any, width, height, channel1)
    text = api.GetUTF8Text()
    conf = api.MeanTextConf()
    image = None
    print("...............")
    print("Ocred Text: %s" % text)
    print("Confidence Level: %d %%" % conf)
    api.End()
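Example 4 prints a `width_step` computed from the pre-border width without the channel count, while the step it actually passes to `cv.SetData` is the item size times the channel count times the bordered width. The arithmetic behind that second value can be sketched as below; the helper names `padded_shape` and `row_stride` are hypothetical, the sample dimensions are made up, and the sketch assumes tightly packed rows with no alignment padding.

```python
def padded_shape(height, width, offset):
    """Shape after adding `offset` pixels of border on every side."""
    return height + 2 * offset, width + 2 * offset


def row_stride(width, channels, itemsize=1):
    """Bytes per image row for tightly packed pixel data."""
    return width * channels * itemsize


# A 100 x 200 image with a 20-pixel border on each side.
height1, width1 = padded_shape(100, 200, offset=20)
print(height1, width1)        # 140 240
# Three 8-bit channels per pixel.
print(row_stride(width1, 3))  # 720
```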
Example 6:
    import tesseract
    import cv2
    import cv2.cv as cv

    image0 = cv2.imread("eurotext.jpg")
    offset = 20
    height, width, channel = image0.shape
    image1 = cv2.copyMakeBorder(image0, offset, offset, offset, offset,
                                cv2.BORDER_CONSTANT, value=(255, 255, 255))

    api = tesseract.TessBaseAPI()
    api.Init(".", "eng", tesseract.OEM_DEFAULT)
    api.SetPageSegMode(tesseract.PSM_AUTO)
    height1, width1, channel1 = image1.shape
    print(image1.shape)
    print(image1.dtype.itemsize)
    width_step = width * image1.dtype.itemsize
    print(width_step)

    iplimage = cv.CreateImageHeader((width1, height1), cv.IPL_DEPTH_8U, channel1)
    cv.SetData(iplimage, image1.tostring(), image1.dtype.itemsize * channel1 * width1)
    tesseract.SetCvImage(iplimage, api)
    api.Recognize(None)

    # Walk the results word by word and print each word with its confidence.
    ri = api.GetIterator()
    level = tesseract.RIL_WORD
    count = 0
    while ri:
        word = ri.GetUTF8Text(level)
        conf = ri.Confidence(level)
        print("[%03d]:\tword(confidence)=%s(%.2f%%)" % (count, word, conf))
        # ri.BoundingBox(level, x1, y1, x2, y2)
        count += 1
        if not ri.Next(level):
            break
    iplimage = None
    api.End()
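Once each word has a confidence value, as in the loop above, a common next step is to discard low-confidence words. The snippet below is only a sketch of that idea: `keep_confident_words` is a hypothetical helper, the threshold of 60 is an arbitrary assumption, and the sample data is made up rather than produced by Tesseract.

```python
def keep_confident_words(words, min_confidence=60.0):
    """Keep (word, confidence) pairs whose confidence meets the threshold."""
    return [(word, conf) for word, conf in words if conf >= min_confidence]


# Made-up sample data in the shape the loop above prints: (word, confidence %).
sample = [("The", 91.5), ("qu1ck", 42.0), ("brown", 88.2)]
for word, conf in keep_confident_words(sample):
    print("%s (%.2f%%)" % (word, conf))
```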
pyocr
Official site: https://github.com/jflesch/pyocr
PyOCR is an optical character recognition (OCR) tool wrapper for Python. That is, it helps you use OCR tools from a Python program.
It has been tested only on GNU/Linux systems. It should also work on similar systems (*BSD, etc.). It doesn't work on Windows, Mac OS X, etc.
PyOCR can be used as a wrapper for Google's Tesseract-OCR or Cuneiform. It can read all image types supported by Pillow, including JPEG, PNG, GIF, BMP, TIFF, and others. It also supports bounding box data.
Usage
    from PIL import Image
    import sys

    import pyocr
    import pyocr.builders

    tools = pyocr.get_available_tools()
    if len(tools) == 0:
        print("No OCR tool found")
        sys.exit(1)
    tool = tools[0]
    print("Will use tool '%s'" % (tool.get_name()))
    # Ex: Will use tool 'tesseract'

    langs = tool.get_available_languages()
    print("Available languages: %s" % ", ".join(langs))
    lang = langs[0]
    print("Will use lang '%s'" % (lang))
    # Ex: Will use lang 'fra'

    txt = tool.image_to_string(
        Image.open('test.png'),
        lang=lang,
        builder=pyocr.builders.TextBuilder()
    )
    word_boxes = tool.image_to_string(
        Image.open('test.png'),
        lang=lang,
        builder=pyocr.builders.WordBoxBuilder()
    )
    line_and_word_boxes = tool.image_to_string(
        Image.open('test.png'),
        lang=lang,
        builder=pyocr.builders.LineBoxBuilder()
    )

    # Digits - Only Tesseract
    digits = tool.image_to_string(
        Image.open('test-digits.png'),
        lang=lang,
        builder=pyocr.tesseract.DigitBuilder()
    )
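The word boxes returned by a box builder carry both text and coordinates. As a rough illustration of working with that kind of data, the sketch below computes the rectangle that encloses a set of word boxes. It does not import PyOCR: `WordBox` is a stand-in type, and the attribute names `content` and `position` (with `position` as `((left, top), (right, bottom))`) are assumptions about the box objects, so check them against the PyOCR version you use.

```python
from collections import namedtuple

# Stand-in for the box objects returned by PyOCR; the attribute names
# `content` and `position` are assumed here, with position given as
# ((left, top), (right, bottom)).
WordBox = namedtuple("WordBox", ["content", "position"])


def enclosing_box(boxes):
    """Return the smallest ((left, top), (right, bottom)) covering all boxes."""
    lefts = [box.position[0][0] for box in boxes]
    tops = [box.position[0][1] for box in boxes]
    rights = [box.position[1][0] for box in boxes]
    bottoms = [box.position[1][1] for box in boxes]
    return (min(lefts), min(tops)), (max(rights), max(bottoms))


# Made-up coordinates for two words on one line.
sample_boxes = [
    WordBox("Hello", ((10, 12), (60, 30))),
    WordBox("world", ((70, 10), (130, 31))),
]
print(" ".join(box.content for box in sample_boxes))  # Hello world
print(enclosing_box(sample_boxes))  # ((10, 10), (130, 31))
```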
Dependencies
- PyOCR requires Python 2.7 or later.
- You will need Pillow or the Python Imaging Library (PIL). Under Debian/Ubuntu, PIL is in the package "python-imaging".
- Install an OCR tool:
  - tesseract-ocr from http://code.google.com/p/tesseract-ocr/ ('tesseract-ocr' + 'tesseract-ocr-<lang>' in Debian). You must be able to invoke the tesseract command as "tesseract". Python-tesseract is tested with Tesseract >= 3.01 only.
  - or cuneiform
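Since the OCR tool must be invokable by name, a quick check of the PATH can save some debugging. The following is a minimal sketch using the standard library; the helper name `find_ocr_tools` is hypothetical, and the command names are taken from the dependency list above.

```python
import shutil


def find_ocr_tools(commands=("tesseract", "cuneiform")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


for name, path in find_ocr_tools().items():
    print("%s: %s" % (name, path or "not found"))
```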
Installation
$ sudo python ./setup.py install
Tests
$ python ./run_tests.py
The tests are designed to run with the latest versions of Tesseract and Cuneiform. The first tests verify that you're using the expected version.
To run the tests, you will need the following lang support:
- English (tesseract-ocr-eng)
- French (tesseract-ocr-fra)
- Japanese (tesseract-ocr-jpn)
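To see whether the three language packs listed above are installed, one option is to compare them against the output of `tesseract --list-langs`. The sketch below only parses text in that style: the helper names are hypothetical, the header-line format is an assumption that may differ between Tesseract versions, and the sample output is illustrative rather than captured from a real run.

```python
REQUIRED_LANGS = ("eng", "fra", "jpn")


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` style output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Header lines such as "List of available languages (2):" are skipped.
    return [line for line in lines if not line.lower().startswith("list of")]


def missing_langs(available, required=REQUIRED_LANGS):
    """Return the required language codes that are not installed."""
    return [lang for lang in required if lang not in available]


# Illustrative output, not captured from a real run.
sample_output = "List of available languages (2):\neng\nfra\n"
print(missing_langs(parse_list_langs(sample_output)))  # ['jpn']
```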
Tesseract
Official site: http://code.google.com/p/tesseract-ocr/
Tesseract is probably the most accurate open source OCR engine available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 60 languages. It was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google. It is released under the Apache License 2.0.
- ReadMe - Installation and usage information.
- Compiling - How to build Tesseract on a variety of platforms.
- FAQ - Common questions and problems. Please check before filing a bug or consulting the forum.
- Too many errors? - See the guidance on getting the best out of Tesseract.
Supported Platforms
Tesseract works on Linux, Windows (with VC++ Express or Cygwin), and Mac OS X. See the ReadMe for more details and install instructions. It can also be compiled for other platforms, including Android and the iPhone, though these platforms are not as well tested. See also the AddOns page for other projects using Tesseract on various platforms.
If you're interested in supporting other platforms or languages, please get in touch with Ray Smith or the Developers.
Cuneiform
Official site: https://launchpad.net/cuneiform-linux/
Cuneiform is an OCR system originally developed and open sourced by Cognitive Technologies. This project aims to create a fully portable version of Cuneiform.
Copyright: This article was written, reposted, excerpted, or revised and published by 米扑博客; last updated 2015-11-25 18:51:37
Repost attribution: Python OCR Image Recognition (米扑博客)