Problem description
The image is above. I have tried everything I could find on SO or Google, and nothing seems to work. I cannot get the exact value in the image: I should get 2.10, but instead I always get 210.
This is not limited to this image. In any image that has a decimal point before the digit 1, Tesseract ignores the decimal point.
    def returnAllowedAmount(self, imgpath):
        th = 127
        max_val = 255
        img = cv2.imread(imgpath, 0)  # Load image in memory
        img = cv2.resize(img, None, fx=2.5, fy=2.5, interpolation=cv2.INTER_CUBIC)  # Rescale image
        img = cv2.medianBlur(img, 1)
        ret, img = cv2.threshold(img, th, max_val, cv2.THRESH_TOZERO)
        self.showImage(img)
        returnData = pytesseract.image_to_string(img, lang='eng', config='-psm 13 ')
        returnData = ''.join(p for p in returnData if p.isnumeric() or p == ".")  # Remove $ sign
Before passing the image to Pytesseract, it helps to preprocess it so that it is cleaner and smoother. Here's a simple approach:
- Convert image to grayscale and enlarge image
- Threshold
- Perform morphological operations to clean image
- Invert image
First we convert the image to grayscale, resize it using the imutils library, then threshold it to obtain a binary image.
Now we perform morphological transformations to smooth the image.
Now we invert the image for Pytesseract and add a Gaussian blur.
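To see why the threshold is followed by an inversion, here is a small sketch that mimics the per-pixel rule of cv2.THRESH_BINARY_INV and the `255 - x` step on a toy patch. Plain Python lists stand in for the image array, and the helper names and pixel values are made up for illustration; this is not OpenCV code.

```python
def threshold_binary_inv(pixels, thresh=150, max_val=255):
    """Per-pixel rule of cv2.THRESH_BINARY_INV:
    values above `thresh` become 0, all others become `max_val`."""
    return [[0 if p > thresh else max_val for p in row] for row in pixels]


def invert(pixels):
    """Equivalent of `255 - image` for an 8-bit grayscale image."""
    return [[255 - p for p in row] for row in pixels]


# Toy grayscale patch (assumed values): dark strokes (30) on a light background (220).
patch = [
    [220, 30, 220, 30, 220],
    [220, 30, 220, 220, 220],
    [220, 220, 220, 30, 220],
]

mask = threshold_binary_inv(patch)  # strokes become white (255) on black (0)
result = invert(mask)               # strokes become black (0) on white (255)

for row in mask:
    print(row)
for row in result:
    print(row)
```

After the inverse threshold the strokes are white, which is the foreground that morphological operations act on. The final inversion turns them back into dark text on a light background before OCR.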
We use the --psm 10 config flag since we want to treat the image as a single character. Other page segmentation modes and configuration flags may also be useful, depending on the image.
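For reference, the page segmentation modes that Tesseract lists under `tesseract --help-psm` can be kept in a small lookup table. The sketch below builds a config string of the kind passed to `pytesseract.image_to_string`. `build_config` is a hypothetical helper, not part of pytesseract, and the optional `tessedit_char_whitelist` setting is an assumption about what might help here; whether it is honored depends on the Tesseract version and engine.

```python
# Page segmentation modes as listed by `tesseract --help-psm`.
PSM_MODES = {
    0: "Orientation and script detection (OSD) only",
    1: "Automatic page segmentation with OSD",
    2: "Automatic page segmentation, but no OSD, or OCR",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Assume a single column of text of variable sizes",
    5: "Assume a single uniform block of vertically aligned text",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    9: "Treat the image as a single word in a circle",
    10: "Treat the image as a single character",
    11: "Sparse text. Find as much text as possible in no particular order",
    12: "Sparse text with OSD",
    13: "Raw line. Treat the image as a single text line, "
        "bypassing hacks that are Tesseract-specific",
}


def build_config(psm, whitelist=None):
    """Hypothetical helper: build a config string for pytesseract."""
    if psm not in PSM_MODES:
        raise ValueError(f"unknown page segmentation mode: {psm}")
    parts = [f"--psm {psm}"]
    if whitelist:
        # Restrict the characters Tesseract may output (assumed to be supported
        # by the installed Tesseract version).
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


print(build_config(10))
print(build_config(7, whitelist="0123456789."))
```

The resulting string would be passed as `config=build_config(10)`. Trying modes 7, 8, and 13 on the same preprocessed image is a cheap way to compare results.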
Results
$2.10
After filtering
2.10
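The filtering step can be sketched on its own, without running OCR. `keep_digits_and_dots` applies the same character filter as the answer. `extract_amount` is a hypothetical stricter alternative that assumes the amount always has exactly two decimal places, which may not hold for every image.

```python
import re


def keep_digits_and_dots(text):
    """Same filter as in the answer: keep numeric characters and '.'."""
    return "".join(ch for ch in text if ch.isnumeric() or ch == ".")


# Assumed format: one or more digits, a dot, then exactly two digits.
AMOUNT_RE = re.compile(r"\d+\.\d{2}")


def extract_amount(text):
    """Hypothetical alternative: return the first 'digits.dd' match, or None."""
    match = AMOUNT_RE.search(text)
    return match.group(0) if match else None


raw = "$2.10\n"  # example OCR output, taken from the result shown above
print(keep_digits_and_dots(raw))  # 2.10
print(extract_amount(raw))        # 2.10
print(extract_amount("210"))      # None
```

Returning None when no decimal point is found makes a misread such as 210 easy to detect, instead of silently treating it as a valid amount.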
    import cv2
    import pytesseract
    import imutils

    # Path to the Tesseract executable (Windows install)
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

    # Load the image as grayscale and resize it
    image = cv2.imread('1.png', 0)
    image = imutils.resize(image, width=300)

    # Inverse binary threshold: pixels at or below 150 become white, the rest black
    thresh = cv2.threshold(image, 150, 255, cv2.THRESH_BINARY_INV)[1]

    # Morphological close to smooth the image
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
    close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

    # Invert the image for Pytesseract and add a Gaussian blur
    result = 255 - close
    result = cv2.GaussianBlur(result, (5,5), 0)

    # Run OCR, then keep only digits and the decimal point
    data = pytesseract.image_to_string(result, lang='eng', config='--psm 10 ')
    processed_data = ''.join(char for char in data if char.isnumeric() or char == '.')
    print(data)
    print(processed_data)

    # Show the intermediate images
    cv2.imshow('thresh', thresh)
    cv2.imshow('close', close)
    cv2.imshow('result', result)
    cv2.waitKey()