CHAPTER 1
OPTICAL CHARACTER RECOGNITION
1.1 INTRODUCTION
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic
conversion of scanned images of handwritten, typewritten or printed text into machine-encoded
text. It is widely used as a form of data entry from original paper sources such as
documents, sales receipts, mail, and other printed records. It is a common method of
digitizing printed texts so that they can be electronically searched, stored more compactly, displayed
online, and used in machine processes such as machine translation, text-to-speech and text mining.
All OCR systems include an optical scanner for reading text, and sophisticated software for
analyzing images. Most OCR systems use a combination of hardware (specialized circuit boards)
and software to recognize characters, although some inexpensive systems do it entirely through
software. Advanced OCR systems can read text in a large variety of fonts, but they still have
difficulty with handwritten text.
The potential of OCR systems is enormous because they enable users to harness the power of
computers to access printed documents. OCR is already being used widely in the legal profession,
where searches that once required hours or days can now be accomplished in a few seconds.
OCR is a field of research in pattern recognition, artificial intelligence and computer vision.
Early versions needed to be programmed with images of each character, and worked on one font at
a time. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now
common. Some systems are capable of reproducing formatted output that closely approximates the
original scanned page including images, columns and other non-textual components.
1.2 COMPONENTS OF AN OCR SYSTEM
Figure 1.1: Block Diagram of OCR
1. Optical Scanning:
In OCR, optical scanners are used, which generally consist of a transport mechanism plus a
sensing device that converts light intensity into gray-levels.
Printed documents usually consist of black print on a white background.
So, we convert the multilevel image into a bi-level image of black and white. This process is
known as thresholding (performed on the scanner to save memory space and computational
effort).
With a fixed threshold, gray-levels below the threshold are treated as black and levels above
it as white.
For a high-contrast document with a uniform background, a prechosen fixed threshold can be
sufficient.
However, many documents encountered in practice have a rather large range in contrast.
In these cases, better results come from methods that vary the threshold over the document,
adapting to local properties such as contrast and brightness.
However, such methods usually depend on multilevel scanning of the document, which
requires more memory and computational capacity.
Therefore such techniques are seldom used in connection with OCR systems, although they
result in better images.
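The difference between the two thresholding strategies can be sketched in a few lines. The example below is illustrative Python rather than the project's MATLAB code; the function names, the window radius and the offset are assumptions chosen for the demonstration, and the local-mean rule is only one simple way of adapting the threshold.

```python
# Illustrative sketch only: names, radius and offset are assumptions,
# not values taken from the project.

def fixed_threshold(gray, t=128):
    """Map gray-levels below t to black (1) and all others to white (0)."""
    return [[1 if px < t else 0 for px in row] for row in gray]


def local_mean_threshold(gray, radius=1, offset=10):
    """Mark a pixel black when it is darker than its neighborhood mean minus offset."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window around (y, x), clipped at the image border.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < mean - offset else 0
    return out


# One scan line whose right half has a darker background than the left half.
uneven = [[200, 120, 200, 100, 20, 100]]

print(fixed_threshold(uneven))       # [[0, 1, 0, 1, 1, 1]]  dark background turns black
print(local_mean_threshold(uneven))  # [[0, 1, 0, 0, 1, 0]]  only the two ink pixels remain
```

The local version needs the full gray-level image and visits a window per pixel, which is the extra memory and computation mentioned above.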
2. Location and Segmentation:
Segmentation is the process that determines the constituents of an image.
It is necessary to locate the regions of the document where data have been printed and
distinguish them from figures and graphics.
For instance, when performing automatic mail-sorting, the address must be located and
separated from other print on the envelope.
The majority of optical character recognition algorithms segment the words into isolated
characters, which are recognized individually.
Usually this segmentation is performed by isolating each connected component, that is, each
connected black area.
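Isolating connected components can be sketched as a flood fill over the bi-level image. The Python below is an illustration, not the project's implementation; it assumes black pixels are stored as 1, uses 8-connectivity, and the helper names are hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Return one list of (row, col) black pixels per 8-connected component."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited black pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component."""
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    return min(rows), min(cols), max(rows), max(cols)


# Two tiny glyphs separated by white columns.
line = [
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
]
boxes = sorted((bounding_box(c) for c in connected_components(line)),
               key=lambda box: box[1])
print(boxes)  # [(0, 0, 2, 0), (0, 3, 2, 4)]
```

A known limitation of this approach is that touching characters merge into one component, while characters made of separate parts, such as "i", split into several.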
3. Preprocessing:
The image resulting from the scanning process may contain a certain amount of noise.
Depending on the resolution of the scanner and the success of the applied technique for
thresholding, the characters may be smeared or broken.
Some of these defects can be eliminated by using a preprocessor to smooth the digitized
characters.
Smoothing implies both filling and thinning.
Filling eliminates small breaks, gaps and holes in the digitized characters, while thinning
reduces the width of the lines.
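A minimal smoothing pass might look like the Python sketch below. It is an illustration under stated assumptions, not the project's preprocessor: the two rules (fill a white pixel with at least three black 4-neighbors, drop a black pixel with no black 8-neighbor) and the function names are chosen for the example, and thinning is left out because it needs a dedicated skeletonization algorithm.

```python
def neighbors(img, y, x, offsets):
    """Return the values of the in-bounds neighbors of (y, x) at the given offsets."""
    h, w = len(img), len(img[0])
    return [img[y + dy][x + dx] for dy, dx in offsets
            if 0 <= y + dy < h and 0 <= x + dx < w]


FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def smooth(img):
    """Fill one-pixel gaps and remove isolated specks (1 = black, 0 = white)."""
    out = [row[:] for row in img]
    for y in range(len(img)):
        for x in range(len(img[0])):
            if img[y][x] == 0 and sum(neighbors(img, y, x, FOUR)) >= 3:
                out[y][x] = 1  # filling: white pixel almost enclosed by black
            elif img[y][x] == 1 and sum(neighbors(img, y, x, EIGHT)) == 0:
                out[y][x] = 0  # noise removal: black pixel with no black neighbor
    return out


noisy = [
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],  # one-pixel hole in the stroke
    [1, 1, 1, 0, 1],  # isolated speck in the last column
]
print(smooth(noisy))  # [[1, 1, 1, 0, 0], [1, 1, 1, 0, 0], [1, 1, 1, 0, 0]]
```

All decisions are read from the input image and written to a copy, so the result does not depend on the order in which pixels are visited.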
4. Feature Extraction:
The objective of feature extraction is to capture the essential characteristics of the symbols.
The most straightforward way of describing a character is by the actual raster image.
Another approach is to extract certain features that still characterize the symbols but leave
out the unimportant attributes.
Techniques: the distribution of points, transformations and series expansions, and structural
analysis.
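As one example of a feature based on the distribution of points, zoning divides the character image into a grid and records the density of black pixels in each cell. The Python sketch below is illustrative; the grid size and the function name are assumptions, and the source does not say which features the project used.

```python
def zoning_features(glyph, zones=2):
    """Return the fraction of black pixels in each cell of a zones x zones grid.

    Cells are listed row by row, from top-left to bottom-right.
    """
    h, w = len(glyph), len(glyph[0])
    features = []
    for zy in range(zones):
        for zx in range(zones):
            y0, y1 = zy * h // zones, (zy + 1) * h // zones
            x0, x1 = zx * w // zones, (zx + 1) * w // zones
            cells = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features


letter_l = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(zoning_features(letter_l))  # [0.5, 0.0, 0.75, 0.5]
```

Four numbers replace sixteen pixels here, and small shifts of the stroke inside a zone leave the feature vector unchanged, which is the sense in which features leave out unimportant attributes.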
5. Post-processing:
OCR accuracy can be increased if the output is constrained by a lexicon, that is, a list of words
that are allowed to occur in a document. This might be, for example, all the words in the
English language, or a more technical lexicon for a specific field. This technique can be
problematic if the document contains words not in the lexicon, like proper nouns. The Tesseract
OCR engine, for example, uses its dictionary to influence the character segmentation step, for
improved accuracy.
The output stream may be a plain text stream or file of characters, but more sophisticated
OCR systems can preserve the original layout of the page and produce, for example, an
annotated PDF that includes both the original image of the page and a searchable textual
representation.
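A simple form of lexicon constraint is to replace each recognized word with its closest lexicon entry when the two are similar enough. The Python sketch below uses the standard-library function difflib.get_close_matches; the function name correct_with_lexicon, the cutoff of 0.8 and the sample words are assumptions for illustration. It corrects words after recognition and does not reproduce how Tesseract uses its dictionary during segmentation.

```python
import difflib


def correct_with_lexicon(words, lexicon, cutoff=0.8):
    """Replace each word by its closest lexicon entry, if one is similar enough."""
    corrected = []
    for word in words:
        if word.lower() in lexicon:
            corrected.append(word)  # already an allowed word
            continue
        matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
        # Keep the original word when nothing in the lexicon is close enough.
        corrected.append(matches[0] if matches else word)
    return corrected


lexicon = ["optical", "character", "recognition"]
raw_output = ["0ptical", "charactcr", "recognition", "Tesseract"]
print(correct_with_lexicon(raw_output, lexicon))
# ['optical', 'character', 'recognition', 'Tesseract']
```

The proper noun "Tesseract" passes through unchanged because no lexicon entry is close to it. With a lower cutoff it could be replaced by an unrelated word, which is the problem with out-of-lexicon words described above.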
1.3 PLATFORM USED
1.3.1 MATLAB:
Listed below are some suggestions on how to use the MATLAB environment to effectively create
MATLAB programs.
Write scripts and functions in a text editor and save them as M-files. This will save time,
save the code, and greatly facilitate the debugging process, especially if the MATLAB
Editor is used.
Use the Help files extensively. This will minimize errors caused by incorrect syntax and by
incorrect or inappropriate application of a MATLAB function.
Attempt to minimize the number of expressions comprising a program. This leads to a tradeoff
between readability and compactness; it can encourage the search for MATLAB
functions and procedures that can perform some of the programming steps faster and more
directly.
When practical, use graphical output as a program is being developed. This usually shortens
the code development process by identifying potential coding errors and can facilitate the
understanding of the physical process being modeled or analyzed.
Most importantly, verify by independent means that the output from the program is correct.
1.3.2 Graphical User Interface (GUI):
A GUI is a type of interface that allows users to interact with electronic devices through
graphical icons and visual indicators such as secondary notation, as opposed to text-based
interfaces, typed command labels or text navigation. GUIs were introduced in reaction to the
perceived steep learning curve of command-line interfaces (CLIs), which require commands to be
typed on the keyboard. Typically, the user interacts with information by manipulating
visual widgets that allow for interactions appropriate to the kind of data they hold. The widgets of a
well-designed interface are selected to support the actions necessary to achieve the goals of the user.
Because a GUI can be easily customized, the user can select or design a different skin at will, and
the designer can more easily change the interface as the user's needs evolve.
2.5 ASPECTS OF COLORS
1. Hue: The hue of a color is the color or shade itself.
2. Brightness: The brightness of a color is the intensity of its pixel value. The higher the
intensity, the brighter the color.
3. Saturation: The saturation of a color measures how much white is mixed into the shade. The
more white a shade contains, the less saturated it is.
4. Contrast: Contrast is defined as the separation between the darkest and brightest areas of the
image.
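These four quantities can be computed directly from pixel values. The Python sketch below uses the standard-library colorsys module; treating brightness as the V channel of the HSV model and contrast as the difference between the brightest and darkest gray-levels are assumptions made for the example, and the function names are hypothetical.

```python
import colorsys


def describe_color(r, g, b):
    """Return hue (degrees), saturation and brightness for 8-bit RGB values."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return {"hue_degrees": h * 360, "saturation": s, "brightness": v}


def contrast(gray):
    """Return the spread between the brightest and darkest gray-levels."""
    levels = [px for row in gray for px in row]
    return max(levels) - min(levels)


pure_red = describe_color(255, 0, 0)
pink = describe_color(255, 128, 128)  # red with white mixed in

# Same hue, but the added white lowers the saturation of pink.
print(pure_red["saturation"], pink["saturation"])
print(contrast([[200, 40], [120, 90]]))  # 160
```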
CONCLUSION
The project has been implemented, and it shows that characters can be recognized when the
image is correctly aligned and properly processed. The recognized characters are displayed as
text in a separate window. The region of interest and the background of
the image must differ sufficiently in color. The system identifies each character
by feature matching, using the similarity between the extracted character and the templates of
all characters as the measure. This project has a number of applications, but the most useful
real-world applications are in banking.
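The template-matching step can be illustrated with a short Python sketch. The similarity measure used here, the fraction of pixels on which two equal-sized binary images agree, is an assumption standing in for whatever measure the project used, and the 3x3 templates and function names are hypothetical.

```python
def similarity(glyph, template):
    """Return the fraction of pixels that agree between two equal-sized binary images."""
    total = len(glyph) * len(glyph[0])
    agree = sum(g == t
                for glyph_row, template_row in zip(glyph, template)
                for g, t in zip(glyph_row, template_row))
    return agree / total


def recognize(glyph, templates):
    """Return the label of the template most similar to the glyph."""
    return max(templates, key=lambda label: similarity(glyph, templates[label]))


# Toy templates; a real system would store one normalized image per character.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing

print(recognize(noisy_l, templates))  # L
```

Pixel-wise comparison only works when the extracted character has the same size and alignment as the templates, which is consistent with the requirement that the input image be correctly aligned.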