the paperless office
admin on November 21, 2010 in Archiving, OCR Information, PDF
PDF stands for “Portable Document Format”. The name describes what it is: a format in which documents are portable, in their exact form, from one computer to any other. The standardization behind PDF is what keeps files uniform across different readers, computers, and operating systems. By contrast, with the commonly used .doc and .docx files, formatting and other information may be changed or lost when the file is opened in a different word processor. Because a PDF's layout is fixed, it can be viewed reliably wherever it is opened.
Where does OCR come in?
Optical Character Recognition (OCR) is a technology that creates digital text documents from images containing text. The process usually begins with scanning a physical document and saving the output as an image file. The OCR software parses this image and compares the shapes it finds against known fonts and language scripts. The precise algorithms used for the comparison are generally complicated and frequently proprietary. When sufficiently close matches are found, the matched shapes are identified as text strings and marked as such. OCR technology allows large amounts of paper to be scanned quickly and easily into editable formats that can be searched, printed, or emailed, saving considerably on costs.
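The matching step described above can be illustrated with a toy sketch. This is not how any particular OCR product works: the 3x3 bitmaps in `TEMPLATES`, the `similarity` measure, and the 0.8 threshold are all assumptions made up for illustration, and real engines use far richer models.

```python
from typing import Optional, Tuple

Bitmap = Tuple[str, ...]

# Hypothetical 3x3 glyph templates ("1" = ink, "0" = blank).
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def similarity(a: Bitmap, b: Bitmap) -> float:
    """Return the fraction of pixels that agree between two same-size bitmaps."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in pairs) / len(pairs)


def recognize(glyph: Bitmap, threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching character, or None if no match is close enough."""
    best_char, best_score = None, 0.0
    for char, template in TEMPLATES.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= threshold else None


if __name__ == "__main__":
    smudged_l = ("100", "100", "110")  # an "L" with one pixel missing
    print(recognize(smudged_l))
```

Only glyphs that clear the threshold are marked as text; anything else is left unrecognized rather than guessed.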
Tags: ocr, pdf, savings
admin on October 4, 2010 in Archiving, Data Mining, Document Management
Many newspapers have begun to publish their old editions and articles online, most notably the New York Times, although progress is slow. Some archives are digitized with OCR and some without. The New York Times is currently working with reCAPTCHA on digitizing its old editions; this is not pure OCR but a sort of human-assisted OCR system, in which words of low confidence are checked manually. Currently, only the article headlines are searchable, and the content of the older articles is readable only in image form. Other newspapers are also working on digitizing their archives, although those with smaller budgets and less name recognition often have more limited options.
The slow progress of newspaper digitization is largely due to the difficulty modern OCR software has in accurately reading extremely old printed text, especially in newspapers, where ink is known to blur and smudge easily. Although a good suite of OCR software should be able to get the vast majority of the words and phrases correct, its output is still far from perfect, and is not good enough for a reputable corporation to publish online.
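The human-assisted approach mentioned above, where low-confidence words are checked manually, can be sketched roughly as follows. The function name `split_by_confidence`, the (word, confidence) input format, and the 0.9 threshold are assumptions for illustration, not details of any newspaper's actual pipeline.

```python
def split_by_confidence(words, threshold=0.9):
    """Split (text, confidence) pairs into accepted words and words needing review."""
    accepted, needs_review = [], []
    for text, confidence in words:
        if confidence >= threshold:
            accepted.append(text)
        else:
            needs_review.append(text)  # send to a human checker
    return accepted, needs_review


if __name__ == "__main__":
    # Hypothetical OCR output for a smudged headline.
    ocr_output = [("City", 0.98), ("Counc1l", 0.41), ("Meets", 0.95)]
    accepted, needs_review = split_by_confidence(ocr_output)
    print("accepted:", accepted)
    print("needs review:", needs_review)
```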
Tags: archive, digitization, new york times, newspaper, ocr
admin on September 17, 2010 in OCR Information
Digital pens are tools that capture the user's handwriting and digitize it; the data can then be uploaded to a computer and saved, edited, or displayed there. On-line OCR can be used to digitize the text itself, making digital pens an extremely useful and intuitive tool for digitizing information.
The pens are generally larger and have more features than the styluses commonly used with PDAs. The extra size is typically needed to accommodate the electronics, wiring, and communication hardware (frequently Bluetooth or IR). Some digital pens are exclusively digital, while other models also serve as normal pens.
Tags: digital pen, ocr, on-line ocr
admin on September 7, 2010 in Document Management, Industries
OCR can be used to help automate the mailrooms of large corporations. In fact, the USPS first began to adopt OCR technology as early as the 1960s. Even though the OCR technology of the time was rather error-prone, it still added a good deal of efficiency to the processing of mail.
Despite the growth of digital communication technology, paper mail volumes continue to grow, largely because of business growth and an increasingly mobile workforce. Even relatively small companies often need to process tens of thousands of pieces of mail monthly. As a result, corporate mailrooms require a high degree of automation in order to keep up. Today’s organizations demand instant and accurate information and spend hundreds of billions of dollars annually converting the information they receive into useful data.
A digital mailroom can open all paper documents, scan them, determine the proper recipient of each piece of mail, and send that person an email. This reduces the cost of moving paper around within the corporation, and allows for easy archiving of all received mail, whether in a digital or physical format. The small percentage of mail for which the software cannot determine the proper recipient can be checked manually and delivered accordingly. The automatic processing and delivery of over 90% of the mail can greatly reduce mailroom costs at even relatively small corporations.
Importantly, if more than one person at a corporation needs to see some information received by mail, the automatic scanning and processing allows multiple recipients to receive what entered the corporate headquarters as a single piece of paper. This automation further saves time and money, and reduces the risk of important documents being lost while they are still needed.
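The routing step of a digital mailroom could look something like the minimal sketch below. Everything here is an assumption for illustration: the `DIRECTORY` of names and addresses is made up, recipients are found by simple substring matching on the OCR text, and `MANUAL_REVIEW` is a hypothetical queue for mail the software cannot place.

```python
from typing import List

# Hypothetical staff directory: name as it might appear on a letter -> email.
DIRECTORY = {
    "jane doe": "jane.doe@example.com",
    "john smith": "john.smith@example.com",
    "accounts payable": "ap@example.com",
}

MANUAL_REVIEW = "mailroom-review@example.com"  # fallback queue (assumed)


def route_mail(ocr_text: str) -> List[str]:
    """Return every address whose directory name appears in the scanned text.

    If no name is recognized, route the item to manual review instead.
    """
    text = ocr_text.lower()
    recipients = [email for name, email in DIRECTORY.items() if name in text]
    return recipients or [MANUAL_REVIEW]


if __name__ == "__main__":
    letter = "Attn: Jane Doe and Accounts Payable. Invoice enclosed."
    print(route_mail(letter))
    print(route_mail("To whom it may concern"))
```

Returning a list rather than a single address is what lets one scanned page reach several recipients at once.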
Tags: automation, digital, documents, invoices, mailroom
OCR in Healthcare
admin on August 30, 2010 in Data Mining, Industries
OCR is expanding rapidly into the field of healthcare. This has been partly spurred by recent federal programs and laws aimed at making all health records available to doctors online, with the hope of reducing the cost of paperwork, and increasing accessibility of information. It is widely believed that billions of dollars every year can be saved with the implementation of a robust electronic system.
ICR (intelligent character recognition, used for handwriting) is not yet sufficiently advanced to read handwriting consistently, especially for sensitive documents such as prescriptions, where a misplaced decimal can be fatal. The problem is compounded because prescriptions are often written by doctors who are in a rush, and are difficult even for people to read. However, OCR technology is ready for much of the form and invoice processing necessary for the resolution of insurance claims and other payments.
Invoices can be scanned by the OCR software and matched against an index of the most common invoices received, so that over 95% of invoices can be processed automatically. The remaining few documents will often have to be processed by hand, at least for the first few invoices from each less common vendor. Machine learning technology can be implemented to “teach” the software new templates as they are processed manually, saving time the next time that vendor sends an invoice.
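The template index and the “teaching” step can be sketched as a small lookup that grows as invoices are handled by hand. This is a deliberately simplified stand-in, not real machine learning: the `InvoiceProcessor` class, its method names, and the idea of a template as a list of field names are all assumptions for illustration.

```python
class InvoiceProcessor:
    """Toy invoice index: known vendors are processed automatically."""

    def __init__(self):
        # vendor name -> list of field names to pull from that vendor's invoices
        self.templates = {}

    def process(self, vendor, scanned_fields):
        """Return the extracted fields, or None if the invoice needs manual handling."""
        template = self.templates.get(vendor)
        if template is None:
            return None  # unknown vendor: route to a person
        return {name: scanned_fields.get(name) for name in template}

    def learn(self, vendor, field_names):
        """Record a template after an invoice has been processed manually."""
        self.templates[vendor] = list(field_names)


if __name__ == "__main__":
    processor = InvoiceProcessor()
    invoice = {"total": "120.00", "date": "2010-08-30", "logo": "(image)"}

    print(processor.process("Acme Medical", invoice))  # first time: manual
    processor.learn("Acme Medical", ["total", "date"])
    print(processor.process("Acme Medical", invoice))  # next time: automatic
```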