1. Assess page-level quality of documents with Intelligent Document Quality (IDQ)
With Document AI OCR, Google Cloud customers and partners can programmatically extract key document characteristics – word frequency distributions, relative positioning of line items, dominant language of the input document, etc. – as critical inputs to their downstream business logic. Today, we are adding another important document assessment signal to this toolbox: Intelligent Document Quality (IDQ) scores.
IDQ provides page-level quality metrics in the following eight dimensions:
- Level of optical noise
- Presence of smaller-than-usual fonts
- Document getting cut off
- Text spans getting cut off
- Glares due to lighting conditions
Being able to discern the optical quality of documents helps assess which documents must be processed differently based on their quality, making the overall document processing pipeline more efficient. For example, Gary Lewis, Managing Director of lending and deposit solutions at Jack Henry, noted, “Google’s Document AI technology, enriched with Intelligent Document Quality (IDQ) signals, will help businesses to automate the data capture of invoices and payments when sending to our factoring customers for purchasing. This creates internal efficiencies, reduces risk for the factor/lender, and gets financing into the hands of cash-constrained businesses quickly.”
Overall, document quality metrics pave the way for more intelligent routing of documents for downstream analytics. The reference workflow below uses document quality scores to split and classify documents before sending them to either the pre-built Form Parser (in the case of high document quality) or a Custom Document Extractor trained specifically on lower-quality datasets.
2. Process digital PDF documents with confidence with built-in digital PDF support
The PDF format is popular in various business applications such as procurement (invoices, purchase orders), lending (W-2 forms, paystubs), and contracts (leasing or mortgage agreements). PDF documents can be image-based (e.g., a scanned driver’s license) or digital, where you can hover over, highlight, and copy/paste embedded text in a PDF document the same way as you interact with a text file such as Google Doc or Microsoft Word.
We are happy to announce digital PDF support in Document AI OCR. The digital PDF feature extracts text and symbols exactly as they appear in the source documents, therefore making our OCR engine highly performant in complex visual scenarios such as rotated texts, extreme font sizes and/or styles, or partially hidden text.
Discussing the importance and prevalence of PDF documents in banking and finance (e.g., bank statements, mortgage agreements, etc.), Ritesh Biswas, Director, Google Cloud Practice at PwC, said, “The Document AI OCR solution from Google Cloud, especially its support for digital PDF input formats, has enabled PwC to bring digital transformation to the global financial services industry.”
3. “Freeze” model characteristics with OCR versioning
As a fully managed cloud-based service, Document AI OCR regularly upgrades the underlying AI/ML models to maintain its world-class accuracy across over 200 languages and scripts. These model upgrades, while providing new features and enhancements, may occasionally lead to changes in OCR behavior compared to an earlier version.
Today, we are launching OCR versioning, which enables users to pin to a historical OCR model behavior. The “frozen” model versions, in turn, give our customers and partners peace of mind, ensuring consistent OCR behavior. For industries with rigorous compliance requirements, this update also helps maintain the same model version, thus minimizing the need and effort to recertify stacks between releases. According to Jaga Kathirvel, Senior Principal Architect at Mr. Cooper, “Having consistent OCR behavior is mission-critical to our business workflows. We value Google Cloud’s OCR versioning capability that enables our products to pin to a specific OCR version for an extended period of time.”
With OCR versioning, you have the full flexibility to select the versioning option that best fits your business needs.
Getting Started on Document AI OCR
Learn more about the new OCR features and tutorials in the Document AI Documentation or try it directly in your browser (no coding required). For more details on what’s new with Document AI, don’t forget to check out our breakout session from Google Cloud Next 2022.
By Steve Z., Product Manager | Devaki Kulkarni, Product Manager
Source Google Cloud