Let’s face it: in a globalized world where demand is more digital than ever, you need to scale and reach your customers right where they are. Translation is a critical piece of that, whether you’re translating a website into multiple languages or releasing a document, a piece of software, or training materials.
Manual translation does not scale, which is why machine translation, powered by machine learning (ML), is becoming more important to our customers. Machine translation has historically been challenging because of the sheer volume and breadth of content that can add value when translated into multiple languages. Companies acquire and share content in many languages and formats, and scaling translation to meet needs is a tall order due to multiple document formats, integrations with optical character recognition (OCR), and the need to correct for domain-specific terminology.
Our goal is to simplify translation services, while enabling flexibility and control for our customers’ unique needs across industries. Read on to learn more about recent features and updates.
Formatting matters: Document Translation is now GA
In many cases, the layout of a document dictates how it should be interpreted: readers navigate text and discern meaning based on formatting, like bold or italicized text, or markup for headers, paragraphs, and columns. Previously, to automate translation of documents, text needed to be separated from these layout attributes, meaning the document’s structure was either lost or needed to be recreated later in the developer pipeline, after the text had been translated. This required translation teams to do a lot of extra work and maintain a lot of additional code. Those steps are now unnecessary: formatting can be retained throughout the translation process, handled directly by Translation API Advanced.
This feature lets customers translate documents across 100+ languages and supports document types such as DOCX, PPTX, XLSX, and PDF while preserving document formatting.
And if your needs go beyond Document Translation, we can help you translate audio as well. For real-time streaming translation, check out the Media Translation API, and for offline transcription translation, combine the Translation API with the Video Intelligence API.
Real-time translation when you need it, batch when you don’t
One of the biggest differentiators for Translation API Advanced’s document translation capabilities is the ability to do real-time, synchronous processing for a single file.
For example, if you are translating a business document such as HR documentation, online (synchronous) translation offers flexibility for smaller files and returns results faster. You can easily integrate our APIs with mobile or browser applications via REST or gRPC, with instant access to 100+ language pairs so that content can be understood in any supported language.
Meanwhile, batch translation allows customers to translate multiple files into multiple languages in a single request. For each request, customers can send up to 100 files with a total content size of up to 1 GB or 100 million Unicode codepoints, whichever limit is hit first.
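To make the batch limits concrete, here is a minimal Python sketch that groups a list of files into requests that each stay under the three limits stated above. It does not call the Translation API. The `SourceFile` type and the `plan_batches` helper are hypothetical names introduced for illustration, and treating "1 GB" as 1,000,000,000 bytes is an assumption.

```python
from dataclasses import dataclass

# Per-request limits as stated in this post.
MAX_FILES = 100
MAX_TOTAL_BYTES = 1_000_000_000       # assumption: "1 GB" read as decimal gigabytes
MAX_TOTAL_CODEPOINTS = 100_000_000    # 100 million Unicode codepoints


@dataclass
class SourceFile:
    """Hypothetical description of one file queued for batch translation."""
    name: str
    size_bytes: int
    codepoints: int


def plan_batches(files):
    """Greedily group files, in order, so that no group exceeds any limit."""
    batches, current = [], []
    size = points = 0
    for f in files:
        if f.size_bytes > MAX_TOTAL_BYTES or f.codepoints > MAX_TOTAL_CODEPOINTS:
            raise ValueError(f"{f.name} exceeds the per-request limits on its own")
        # "Whichever limit is hit first": any one of the three closes the batch.
        over = (
            len(current) + 1 > MAX_FILES
            or size + f.size_bytes > MAX_TOTAL_BYTES
            or points + f.codepoints > MAX_TOTAL_CODEPOINTS
        )
        if over:
            batches.append(current)
            current, size, points = [], 0, 0
        current.append(f)
        size += f.size_bytes
        points += f.codepoints
    if current:
        batches.append(current)
    return batches
```

For example, 250 small files would be split into three requests because of the file-count limit, while a handful of very large files might be split earlier by the size or codepoint limit.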
State of the Art (SOTA) accuracy, with flexibility for customization
In order to achieve the highest level of accuracy for your translation, we now support multiple options:
- Use Google’s SOTA translation models: Each year, Google invests heavily to improve the quality of our translations across Apps, Cloud APIs, and Chrome, as well as to enable multilanguage answers in Search. A popular metric for automatic quality evaluation of machine translation systems is the BLEU score, which is based on the similarity between a machine translation and reference translations that were produced by people. While we push out incremental improvements for individual models on a monthly cadence, there are also times when we make significant leaps. In the releases since 2019, we have improved our BLEU score by 5 points on average across 100+ languages and by 7 points on low-resource languages.
- Leverage glossaries for specific terms and phrases: Glossary is our terminology control feature. It allows you to import source content to define preferred translations, such as product names or department names. Then, when calling the glossary in the API request, your preferred translations will be enforced. This will work for words as well as phrase translation.
- Pick a pretrained model with model selection: If you create custom models for machine translation, we don’t think you should have to maintain multiple client libraries and multiple APIs in order to use the best model for your needs. Translation API Advanced now supports Model Selection. Pick your pretrained model or pick your custom ML model built on AutoML for any language pair you’ve created, and use the same API and the same client library.
- Build custom translation models with AutoML: AutoML Translation is a suite of ML products that enable you to build high quality models for your own use case or data, with limited-to-no ML expertise or coding required. Bring your past human-validated translations to improve translation specificity for your domain.
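To give a feel for the BLEU metric mentioned in the first option above, here is a simplified sentence-level sketch in Python. It is not how Google evaluates its models: production evaluations are computed over whole test sets with standardized tokenization, and this version uses plain whitespace tokenization and no smoothing. The function names are illustrative. The result is in the range 0.0 to 1.0; multiply by 100 to get the point scale used when people quote BLEU gains.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Simplified, unsmoothed sentence-level BLEU in the range 0.0 to 1.0."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand or not refs:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        # Geometric mean of the n-gram precisions, accumulated in log space.
        log_precision_sum += math.log(clipped / total) / max_n

    # Brevity penalty: compare against the reference length closest to the candidate.
    ref_len = min((len(r) for r in refs),
                  key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(log_precision_sum)
```

A candidate that matches a reference word for word scores 1.0, and one that shares no words with any reference scores 0.0. Because this sketch is unsmoothed, short sentences with no matching 4-gram also score 0.0, which is one reason corpus-level scores are normally reported instead.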
Keep localization local with Regional Endpoints
If you are a customer operating in the EU, we recently launched an endpoint specifically for EU regionalization. This is a configurable endpoint that lets customers store customer data and perform machine translation processing of it only in the EU multi-region. For now, it supports only our pretrained translation models and glossaries, but batch translation will be coming soon.
How Eli Lilly uses Cloud Translation to translate content globally
Historically, translations at Eli Lilly have been complicated: numerous translation vendors have been needed for different languages and organizations, all with their own processes and expectations. On top of that, translations have been costly and slow.
To solve this, Eli Lilly took a codified approach to enable users and systems to spend less time and resources to safely generate quality translations.
Learn more, and even catch a demo, from Thomas Griffin, Translation Tech Lead & Global Regulatory Architect for Eli Lilly.
By Sarah Weldon, Product Manager for Google Cloud Translation
Source: Google Cloud Blog