Unveiling A New Visual User Interface For Google Cloud’s Speech-to-Text API

February 9, 2022

At Google Cloud, we’re committed to making artificial intelligence (AI) accessible to everyone and easier to harness for new use cases. That’s why we’re excited to announce the general availability of our intuitive, new visual user interface for Google Cloud’s Speech-to-Text (STT) API, right in Google Cloud Console, which makes the API much simpler and easier for developers to use.

The STT API lets developers convert speech into text by leveraging Google’s years of research in automatic speech recognition and transcription technology. As advancements in AI continue to bring speech to new interfaces and devices, the STT API helps developers add speech functionality to their applications in order to better meet consumer demands.

The STT API covers a wide variety of use cases, from dictation and short commands to captioning and subtitles. Getting the most out of STT, however, can be a complicated process. Achieving the highest accuracy on any AI use case requires careful testing and tuning.

Previously, developers building on the STT API had to do this work manually by carefully experimenting with our API. Just to get started, developers needed familiarity with GCP integration concepts and had to either build their own tools or manage various scripts and API calls to put the API documentation into practice. This work was cumbersome and time-consuming, and it made measuring, customizing, and improving models even more difficult.

Today’s announcement significantly simplifies the process, facilitating iteration and integration of models into developers’ applications by letting developers perform every API function from within the Google Cloud Console. These tools will make it easier for developers to integrate the STT API with their products or services. This update also gives developers the ability to manage and quickly iterate on their STT model customizations with Model Adaptation.

Model Adaptation allows developers to customize STT specifically for their domains or use cases. Developers can maintain lists of words and weights that will be applied to either every request or just single requests, depending on their needs. Model adaptations are reusable and composable, so once developers have seen good results in the STT Cloud Console, they can deploy to their entire solution.
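To make the idea of "lists of words and weights" concrete, here is a minimal Python sketch that turns a phrase-to-weight mapping into a JSON request body for a single recognition request. It sends nothing over the network. The field names (`speechContexts`, `phrases`, `boost`, `languageCode`) follow the v1 REST request format as we understand it and should be checked against the current API reference. The helper name `build_recognize_request`, the sample vocabulary, and the bucket path are hypothetical and exist only for illustration.

```python
import json


def build_recognize_request(audio_uri, language_code, phrase_weights):
    """Build a JSON-serializable body for one recognition request.

    phrase_weights maps each phrase to a weight (called "boost" here).
    Nothing is sent; this only assembles the payload.
    """
    # Each speech context carries a single boost value, so phrases that
    # share a weight are grouped into the same context.
    by_boost = {}
    for phrase, boost in phrase_weights.items():
        by_boost.setdefault(float(boost), []).append(phrase)

    speech_contexts = [
        {"phrases": sorted(phrases), "boost": boost}
        for boost, phrases in sorted(by_boost.items())
    ]

    return {
        "config": {
            "languageCode": language_code,
            # Assumed field name for per-request word lists and weights.
            "speechContexts": speech_contexts,
        },
        "audio": {"uri": audio_uri},
    }


# Hypothetical domain vocabulary, reusable across many requests.
MEDICAL_TERMS = {"metformin": 15.0, "lisinopril": 15.0, "telehealth": 5.0}

if __name__ == "__main__":
    body = build_recognize_request(
        "gs://example-bucket/visit.flac", "en-US", MEDICAL_TERMS
    )
    print(json.dumps(body, indent=2))
```

Because the vocabulary lives in one plain mapping, the same list can be applied to every request or merged with a request-specific list before building the body, which mirrors the reusable, composable behavior described above.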

The Speech-to-Text Cloud Console and Model Adaptation API are available now in all Google Cloud regions and languages and are accessible to all GCP users at no cost beyond that of the underlying API usage. The STT API supports over 70 languages in 120 different local variants. If you’re a developer looking for an easy-to-use, easy-to-integrate, and high-quality STT experience, sign up for our free trial and try our new interface on your own datasets today!

By Calum Barnes, Product Manager, Speaker ID, Google Cloud
Source: Google Cloud