Be part of our every day and weekly newsletters for the most recent updates and unique content material on industry-leading AI protection. Study Extra
Gladia, an AI transcription and audio intelligence supplier, has raised $16 million in funding.
The Paris, France-based firm will use the funding to develop an end-to-end audio infrastructure – beginning with a brand new real-time audio transcription and analytics engine – enabling voice-first platforms to ship extra worth to their customers throughout borders with cutting-edge AI.
It’s a problem to rivals reminiscent of Otter.ai and Fireflies.ai, in addition to different AI-based providers that transcribe voice conversations to textual content. In an interview with VentureBeat, CEO Jean-Louis Quéguiner defined to me why he began the corporate.
“As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner mentioned. “That’s why I founded the company.”
I acquired a demo of the AI transcription, and it labored in actual time as Quéguiner spoke English together with his heavy French accent. I’m used to providers like Otter getting a whole lot of phrases flawed in a transcription, however within the first web page of outcomes from Gladia, I noticed no errors. He additionally confirmed how he may communicate two completely different languages and the system may shift from one language to a different as wanted.
XAnge led the spherical, with participation by Illuminate Monetary, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.
Based in 2022, Gladia has now raised a complete of $20.3 million, with earlier seed investments headed by New Wave, Sequoia Capital (as a part of the First Sequoia Arc program), Cocoa, and GFC. Gladia just lately was chosen to take part within the AWS generative AI accelerator program.
“Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” mentioned Alexis du Peloux, accomplice at XAnge, in an announcement. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”
Given that almost all speech recognition fashions right this moment are educated predominantly on English audio knowledge and are subsequently inherently biased, Gladia prioritized constructing the primary real-time product that’s actually multilingual.
The brand new fine-tuned engine delivers superior real-time transcription in over 100 languages, together with enhanced help for accents and the distinctive skill to adapt to completely different languages on the fly.
Gladia’s new engine is exclusive in its skill to extract insights from a name—just like the caller’s sentiment, key data, and dialog abstract—in real-time. This implies it takes lower than a second to generate each transcript and insights from a name or assembly utilizing Gladia.
New real-time AI transcription
Constructing an correct, low-latency, and multilingual engine in-house is a fancy and resource-intensive activity. It requires in depth experience in language understanding, real-time knowledge dealing with, with steady optimization and upkeep. Actual-time fashions require extra computing energy and should battle to supply correct output instantly because of restricted context.
Gladia’s new product permits firms to bypass these challenges. The actual-time speech-to-text engine boasts an industry-leading latency of underneath 300 milliseconds with out compromising accuracy, whatever the language, geography, or tech stack used.
“Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” mentioned Jonathan Soto, CTO of Gladia, in an announcement. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”
What’s forward
The corporate’s first async transcription and audio intelligence API launched in June 2023 and was primarily based on a proprietary model of Whisper ASR.
It quickly gained traction within the enterprise market, notably with assembly recorders and note-taking assistants. The API is now adopted by over 600 clients all over the world, together with Consideration, Circleback, Methodology Monetary, Recall, Sana, and VEED.IO and has greater than 70,000 customers.
“Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platform, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner mentioned. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”
Gladia will use the brand new capital to advance its R&D efforts and shortly deliver to market a one-stop AI toolkit for audio and increase its product providing with extra à la carte fashions—together with massive language fashions (LLMs) and retrieval-augmented technology (RAG). With a number of design companions within the contact-center-as-a-service (CCaaS) section, the corporate is at present piloting an agent-assist resolution powered by Gladia’s real-time AI engine. Moreover, Gladia will proceed to increase its expertise base because it prepares for worldwide growth.
“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner mentioned. “You can start with the language and switch to another.”
He went on to point out me that he may begin a name in English and provoke the transcription. Then he spoke French phrases, and the mannequin accurately translated it in French.
“Keep in mind that [others] are not real time right now, and this one is real time,” he mentioned. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.”
The service has an AI summarizer, and it’ll have new non-compulsory options within the coming months. Quéguiner mentioned that his service can even get acronyms proper and detect the swap to a different language.
“The mannequin we use is similar to LLMs (massive language fashions). It has no code decoder structure, which isn’t the case for many of the fashions that you just’ve seen with Fireflies, for example.
The market contains “meeting recorders,” Quéguiner mentioned. The outcomes might be handed on to real-time insights, which will help individuals like gross sales leads shut offers quicker.
The corporate additionally works with Name Facilities, giving them 30% quicker time to completion when they’re on the cellphone thanks to higher accuracy. The corporate will cost a flat price reminiscent of a per-hour pricing.