Convosense

Solving Conversational Voice Recognition

Speech recognition is not a solved problem.

Every application will make some errors.
Key: design the system so that some aspects of the task are easy.

Voice Dialing

Siri

Amazon Echo

Meeting Transcription

Small vocabulary vs. unlimited vocabulary

Quiet environment vs. noisy environment

Near microphone vs. distant microphone

Deliberate, careful speech vs. spontaneous speech

Fixed speaker vs. multiple speakers

Mission

Many of the biggest problems facing humanity today, like curing diseases or addressing climate change, would be vastly easier to solve with the help of AI. At Convosense we believe that we can channel this revolutionary technology to radically improve human communication and collaboration.

By making human discourse fully machine-readable, we can open a vast landscape of business, education, and consumer opportunities.

Conversational Speech Recognition Is Not A Solved Problem.
It Will Be Soon ...

Massive Growth Market

Starting from a base of $249 million in 2015, global speech and voice biometrics revenue will reach $5.1 billion by 2024, with cumulative revenue for the 10-year period totaling $19 billion at a compound annual growth rate (CAGR) of 40%. Enterprise growth markets are expected to include call centers, government IT, enterprise IT, and healthcare.
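As a quick sanity check on the figures above, the growth rate implied by the start and end revenue can be recomputed. This is a minimal sketch that assumes nine compounding periods between 2015 and 2024; the original forecast may count periods differently, and the helper name `cagr` is ours, not the forecaster's.

```python
# Figures taken from the paragraph above.
start_revenue = 249e6   # 2015 base, in US dollars
end_revenue = 5.1e9     # 2024 forecast, in US dollars
periods = 2024 - 2015   # assumption: nine annual compounding periods


def cagr(start: float, end: float, n_periods: int) -> float:
    """Compound annual growth rate: the constant yearly rate that turns start into end."""
    return (end / start) ** (1 / n_periods) - 1


rate = cagr(start_revenue, end_revenue, periods)
print(f"Implied CAGR: {rate:.1%}")
```

Under this assumption the result comes out close to the 40% quoted above.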

NIST STT Benchmark Test History - May 09

Via speakerphone:
80% Word Error Rate

Via individual headsets:
50% Word Error Rate
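For readers unfamiliar with the metric, Word Error Rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below shows one common way to compute it with a word-level edit distance. The function name and the example sentences are illustrative assumptions, not part of the NIST benchmark tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("the" -> "a") and one deletion ("for").
reference = "please schedule the meeting for monday morning"
hypothesis = "please schedule a meeting monday morning"
print(f"WER: {word_error_rate(reference, hypothesis):.0%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.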

Solving The Problem Through Deep Learning AI.

The Tipping Point Facilitated by Deep Learning.

Speech Recognition (Google Now): 30% reduction in Word Error Rate for English. Biggest single improvement in 20 years of speech research.

Voicemail transcriptions (Google Voice / Project Fi): Using a long short-term memory deep recurrent neural network, transcription errors were cut by 49%.

How To Reduce The Error Rate To Ca. 20%?
Three broad areas for improvement:

Acoustic Modeling.

Captures how various language sounds appear in audio.

Language Modeling.

Captures frequency information about various words and phrases.

Engine/Decoder.

Combines the audio with the two models above to produce the most likely word sequence.
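To make the interplay of the three components concrete, here is a toy sketch of how a decoder might pick between two candidate transcripts by adding an acoustic score and a weighted language-model score. All numbers, the candidate phrases, the `decode` function, and the `lm_weight` parameter are hypothetical and chosen only for illustration; a production decoder searches a far larger space of hypotheses and is not shown here.

```python
import math

# Hypothetical acoustic-model scores: log-probability of the audio given each
# candidate transcript. The two phrases sound alike, so the scores are close.
acoustic_log_prob = {
    "recognize speech": math.log(0.40),
    "wreck a nice beach": math.log(0.45),
}

# Hypothetical language-model scores: log-probability of each word sequence,
# reflecting how often such phrases occur in text.
language_log_prob = {
    "recognize speech": math.log(0.01),
    "wreck a nice beach": math.log(0.0001),
}


def decode(candidates, lm_weight=1.0):
    """Return the candidate with the highest combined log score."""
    def score(text):
        return acoustic_log_prob[text] + lm_weight * language_log_prob[text]
    return max(candidates, key=score)


candidates = list(acoustic_log_prob)
print(decode(candidates))                 # language model breaks the acoustic near-tie
print(decode(candidates, lm_weight=0.0))  # acoustic score alone decides
```

In this toy setup the acoustic scores slightly favor the wrong phrase, and the language model corrects the choice. Improving any of the three components changes which hypothesis wins.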

Contact Us

contact@convosense.com
+1 866 369 6329

603 Greenwich Street, Suite 101, New York, NY 10014
Check on Google Maps →