At Speechmatics, our relentless pursuit of innovation ensures unparalleled accuracy and broad language inclusivity, redefining industry standards.
Self-Supervised Learning
Expanding accuracy through innovation
Enabling autonomous learning to enhance our understanding
At Speechmatics, self-supervised learning (SSL) serves as a transformative approach in training speech models, harnessing unlabeled data to enhance our speech recognition systems.
This technique allows us to autonomously identify patterns in vast amounts of data, significantly expanding the diversity of speech variations our models can learn from and improving accuracy across multiple languages.
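The intuition behind a self-supervised objective such as masked prediction can be sketched in a few lines. The snippet below is an illustrative toy under our own assumptions, not Speechmatics' actual training code: the function name, the trivial mean-of-visible-frames "predictor", and the random stand-in audio features are all invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "audio" features: 100 frames of 8-dimensional vectors.
features = rng.normal(size=(100, 8))

def masked_prediction_loss(features, mask_rate=0.15):
    """Hide a fraction of frames and score how well a deliberately
    trivial predictor (the mean of the visible frames) reconstructs
    them. A real SSL model would use a learned network instead."""
    n = features.shape[0]
    mask = rng.random(n) < mask_rate          # which frames to hide
    visible = features[~mask]
    prediction = visible.mean(axis=0)         # stand-in for a model
    target = features[mask]
    return float(((target - prediction) ** 2).mean())

loss = masked_prediction_loss(features)
```

Because the target comes from the data itself, no human labels are needed; this is what lets such objectives scale to vast amounts of unlabelled speech.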
Tirelessly pushing speech technology forward...
Throughout the years, Speechmatics has remained at the forefront of speech recognition research and innovation.
We are consistently pushing the boundaries of what's possible in speech-to-text technology.
2015
Our groundbreaking research, detailed in the paper Scaling Recurrent Neural Network Language Models (ICASSP 2015), showed that recurrent network language models deliver increasingly large gains in speech recognition accuracy as they scale. By training models across a wide range of sizes, we established techniques for building and training the most powerful language models of the time, leading to significant enhancements in their accuracy and capabilities.
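Scaling behaviour of this kind is often modelled as a power law, L(N) = a · N^(-b), which is linear in log space and can be fitted by least squares. The sizes and losses below are made-up numbers for illustration only, not results from any real experiment:

```python
import numpy as np

# Hypothetical (model size, validation loss) pairs; illustrative only.
sizes = np.array([1e6, 1e7, 1e8, 1e9])
losses = np.array([5.0, 4.1, 3.4, 2.8])

# A power law L(N) = a * N**(-b) gives log L = log a - b * log N,
# so fit a straight line in log-log space.
slope, log_a = np.polyfit(np.log(sizes), np.log(losses), 1)
b = -slope                 # positive: loss falls as size grows
a = np.exp(log_a)
```

A fit like this is what makes scaling "predictable": once a and b are estimated from small models, the loss of a much larger model can be extrapolated before it is trained.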
Our published research
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
James Chua, Edward Rees, Hunar Batra, Samuel R. Bowman, Julian Michael, Ethan Perez, Miles Turpin. March 8, 2024.
While chain-of-thought prompting (CoT) has the potential to improve the explainability of language model reasoning, it can systematically misrepresent the factors influencing models' behavior, for example by rationalizing answers in line with a user's opinion without mentioning this bias. To mitigate this biased reasoning problem, we introduce bias-augmented consistency training (BCT), an unsupervised fine-tuning scheme that trains models to give consistent reasoning across prompts with and without biasing features.
Read the paper in full
Debating with More Persuasive LLMs Leads to More Truthful Answers
Akbir Khan, John Hughes, Dan Valentine, Laura Ruis, Kshitij Sachan, Ansh Radhakrishnan, Edward Grefenstette, Samuel R. Bowman, Tim Rocktäschel, Ethan Perez. February 9, 2024.
Common methods for aligning large language models (LLMs) with desired behaviour heavily rely on human-labelled data. However, as models grow increasingly sophisticated, they will surpass human expertise, and the role of human evaluation will evolve into non-experts overseeing experts. In anticipation of this, we ask: can weaker models assess the correctness of stronger models?
Read the paper in full
Hierarchical Quantized Autoencoders
Will Williams, Sam Ringer, Tom Ash, John Hughes, David MacLeod, Jamie Dougherty. February 19, 2020.
Speechmatics’ paper was submitted and accepted to the most prestigious ML conference – NeurIPS. The paper describes a lossy image compression algorithm based on discrete representation learning, leading to a system that can reconstruct images of high perceptual quality and retain semantically meaningful features despite very high compression rates.
Read the paper in full
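The discretisation step at the heart of such discrete representation learning can be illustrated with a minimal vector-quantisation sketch. This is our own toy example, not the paper's model: the codebook, data, and function below are invented, and a real system would learn the codebook and stack quantisers rather than fix them by hand.

```python
import numpy as np

rng = np.random.default_rng(1)

# A small fixed codebook of 4 "discrete representations" in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest
    codebook entry -- the discretisation at the core of
    VQ-style compression."""
    # Pairwise squared distances, shape (n_vectors, n_codes).
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

vectors = rng.random((6, 2))
codes = quantize(vectors, codebook)   # 6 small integers: the compressed form
reconstruction = codebook[codes]      # lossy decode back to vectors
```

Storing only the integer codes instead of the continuous vectors is what yields the very high compression rates; reconstruction quality then depends on how well the codebook covers the data.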
Texture Bias Of CNNs Limits Few-Shot Classification Performance
Remi Francis, Tom Ash, Will Williams. November 8, 2020.
In this paper, Speechmatics demonstrates that the texture bias of convolutional neural networks, their tendency to classify images by local texture rather than global shape, limits their performance on few-shot classification tasks.
Read the paper in full
The Speechmatics Parallel Corpus Filtering System for WMT18
Tom Ash, Remi Francis, Will Williams. Machine Translation (WMT) October 31 – November 1, 2018.
Speechmatics published this paper at the Workshop on Statistical Machine Translation (WMT) 2018 and presented a translation proof of concept.
Read the paper in full
A Framework for Speech Recognition Benchmarking
Franck Dernoncourt, Trung Bui, Walter Chang. Adobe Research. Interspeech 2018.
At Interspeech 2018 in Hyderabad, Speechmatics was cited as one of the most accurate providers of ASR in several evaluations, including one conducted by Adobe Research. This demonstrated that our continued focus on innovation and new R&D maintains our position in a growing and increasingly challenging field.
Read the paper in full
Scaling Recurrent Neural Network Language Models
W. Williams, N. Prasad, D. Mrva, T. Ash, A.J. Robinson. ICASSP 2015. February 2, 2015.
This is the first paper to show that recurrent network language models scale to give very significant gains in speech recognition. It describes the most powerful models to date and some of the special methods needed to train them.
Read the paper in full
One billion word benchmark for measuring progress in statistical language modeling
C. Chelba, T. Mikolov, M. Schuster, Q. Ge, T. Brants, P. Koehn, A.J. Robinson. Interspeech 2014. December 10, 2013.
This paper, written with Google, presents a standard large benchmark so that progress in language modeling can be measured. Prior to this paper there was no open, freely available corpus large enough to be representative of modern language modeling tasks.
Read the paper in full
Connectionist Speech Recognition of Broadcast News
A.J. Robinson, D. Abberley, D. Kirby, and S. Renals. Proceedings of the European Conference on Speech Technology. volume 3, pages 1267–1270, September 1999.
Here we show that speech recognition can be used to find information in audio in much the same way that web pages can be found with a search engine.
Read the paper in full
Time-First Search for Large Vocabulary Speech Recognition
A.J. Robinson and J. Christie. ICASSP, pages 829–832, 1998.
Here we fundamentally change the main mechanism in speech recognition to make it both faster and more memory efficient (also US patent 5983180).
Read the paper in full
Dynamic Error Propagation Networks
A. J. Robinson. PhD thesis, Cambridge University Engineering Department, February 1989.
This PhD thesis introduces several key concepts of recurrent networks, including a number of novel architectures, the algorithms needed to train them, and applications to speech recognition, coding, and reinforcement learning/game playing.
Read the paper in full
We are actively seeking talented individuals to join our team of ambitious problem-solvers and thought leaders, paving the way for inclusion in speech recognition technology.