Area of investment and support: Speech technology

Recognition, understanding and synthesis of human speech, using a range of techniques and focusing on how systems recognise and generate the sounds of language.

Partners involved: Engineering and Physical Sciences Research Council (EPSRC)

The scope and what we're doing

The strategy for this area recognises the importance of speech technology to data science and the development of intelligent interfaces.

We aim to have a research area which:

addresses the challenges identified for this area, encompassing speech modelling, speech recognition, text-to-speech synthesis and spoken dialogue systems
makes a significant contribution to data science in terms of the large-scale processing and understanding of multi-modal data, including text, audio and video
includes research and training that contribute to the development of intelligent dialogue interfaces, which will serve as ways of communicating between humans and systems
strengthens links between Speech technology and the Artificial intelligence (AI) technologies, Natural language processing (NLP) and Human-computer interaction research areas, reflecting the increased importance of spoken dialogue systems
continues to support research into assistive technologies (for example personalised speech synthesis and recognition of disordered speech) and speech and language therapy technology
includes a greater proportion of early-career researchers, to ensure the area’s longer-term health.

Researchers have the opportunity to play an important role in delivering the objectives of EPSRC priorities for the Information and communication technologies (ICT) theme, especially Data enabled decision making, People at the heart of ICT and Future intelligent technologies.

To maximise the impact of these contributions, Speech technology researchers should ensure effective communication with researchers in other contributing areas, such as AI technologies, NLP, Image and vision computing and Human-computer interaction.

Why we invest in this area

The UK has some of the world’s leading speech technology researchers, who form a small but strong community. This is evidenced by publications in top journals and conferences (identified by the Research Excellence Framework (REF) 2014), and by the publication and maintenance of open-source software and open data used by the international community.

Many research challenges have opened up, including speech modelling, speech recognition, text-to-speech synthesis and spoken dialogue systems. The use of deep neural networks (a machine learning technique) has been driving improvements to speech recognition systems. Mobile technologies (including very small devices without touchscreens) will become the main focus for interactive applications, where expressivity and multi-modality will become key requirements.

There has been growth in research that combines speech technology with natural language processing (NLP), underpinned by artificial intelligence (AI) technologies. UK researchers are well positioned for future growth in this field, given the very strong NLP expertise in this country. This provides the UK with a unique capability in an international context.

The UK’s strength at the interface of speech and language technologies and AI is evidenced by the very significant investment being made in the UK by major industry players (for example Amazon, Google, Facebook, Apple and Bloomberg), who have created or expanded UK-based research facilities and are heavily recruiting UK researchers with PhD or postdoctoral experience in speech technology, NLP and AI.

There is also a large demand among small companies for PhD and postdoctoral-level researchers in speech technology. Given the increase in industrial recruitment, there is a question mark over whether enough students are coming through the system to supply both the academic and the industrial pipeline. There is also a danger that UK academic institutions are being depleted of their postdoctoral workforce, putting at risk the future leadership of the area (which is currently very strong in the UK). This may threaten the UK’s international standing in speech technology.

View evidence sources used to inform our research strategies.

Past projects, outcomes and impact

Visualising our portfolio (VoP) is a tool for users to visually interact with the EPSRC portfolio and data relationships. Find out more about research area connections and funding for Speech technology.

Find previously funded projects on EPSRC’s funding application outcomes Tableau.

Last updated: 6 March 2025