About That OpenAI Audio Chat Demo

Rob Pickering

14 May 2024 — 1 min read

AI in audio conversations

What the OpenAI announcement means for telephony and other audio-only conversational AI: carry on building, things are accelerating.

Yesterday's OpenAI announcement majored on audio and video conversations. There was an impressive speech demo, way ahead of what any of us working in this area can build right now from individual STT/inferencing/TTS components, no matter how hard we tune and optimise to solve the hard problems in this pipeline.

Aside from that demo, the details of how OpenAI are approaching this aren't out yet; in their words, "We hope to bring this modality to a set of trusted testers in the coming weeks."

I guess we will have to wait to see what this looks like, but if they are doing this right, with full-duplex audio streaming into the model, then this will be a transformative step change in building natural conversations on the telephone and in other audio-only contexts.

Other platforms will follow suit. Google would be mad not to be developing better versions of this pipeline: Gemini is already multimodal, and Google know a lot about streaming intent recognition done the old way.

So my advice: keep on building, focus on the applications, and keep the plumbing platform-agnostic, as it is about to get dramatically easier to build authentic conversations.
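As a rough illustration of what platform-agnostic plumbing could look like, here is a minimal Python sketch. Everything in it is hypothetical: `VoiceBackend`, `CascadedBackend`, `NativeAudioBackend` and `handle_turn` are names made up for this example, the stages are stand-in functions rather than real STT, LLM or TTS services, and it models one turn at a time rather than full-duplex streaming. The point is only that the application talks to one interface, so a component pipeline can later be swapped for a native audio model.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class VoiceBackend(Protocol):
    """Anything that turns one chunk of caller audio into reply audio."""

    def respond(self, audio_in: bytes) -> bytes: ...


@dataclass
class CascadedBackend:
    """Separate STT, inference and TTS stages chained together."""

    stt: Callable[[bytes], str]
    llm: Callable[[str], str]
    tts: Callable[[str], bytes]

    def respond(self, audio_in: bytes) -> bytes:
        text = self.stt(audio_in)   # speech to text
        reply = self.llm(text)      # text to text
        return self.tts(reply)      # text to speech


@dataclass
class NativeAudioBackend:
    """Hypothetical audio-in/audio-out model behind a single call."""

    model: Callable[[bytes], bytes]

    def respond(self, audio_in: bytes) -> bytes:
        return self.model(audio_in)


def handle_turn(backend: VoiceBackend, audio_in: bytes) -> bytes:
    """Application logic sees only the VoiceBackend interface."""
    return backend.respond(audio_in)


# Stand-in stages so the sketch runs without any real speech service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


cascaded = CascadedBackend(stt=fake_stt, llm=fake_llm, tts=fake_tts)
native = NativeAudioBackend(model=lambda audio: b"You said: " + audio)

# The same application call works with either backend.
cascaded_reply = handle_turn(cascaded, b"hello")
native_reply = handle_turn(native, b"hello")
```

With this shape, moving from the component pipeline to a native audio model is a change to how the backend is constructed, not to the application code that calls `handle_turn`.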
