Hi,
I’m trying to determine whether this is a known limitation or a bug.
I created a minimal Custom GPT with a single GET Action using the following OpenAPI:
openapi: 3.1.0
info:
  title: Test Voice API
  version: 1.0.0
servers:
  - url: https://httpbin.org
paths:
  /get:
    get:
      operationId: testGet
      responses:
        "200":
          description: OK
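For reference, the request this spec describes can be derived mechanically, which helps rule out the spec itself as the cause. The sketch below is only an illustration: it assumes the spec has been transcribed by hand into a Python dict, and the helper name `request_urls` is hypothetical, not part of any OpenAI or OpenAPI tooling.

```python
# The OpenAPI document from this post, transcribed by hand into a dict.
SPEC = {
    "openapi": "3.1.0",
    "info": {"title": "Test Voice API", "version": "1.0.0"},
    "servers": [{"url": "https://httpbin.org"}],
    "paths": {
        "/get": {
            "get": {
                "operationId": "testGet",
                "responses": {"200": {"description": "OK"}},
            }
        }
    },
}

# Path-item keys that denote HTTP methods in OpenAPI 3.x.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def request_urls(spec):
    """Map each operationId to the (METHOD, URL) it resolves to.

    Hypothetical helper: uses the first entry in `servers` as the base URL.
    """
    base = spec["servers"][0]["url"].rstrip("/")
    result = {}
    for path, item in spec["paths"].items():
        for method, operation in item.items():
            if method in HTTP_METHODS:
                result[operation["operationId"]] = (method.upper(), base + path)
    return result


if __name__ == "__main__":
    for operation_id, (method, url) in request_urls(SPEC).items():
        print(operation_id, method, url)
```

So a successful call to testGet should show up as a plain GET on the /get path of the configured server.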
GPT instructions:
Whenever the user says “test”, always call the action testGet.
Results
Text mode
Advanced Voice Mode (Android app and ChatGPT Web using a microphone)
- The Action is never executed.
- The assistant reports a technical issue.
- No HTTP request reaches the server.
Additional observation:
If I type “continue” immediately after the failed voice interaction, ChatGPT is sometimes able to inspect its previous attempt and reports that it encountered:
InvalidRecipient
Unrecognized recipient: httpbin_org__jit_plugin
This suggests the failure may occur before any HTTP request is sent.
I first observed this with my own FastAPI backend, then reproduced it with the public httpbin.org endpoint, so it doesn’t appear to be API-specific.
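One way to confirm on your own backend that no request arrives is a tiny logging server. This is a minimal sketch using only the Python standard library, not the FastAPI backend mentioned above; the names `LoggingHandler`, `start_logging_server`, `call_get`, and `received` are hypothetical and chosen for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Every request that reaches the server is recorded here as (method, path).
received = []


class LoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        received.append((self.command, self.path))
        body = json.dumps({"ok": True, "path": self.path}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default stderr logging; `received` is the log.
        pass


def start_logging_server(port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), LoggingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_get(base_url):
    """Issue the GET /get request that the testGet operation describes."""
    # Bypass any configured proxy so the local server is hit directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(base_url + "/get", timeout=5) as response:
        return response.status, json.loads(response.read())


if __name__ == "__main__":
    server = start_logging_server()
    base_url = "http://127.0.0.1:%d" % server.server_address[1]
    print(call_get(base_url))
    print("requests seen so far:", received)
    server.shutdown()
```

If `received` stays empty after a voice attempt but fills up after a direct call like the one above, the failure is happening before the HTTP layer. To point a Custom GPT at a server like this, it would presumably need to be exposed at a publicly reachable HTTPS URL first.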
Has anyone successfully executed Custom GPT Actions from Advanced Voice Mode recently?
If yes:
Thanks!
Welcome to the community and thanks for the detailed repro, @Nessper.
Based on the current Voice Mode documentation, this appears to be a limitation rather than a bug in your OpenAPI spec or backend. While Custom GPTs can be used in Voice Mode, Custom GPT Actions are not currently supported there.
That would also explain why no request ever reaches your server. The InvalidRecipient message is interesting, but it likely reflects the internal failure when Voice Mode attempts something it doesn't currently support, rather than an issue with your API.
-Mark G.
Hi,
Thanks for the clarification.
Could you confirm whether Custom GPT Actions in Voice Mode are:
- intentionally not supported (by design / product decision), or
- temporarily not supported (work in progress / roadmap item)?
In other words, is there any plan to enable full tool / Actions execution in Advanced Voice Mode for Custom GPTs in the future?
This is critical for understanding whether Voice Mode can be used as an interaction layer for action-based agents.
Thanks
Hi @Nessper,
Thanks for the thoughtful follow-up.
At the moment, Custom GPT Actions are not supported in Voice Mode by design. While we don't have any plans or timeline to share regarding support for Actions in Advanced Voice Mode, we're continuously improving our products based on user needs and feedback, so this could change in the future.
We appreciate you sharing your use case. It helps the team better understand how features like Voice Mode could be expanded over time.
-Mark G.
Thanks for the clarification.
To give more context, I am building a personal productivity assistant that relies on:
- Voice input (hands-free interaction)
- Structured tool use via Custom GPT Actions
- Real-time task execution (calendar, task management API)
The current limitation prevents using Voice Mode as a full interaction layer, which forces a fallback to manual text input.
I understand this is a design decision, but I wanted to highlight that this significantly impacts real-world “voice-first agent” use cases.
Is this type of voice-driven tool-calling workflow something you expect to support in future iterations of the product?
-Nessper
Thanks for sharing the additional context, @Nessper. That's a great example of a real-world voice-first workflow, and I can definitely see why the current limitation gets in the way.
While I can't make any promises about future features, we're not ruling out this type of capability. I've shared your feedback with the team so it can be logged, as voice-driven tool calling is a compelling use case and the extra detail you provided is really valuable.
Appreciate you taking the time to explain your scenario, and please keep sharing any other workflows or pain points you run into. That kind of context helps inform future improvements.
-Mark G.