As far as I can tell, Google and all the other voice assistant providers are going to launch some LLM/GPT-based service soon. Google is shutting down its intent/slot-filling based integrations (and has been teasing LaMDA for a while), Amazon has left the existing Alexa platform on life support, and who knows what’s going on inside Apple.
But we don’t have to wait for a big Big Tech launch to try it out! OpenAI offers free GPT models over an API, so we can just forward the transcripts from Google Assistant to GPT-3. This is a super-easy way to try out the future of voice assistants before everyone else. It’s really just a matter of gluing together a few things:
- Actions on Google is the platform for developing and deploying conversational apps for the Google Assistant.
- Dialogflow is a startup Google bought that provides basically the same service as Actions on Google.
  - We’ll be using this legacy service because the new Actions on Google has a “feature” that terminates conversations that don’t match the intents we define, which makes it unsuitable for open-ended conversations.
- A Python webserver to cache the conversations and call the OpenAI API with the correct prompt.
  - It must be publicly accessible over HTTPS. Running ngrok http 5000 solves this for you.
- OpenAI’s completion API to actually run the GPT model.
This recipe works similarly for the Alexa Skills Kit. Just configure ASK to forward all requests to the Python webhook and adjust the parsing functions in main.py.
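To give an idea of what "adjusting the parsing functions" could look like for Alexa, here is a minimal sketch. It is not the code from main.py. It assumes the skill has a single catch-all intent with a free-text slot, and the slot name `query` is made up for illustration (in ASK, a slot of type AMAZON.SearchQuery is the usual way to capture free-form speech). The function names are hypothetical too.

```python
from typing import Tuple


def parse_alexa_request(body: dict) -> Tuple[str, str]:
    """Return (session_id, user_text) from an Alexa Skills Kit request body."""
    session_id = body["session"]["sessionId"]
    request = body["request"]
    if request["type"] != "IntentRequest":
        # LaunchRequest, SessionEndedRequest etc. carry no user utterance.
        return session_id, ""
    slots = request["intent"].get("slots") or {}
    # "query" is a hypothetical slot name; use whatever your intent defines.
    text = slots.get("query", {}).get("value", "")
    return session_id, text


def build_alexa_response(reply: str) -> dict:
    """Wrap the model's reply in the JSON shape Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": reply},
            # Keep the session open so the conversation can continue.
            "shouldEndSession": False,
        },
    }
```

Everything else (caching the conversation, calling OpenAI) can stay the same, since only the request and response envelopes differ between the two platforms.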
The no-code part: Actions on Google and Dialogflow
To create a Dialogflow agent for a Google Assistant action via the Actions on Google Console, follow these steps (they may have changed slightly since OpenAI scraped the docs):
- Sign in to the Actions on Google Console with your Google account.
- Click on the “New Project” button to create a new project.
- Enter a name for your project and select your preferred language and country/region.
- Select “Custom” as the type of project you want to create.
- On the next page, click the link near the bottom:
- Click on the “Create Project” button to create your project.
- Once your project is created, click on the “Develop” tab in the navigation menu.
- Click on the “Actions” tab and then click on the “Add Your First Action” button.
- Select “Build a custom intent” and then click on the “Dialogflow” button.
- This will open the Dialogflow console within the Actions on Google Console. From here, you can create your Dialogflow agent and build your conversation flow.
Under “Intents”, import these three intents:
The Python server
Download the code at the link below and unzip it somewhere. Set an authorization token in main.py and put your OpenAI token (see the Account page of OpenAI Platform) in gpt.py.
Assuming you have Python 3 installed already, you should be able to start it with these commands:
pip3 install -r requirements.txt
flask --app main --debug run
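If you're curious what "cache the conversations and call the OpenAI API with the correct prompt" boils down to, here is a rough sketch of the idea. This is not the downloadable code: the preamble text, the `MAX_LINES` limit and all function names are assumptions made for illustration, and the actual OpenAI call is passed in as a plain function (`complete`) so the sketch runs without a network connection or an API key.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical preamble; the real prompt is whatever works best for you.
PREAMBLE = "The following is a conversation between a user and a helpful voice assistant.\n"
MAX_LINES = 20  # assumed cap so the prompt does not grow without bound

# In-memory cache: session id -> list of (speaker, text) pairs.
_history: Dict[str, List[Tuple[str, str]]] = defaultdict(list)


def build_prompt(session_id: str, user_text: str) -> str:
    """Turn the cached conversation plus the new utterance into one prompt."""
    lines = [PREAMBLE]
    for speaker, text in _history[session_id][-MAX_LINES:]:
        lines.append(f"{speaker}: {text}")
    lines.append(f"User: {user_text}")
    lines.append("Assistant:")  # the model continues from here
    return "\n".join(lines)


def reply(session_id: str, user_text: str, complete: Callable[[str], str]) -> str:
    """Ask the model for an answer and remember both sides of the exchange."""
    answer = complete(build_prompt(session_id, user_text)).strip()
    _history[session_id].append(("User", user_text))
    _history[session_id].append(("Assistant", answer))
    return answer
```

In a real server, `complete` would wrap a request to OpenAI's completion endpoint. Because the cache lives in a module-level dict, it is lost on restart, which is fine for an experiment like this one.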
Once you’ve managed to start it and can access it over HTTPS (again, use ngrok if you’re not sure how to do this), point the webhook URL on Dialogflow’s Fulfillment page at your Python server.
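For reference, this is roughly what the webhook has to do with each request Dialogflow sends. Treat it as a sketch under assumptions rather than the contents of main.py: it assumes the token arrives in an `Authorization` header (Dialogflow's Fulfillment page lets you add custom headers, but the header the download actually checks may differ), and the helper names are invented. The `session` and `queryResult.queryText` fields and the `fulfillmentText` reply field are part of the Dialogflow ES webhook format.

```python
import hmac
from typing import Callable, Mapping, Tuple

AUTH_TOKEN = "change-me"  # hypothetical; set the same value in Dialogflow


def is_authorized(headers: Mapping[str, str]) -> bool:
    """Compare the supplied token in constant time."""
    supplied = headers.get("Authorization", "")
    return hmac.compare_digest(supplied.encode(), AUTH_TOKEN.encode())


def parse_dialogflow_request(body: dict) -> Tuple[str, str]:
    """Return (session_id, user_text) from a Dialogflow ES webhook request."""
    return body["session"], body["queryResult"]["queryText"]


def build_dialogflow_response(reply: str) -> dict:
    """Dialogflow speaks whatever is returned as fulfillmentText."""
    return {"fulfillmentText": reply}


def handle_webhook(
    headers: Mapping[str, str],
    body: dict,
    answer: Callable[[str, str], str],
) -> Tuple[int, dict]:
    """Framework-agnostic handler: returns (HTTP status, JSON body)."""
    if not is_authorized(headers):
        return 401, {"error": "unauthorized"}
    session_id, text = parse_dialogflow_request(body)
    return 200, build_dialogflow_response(answer(session_id, text))
```

A Flask route would only need to pass the request headers and parsed JSON body to `handle_webhook` and return the result as JSON. The `answer` argument is whatever function produces the GPT reply for a given session and utterance.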
Putting it all together
Finally, locate the tiny link that syncs your settings to Actions on Google:
From there, click “Test” to start the simulator and check that everything is wired up correctly. There might be some additional setup required in Actions on Google; follow the prompts to set up things like the invocation name.
Eventually, you should be able to start a conversation with GPT. Avoid asking overly simple questions; Google Assistant will take over the conversation if you say something that matches its hardcoded small talk.
To get this on your smart speaker, just fill out the required fields in the Deploy tab and invite yourself to an Alpha test of the action we just built. Welcome to the future of voice assistants!