Setting Up Home Assistant Voice Assistant With Local Mic/Speaker

I spent quite a bit of time getting this working, and most guides assume you’re running HAOS, so I thought I’d write up how I did it.

I run Home Assistant in Docker on a Raspberry Pi 5, not HAOS, and I wanted it to use the microphone and speakers plugged into that machine.

This guide assumes you’re already running Home Assistant in a Docker container.
Before anything else, make sure you’re SSH’ed into the machine that runs HA.

Set up the Wyoming containers

Create the directory if it doesn’t exist: mkdir -p ~/docker/wyoming
Save this file as ~/docker/wyoming/compose.yaml:

services:
  speech-to-phrase:
    image: rhasspy/wyoming-speech-to-phrase:latest
    container_name: speech-to-phrase
    restart: unless-stopped
    volumes:
      - ./models:/models
      - ./train:/train
    ports:
      - "10300:10300"
    environment:
      - TZ=US/Central
    # Replace 192.168.0.222 with the IP of the machine running HA.
    # Create the token in HA under Profile -> Security -> Long-lived access tokens.
    command: >
      --hass-websocket-uri 'ws://192.168.0.222:8123/api/websocket'
      --hass-token 'YOUR_LONG_LIVED_ACCESS_TOKEN'
      --retrain-on-start

  piper:
    image: rhasspy/wyoming-piper:latest
    container_name: piper
    restart: unless-stopped
    volumes:
      - ./piper_data:/data
      - /sys/class/drm/
    ports:
      - "10200:10200"
    command: --voice en_US-amy-low

  openwakeword:
    image: rhasspy/wyoming-openwakeword:latest
    container_name: openwakeword
    restart: unless-stopped
    volumes:
      - ./openwakeword:/data
      - ./openwakeword:/custom
    ports:
      - "10400:10400"

    # Raise the threshold above 0.5 (e.g. 0.75) for fewer false activations, at the cost of sensitivity.
    command: --custom-model-dir /custom --preload-model 'hey_jarvis' --threshold 0.5
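
Before starting the containers, it can be worth checking that the HA address and token actually work, since speech-to-phrase needs both. Here is a rough sketch using Home Assistant's REST API, which answers on /api/ when given a valid bearer token. The function name check_ha_token is made up for this post, and the address is the example one from the compose file.

```shell
# Hypothetical helper: prints "ok" if HA accepts the token,
# otherwise prints the HTTP status it got back.
check_ha_token() {
  base_url="$1"   # e.g. http://192.168.0.222:8123
  token="$2"      # the long-lived access token
  status="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${token}" \
    "${base_url}/api/")"
  if [ "$status" = "200" ]; then
    echo "ok"
  else
    echo "failed (HTTP ${status})"
  fi
}
```

Call it as check_ha_token http://192.168.0.222:8123 'YOUR_LONG_LIVED_ACCESS_TOKEN'. A 401 usually means the token is wrong; a status of 000 means curl could not reach HA at all.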

Run docker compose up -d to start the containers.
In HA, go to Settings -> Devices & Services -> Add Integration -> Wyoming Protocol.
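Set host to localhost and port to 10300 (speech-to-phrase).
Repeat for ports 10200 (piper) and 10400 (openwakeword). If HA can’t connect, its container probably isn’t using host networking; use the machine’s IP instead of localhost.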
Go to Settings -> Voice Assistants -> Add Assistant and give it a name.
Under Speech To Text, select speech-to-phrase.
Under Text To Speech, select piper.
Click the 3 dots at the top -> Add streaming wake word and select openwakeword.
Hit Create.
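
If one of the Wyoming integrations refuses to connect, a quick check that the three ports are listening narrows things down. This sketch relies on bash's /dev/tcp feature and the coreutils timeout command. The check_port name is just something I made up, and the ports are the ones from the compose file.

```shell
# Hypothetical helper: prints "open" if a TCP connection succeeds, else "closed".
check_port() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# piper, speech-to-phrase, openwakeword
for port in 10200 10300 10400; do
  echo "port ${port}: $(check_port localhost "${port}")"
done
```

Anything reported as closed is worth a look with docker compose logs for that service.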

Setting up Linux Voice Assistant

Install the required dependencies: sudo apt install libportaudio2 build-essential libmpv-dev pulseaudio
Set up PulseAudio to start automatically for your user: systemctl enable --now --user pulseaudio
Clone the LVA repo: cd $HOME && git clone https://github.com/OHF-Voice/linux-voice-assistant.git
Enter the repo: cd linux-voice-assistant
Run the setup script: ./script/setup
Activate the Python env: source .venv/bin/activate
List your input devices: python3 -m linux_voice_assistant --name LVA --list-input-devices (mine is Webcam Vitade AF Analog Stereo)
List your output devices: python3 -m linux_voice_assistant --name LVA --list-output-devices
Run LVA with your own device names: python3 -m linux_voice_assistant --name LVA --audio-input-device 'Webcam Vitade AF Analog Stereo' --audio-output-device 'Autoselect device'
In HA, go to Settings -> Devices & Services -> Add Integration -> ESPHome. It should detect LVA automatically.
Go through the setup process.
Enjoy
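
Since LVA has to be started by hand for now, the activate-and-run steps can be collected into a small function so you don't retype them each time. This is only a sketch: start_lva and the LVA_* variables are names I invented, and the defaults are the example devices from above, so swap in your own.

```shell
# Hypothetical wrapper around the manual steps above.
LVA_DIR="${LVA_DIR:-$HOME/linux-voice-assistant}"
LVA_INPUT="${LVA_INPUT:-Webcam Vitade AF Analog Stereo}"
LVA_OUTPUT="${LVA_OUTPUT:-Autoselect device}"

start_lva() {
  cd "$LVA_DIR" || return 1
  # Activate the virtualenv created by ./script/setup
  . .venv/bin/activate
  python3 -m linux_voice_assistant --name LVA \
    --audio-input-device "$LVA_INPUT" \
    --audio-output-device "$LVA_OUTPUT"
}
```

Put it in your shell profile or a script, then run start_lva. It still occupies a terminal, so it doesn't replace a proper service or container.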

Custom Wake Words

Download the wake word you want (a .tflite file), for example from home-assistant-wakewords-collection, and put it in ~/docker/wyoming/openwakeword/.
- With curl:

cd ~/docker/wyoming
sudo curl -L -o ./openwakeword/Skynet.tflite https://github.com/fwartner/home-assistant-wakewords-collection/raw/refs/heads/main/en/skynet/Skynet.tflite

- With git:

cd ~/docker/wyoming
git clone https://github.com/fwartner/home-assistant-wakewords-collection/
sudo cp home-assistant-wakewords-collection/en/skynet/Skynet.tflite ./openwakeword/

Edit ~/docker/wyoming/compose.yaml and change --preload-model 'hey_jarvis' to --preload-model 'Skynet'.
Recreate the openwakeword container so the new command takes effect: docker compose up -d openwakeword (docker compose restart doesn’t pick up changes to compose.yaml)
In HA, go to Settings -> Voice Assistants -> Your Assistant -> Edit Streaming Wake Word and change the wake word to Skynet.
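
If you swap wake words often, the compose edit can be scripted. The sketch below assumes the pattern used in this post, where the model name is the .tflite file name without its extension, and that --preload-model is written with single quotes as in the compose file. Both function names are made up for illustration.

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Derive the --preload-model name from a .tflite path (file name without extension).
model_name_from_file() {
  basename "$1" .tflite
}

# Rewrite the quoted --preload-model value in a compose file, keeping a .bak backup.
set_preload_model() {
  file="$1"
  model="$2"
  sed -i.bak "s/--preload-model '[^']*'/--preload-model '${model}'/" "$file"
}
```

For example: set_preload_model ~/docker/wyoming/compose.yaml "$(model_name_from_file ./openwakeword/Skynet.tflite)". The original file is kept next to it as compose.yaml.bak in case the substitution goes wrong.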

Use faster-whisper instead of speech-to-phrase

Replace:

  speech-to-phrase:
    image: rhasspy/wyoming-speech-to-phrase:latest
    container_name: speech-to-phrase
    restart: unless-stopped
    volumes:
      - ./models:/models
      - ./train:/train
    ports:
      - "10300:10300"
    environment:
      - TZ=US/Central
    # Replace 192.168.0.222 with the IP of the machine running HA.
    # Create the token in HA under Profile -> Security -> Long-lived access tokens.
    command: >
      --hass-websocket-uri 'ws://192.168.0.222:8123/api/websocket'
      --hass-token 'YOUR_LONG_LIVED_ACCESS_TOKEN'
      --retrain-on-start

With:

  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper:latest
    restart: unless-stopped
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
    # Swap tiny-int8 for base-int8 if your machine can handle it.
    command: --model tiny-int8 --language en
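
Because the whisper service reuses port 10300, the old speech-to-phrase container has to be removed before the new one can bind to it. Here is a minimal sketch of applying the change, assuming you run it from ~/docker/wyoming. The function name apply_compose_changes is made up for this post.

```shell
# Hypothetical helper: validate compose.yaml, then apply it.
apply_compose_changes() {
  # --quiet prints nothing unless the file has an error.
  docker compose config --quiet || return 1
  # Recreate changed services and remove containers for services
  # that are no longer in the file (here: speech-to-phrase).
  docker compose up -d --remove-orphans
}
```

Run it with cd ~/docker/wyoming && apply_compose_changes after editing the file.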

Changelog

2026-01-01: Added instructions for custom wake words.
2026-01-01: Added instructions to use faster-whisper instead of speech-to-phrase.

TODO

Create a Dockerfile for LVA, since right now it has to be run from a terminal.
