I spent quite a bit of time getting this working, so I thought I’d write up how to set it up, since most guides assume you’re running haOS.

I have a Home Assistant instance running in Docker on a Raspberry Pi 5, not haOS, and I wanted it to use my local microphone and speakers.

  • This guide assumes you’re already running Home Assistant in a Docker container.
  • Before anything else, make sure you’re SSH’ed into the machine running HA.

Set up the Wyoming containers

  • Create the directory if it doesn’t exist: mkdir -p ~/docker/wyoming.
  • Save the following as ~/docker/wyoming/compose.yaml:
services:
  speech-to-phrase:
    image: rhasspy/wyoming-speech-to-phrase:latest
    container_name: speech-to-phrase
    restart: unless-stopped
    volumes:
      - ./models:/models
      - ./train:/train
    ports:
      - "10300:10300"
    environment:
      - TZ=US/Central
    # Replace 192.168.0.222 with your HA host's IP (or localhost if HA is reachable there).
    # Comments must stay outside the folded block below, or they get passed as arguments.
    command: >
      --hass-websocket-uri 'ws://192.168.0.222:8123/api/websocket'
      --hass-token 'token from Profile -> Security -> Long-lived access tokens'
      --retrain-on-start

  piper:
    image: rhasspy/wyoming-piper:latest
    container_name: piper
    restart: unless-stopped
    volumes:
      - ./piper_data:/data
      - /sys/class/drm/
    ports:
      - "10200:10200"
    command: --voice en_US-amy-low

  openwakeword:
    image: rhasspy/wyoming-openwakeword:latest
    container_name: openwakeword
    restart: unless-stopped
    volumes:
      - ./openwakeword:/data
      - ./openwakeword:/custom
    ports:
      - "10400:10400"

    # Raise the threshold above 0.5 (e.g. 0.75) for fewer false activations at the cost of sensitivity.
    command: --custom-model-dir /custom --preload-model 'hey_jarvis' --threshold 0.5
  • Run docker compose up -d to start the containers.
  • In HA, go to Settings -> Devices & Services -> Add Integration -> Wyoming Protocol.
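  • Set host to localhost and port to 10300 (speech-to-phrase). If HA isn’t using host networking, use the Docker host’s IP instead of localhost.
  • Repeat for ports 10200 (piper) and 10400 (openwakeword).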
  • Go to Settings -> Voice Assistants -> Add Assistant and give it a name.
  • Under Speech To Text, select speech-to-phrase.
  • Under Text To Speech, select piper.
  • Click the 3 dots at the top -> Add streaming wake word and select openwakeword.
  • Hit Create.
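
If the Wyoming integration refuses to connect, it helps to check that the three ports from compose.yaml actually accept connections. Here is a minimal sketch, assuming bash and coreutils' timeout are available; WYOMING_HOST and check_port are names I made up for this example, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: confirm the Wyoming ports from compose.yaml accept TCP connections.
# WYOMING_HOST is a hypothetical variable; set it to your Docker host's IP if needed.
WYOMING_HOST="${WYOMING_HOST:-localhost}"

# Succeeds if a TCP connection to host ($1) and port ($2) opens within 2 seconds.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for entry in speech-to-phrase:10300 piper:10200 openwakeword:10400; do
  name=${entry%%:*}   # part before the colon
  port=${entry##*:}   # part after the colon
  if check_port "$WYOMING_HOST" "$port"; then
    echo "$name ($port): reachable"
  else
    echo "$name ($port): NOT reachable"
  fi
done
```

A port that shows as not reachable usually means the container isn't running yet; docker compose ps in ~/docker/wyoming should show which one.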

Setting up Linux Voice Assistant

  • Install required deps: sudo apt install libportaudio2 build-essential libmpv-dev pulseaudio.
  • Set up PulseAudio to start automatically for your user: systemctl enable --now --user pulseaudio.
  • Clone the LVA repo: cd $HOME && git clone https://github.com/OHF-Voice/linux-voice-assistant.git.
  • cd linux-voice-assistant.
  • Run the setup script: ./script/setup.
  • Activate the python env: source .venv/bin/activate.
  • Find out your input device: python3 -m linux_voice_assistant --name LVA --list-input-devices, e.g. Webcam Vitade AF Analog Stereo.
  • Find out your output device: python3 -m linux_voice_assistant --name LVA --list-output-devices.
  • Run LVA with python3 -m linux_voice_assistant --name LVA --audio-input-device 'Webcam Vitade AF Analog Stereo' --audio-output-device 'Autoselect device'.
  • In HA, go to Settings -> Devices & Services -> Add Integration -> ESPHome; it should automatically detect LVA.
  • Go through the setup process.
  • Enjoy
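
Since LVA has to be started by hand each time, the activate-and-run steps above can be wrapped in a small launcher. This is only a sketch: the script, the LVA_* variables and run_lva are hypothetical names, and the device names are the examples from this guide, so swap in whatever the list commands printed for you.

```shell
#!/usr/bin/env bash
# Hypothetical launcher for LVA; all LVA_* variables are assumptions for this sketch.
LVA_DIR="${LVA_DIR:-$HOME/linux-voice-assistant}"
LVA_NAME="${LVA_NAME:-LVA}"
LVA_INPUT="${LVA_INPUT:-Webcam Vitade AF Analog Stereo}"
LVA_OUTPUT="${LVA_OUTPUT:-Autoselect device}"

# Activate the virtualenv created by ./script/setup and start LVA in the foreground.
run_lva() {
  cd "$LVA_DIR" || return 1
  . .venv/bin/activate || return 1
  python3 -m linux_voice_assistant --name "$LVA_NAME" \
    --audio-input-device "$LVA_INPUT" \
    --audio-output-device "$LVA_OUTPUT"
}

if [ -d "$LVA_DIR/.venv" ]; then
  run_lva
else
  echo "LVA not found in $LVA_DIR; clone the repo and run ./script/setup first" >&2
fi
```

The devices can be overridden per run without editing the file, e.g. LVA_INPUT='My USB Mic' ./start-lva.sh if you saved it under that name.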

Custom Wake Words

  • Download a wake word model into ~/docker/wyoming/openwakeword, e.g. Skynet.
    • With curl:
cd ~/docker/wyoming
sudo curl -L -o ./openwakeword/Skynet.tflite https://github.com/fwartner/home-assistant-wakewords-collection/raw/refs/heads/main/en/skynet/Skynet.tflite
    • With git:
cd ~/docker/wyoming
git clone https://github.com/fwartner/home-assistant-wakewords-collection/
sudo cp home-assistant-wakewords-collection/en/skynet/Skynet.tflite ./openwakeword/
  • Edit ~/docker/wyoming/compose.yaml and change --preload-model 'hey_jarvis' to --preload-model 'Skynet'.
  • Recreate the openwakeword container so the new command takes effect: docker compose up -d openwakeword. (docker compose restart does not pick up changes to compose.yaml.)
  • In HA, go to Settings -> Voice Assistants -> Your Assistant -> Edit Streaming Wake Word and change the wake word to Skynet.
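
The compose.yaml edit can also be scripted if you switch wake words often. A rough sketch follows; model_name and set_preload_model are helper names I made up, and it assumes the model name is just the file name without .tflite, as in the Skynet example.

```shell
#!/usr/bin/env bash
# Sketch: point openwakeword at a custom model by editing compose.yaml.

# Model name = file name without the .tflite extension (assumption based on
# the Skynet example in this guide).
model_name() {
  basename "$1" .tflite
}

# Rewrite the --preload-model value in a compose file, keeping a .bak copy.
set_preload_model() {
  compose_file=$1
  name=$2
  sed -i.bak "s/--preload-model '[^']*'/--preload-model '$name'/" "$compose_file"
}

compose_path="$HOME/docker/wyoming/compose.yaml"
model_path="$HOME/docker/wyoming/openwakeword/Skynet.tflite"

if [ -f "$compose_path" ] && [ -f "$model_path" ]; then
  set_preload_model "$compose_path" "$(model_name "$model_path")"
  grep -- '--preload-model' "$compose_path"   # show the edited line
else
  echo "compose.yaml or model file not found; nothing changed" >&2
fi
```

The original file is kept next to it as compose.yaml.bak, so a bad edit is easy to undo.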

Use faster-whisper instead of speech-to-phrase

Replace:

  speech-to-phrase:
    image: rhasspy/wyoming-speech-to-phrase:latest
    container_name: speech-to-phrase
    restart: unless-stopped
    volumes:
      - ./models:/models
      - ./train:/train
    ports:
      - "10300:10300"
    environment:
      - TZ=US/Central
    # Replace 192.168.0.222 with your HA host's IP (or localhost if HA is reachable there).
    # Comments must stay outside the folded block below, or they get passed as arguments.
    command: >
      --hass-websocket-uri 'ws://192.168.0.222:8123/api/websocket'
      --hass-token 'token from Profile -> Security -> Long-lived access tokens'
      --retrain-on-start

With:

  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper:latest
    restart: unless-stopped
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
    # Swap tiny-int8 for base-int8 if your computer can handle it.
    command: --model tiny-int8 --language en
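
After swapping the service, the stack has to be recreated for the change to apply. A small sketch of how I'd do that, assuming the compose file lives in ~/docker/wyoming as above; apply_compose is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: recreate the Wyoming stack after editing compose.yaml.
apply_compose() {
  dir=$1
  if [ ! -f "$dir/compose.yaml" ]; then
    echo "no compose.yaml in $dir" >&2
    return 1
  fi
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found" >&2
    return 1
  fi
  # --remove-orphans removes the old speech-to-phrase container, which is no
  # longer defined in the file; then print the last few whisper log lines.
  (cd "$dir" && docker compose up -d --remove-orphans \
    && docker compose logs --tail 20 whisper)
}

apply_compose "$HOME/docker/wyoming" || echo "stack not updated" >&2
```

The first start may take a while if the model has to be downloaded. In HA, the Wyoming entry for port 10300 will probably need to be reloaded or re-added before the assistant offers faster-whisper under Speech To Text.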

Changelog

  • 2026-01-01: Added instructions for custom wake words.
  • 2026-01-01: Added instructions to use faster-whisper instead of speech-to-phrase.

TODO

  • Create a Dockerfile for LVA, since right now it has to be run from a terminal.