
Project · Completed

Jarvis: The Private Voice Assistant

A fully local, privacy-first alternative to cloud voice assistants where voice data stays on the home network.

Home Assistant · Proxmox · Mac mini M1 · Raspberry Pi · openWakeWord


Watch The Build

Your Smart Speaker is Spying. I Built a Fix.

Build Guide

This is the architecture walkthrough from the video: a local voice assistant stack built from hardware that was already sitting around the house.

Prerequisites

  • Home Assistant running on a mini PC or Proxmox host
  • Mac mini M1 or another local machine for the LLM side
  • Raspberry Pi 4 for the physical voice satellite
  • Jabra speakerphone or another good USB speaker/microphone
  • openWakeWord, Wyoming satellite, Whisper, and Piper configured through Home Assistant

1) Map the voice assistant stack

The Raspberry Pi acts as the room device, Home Assistant handles the smart-home routing, and the Mac mini acts as the heavier local brain when a question needs more than a simple automation.

2) Set up the Raspberry Pi as the satellite

Run the Pi as the always-on voice satellite with openWakeWord listening for the wake phrase and Wyoming satellite sending captured audio back into Home Assistant.
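As a rough illustration of how the satellite and wake word service are wired together, the sketch below builds a launch command for Wyoming satellite. The flag names and default ports follow the wyoming-satellite README as best recalled, so check them against the tool's own `--help` output. The satellite name, ALSA device strings, and sample rates are placeholders, not values from this build.

```python
import shlex
from typing import List


def satellite_command(
    name: str,
    mic_device: str,
    snd_device: str,
    wake_word: str = "hey_jarvis",
) -> List[str]:
    """Build an argv list for launching Wyoming satellite on the Pi.

    Assumptions: 16 kHz mono capture, 22.05 kHz mono playback, and a wake
    word service already listening on localhost port 10400.
    """
    mic = f"arecord -D {mic_device} -r 16000 -c 1 -f S16_LE -t raw"
    snd = f"aplay -D {snd_device} -r 22050 -c 1 -f S16_LE -t raw"
    return [
        "script/run",
        "--name", name,
        "--uri", "tcp://0.0.0.0:10700",         # where Home Assistant connects
        "--mic-command", mic,                   # raw audio in from the speakerphone
        "--snd-command", snd,                   # raw audio out to the speakerphone
        "--wake-uri", "tcp://127.0.0.1:10400",  # local wake word service
        "--wake-word-name", wake_word,
    ]


# Hypothetical device names; list the real ones with `arecord -L` and `aplay -L`.
cmd = satellite_command(
    name="jarvis-kitchen",
    mic_device="plughw:CARD=USB,DEV=0",
    snd_device="plughw:CARD=USB,DEV=0",
)
print(shlex.join(cmd))
```

Building the command as a list keeps the quoted microphone and speaker commands intact when it is passed to a process launcher or pasted into a service file.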

3) Keep speech processing local

Use Whisper for speech-to-text and Piper for text-to-speech so the voice loop can stay inside the home network instead of leaning on Siri or another cloud assistant.

4) Route commands before questions

Simple commands like turning lights or the TV on and off should be handled directly by Home Assistant. More open-ended questions can be routed to the Mac mini LLM and then spoken back through the Pi.
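The commands-first rule can be sketched as a small router. This is only an illustration of the idea: in the actual build Home Assistant does its own intent matching, and the phrase table, the `Route` type, and the function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrase table mapping (verb, device) fragments to Home Assistant
# service names. A real setup would rely on Home Assistant's intent matching.
COMMANDS = {
    ("turn on", "light"): "light.turn_on",
    ("turn off", "light"): "light.turn_off",
    ("turn on", "tv"): "media_player.turn_on",
    ("turn off", "tv"): "media_player.turn_off",
}


@dataclass
class Route:
    target: str  # "home_assistant" or "llm"
    action: str  # service name, or the original text for the LLM


def match_command(text: str) -> Optional[str]:
    """Return a service name if the text looks like a simple command."""
    lowered = text.lower()
    for (verb, device), service in COMMANDS.items():
        if verb in lowered and device in lowered:
            return service
    return None


def route(text: str) -> Route:
    """Try the cheap command match first; fall back to the LLM."""
    service = match_command(text)
    if service is not None:
        return Route("home_assistant", service)
    return Route("llm", text)


print(route("Turn on the kitchen lights"))
print(route("Turn off the TV"))
print(route("Why is the sky blue?"))
```

Checking for commands first means a light switch request never waits on the slower LLM path.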

5) Mount the hardware where it can actually hear

The Jabra speakerphone was mounted with a 3D printed Raspberry Pi stand, powered over USB, and placed where it could hear from the kitchen or living room without anyone needing to speak right into it.

6) Tune audio, wake word, and latency

The first working version still needed tuning: crackly audio had to be cleaned up, false wake-word triggers had to be reduced, and the LLM voice path still had a long pause compared with direct terminal use.

Results

  • Voice commands could control Home Assistant devices locally.
  • The Jabra speakerphone picked up voice clearly from across the room.
  • The local LLM path worked, but still had a 20-30 second delay through the full voice pipeline.
  • The build reused an M1 Mac mini, Raspberry Pi, Home Assistant, and 3D printed mounting hardware.

Piece               Role
Raspberry Pi 4      Voice satellite running wake word and Wyoming satellite
Home Assistant      Smart-home routing, speech-to-text, and text-to-speech hub
Mac mini M1         Local LLM brain for open-ended questions
Jabra speakerphone  Room microphone and speaker

Mistake Log

What Got Messy

The part of the build log where the clean version gets honest.

The speakerphone worked almost too well

What happened
The Jabra speakerphone picked up voice clearly across the room, but that sensitivity also caused false wake-word triggers.
Fix
Lowered the wake-word sensitivity inside Home Assistant until Jarvis stopped waking up at the wrong time.
Lesson
Great microphones are not automatically great smart-home microphones. Sensitivity has to match the room.
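To see why a stricter setting cuts false triggers, here is a minimal sketch of a threshold-plus-streak detector. It is not openWakeWord's code, and the source does not say which settings were changed: the `threshold` and `trigger_level` parameters and the per-frame scores are made up for illustration.

```python
from typing import Iterable, List


def detect_wake(
    scores: Iterable[float], threshold: float, trigger_level: int
) -> List[int]:
    """Return the frame indices where a wake event fires.

    An event fires once `trigger_level` consecutive frames score at or above
    `threshold`. The detector then waits for the score to drop below the
    threshold before it can fire again.
    """
    fired = []
    streak = 0
    armed = True
    for i, score in enumerate(scores):
        if score >= threshold:
            streak += 1
            if armed and streak >= trigger_level:
                fired.append(i)
                armed = False  # hold until the score drops again
        else:
            streak = 0
            armed = True
    return fired


# Made-up scores: a background blip (frames 1-2), then a real phrase (frames 5-8).
scores = [0.1, 0.62, 0.55, 0.2, 0.1, 0.81, 0.93, 0.95, 0.9, 0.2]

sensitive = detect_wake(scores, threshold=0.5, trigger_level=1)
strict = detect_wake(scores, threshold=0.7, trigger_level=2)
print("sensitive:", sensitive)  # fires on the blip and on the phrase
print("strict:", strict)        # fires on the phrase only
```

A sensitive microphone raises the scores of background sound, so the same settings that suit a weak microphone can fire on a blip like the one at frames 1-2.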

Audio came through crackly at first

What happened
The speaker itself was good, but the first voice responses came through rough and crackled enough to make the setup feel unfinished.
Fix
Adjusted the voice/audio settings until the response audio came through cleanly.
Lesson
Local voice assistants are a full audio pipeline, not just an AI model. Bad audio can make a good setup feel broken.

The LLM path still has a long pause

What happened
Responses were quick from the Mac mini terminal, but routing the same kind of question through Jarvis introduced a 20-30 second delay.
Fix
Kept the local pipeline working while marking latency as the next problem to chase across Home Assistant, Whisper/Piper, Wyoming, and the Mac mini LLM path.
Lesson
A fast model in the terminal does not guarantee a fast voice assistant. Every handoff adds time.
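One way to chase the delay is to time each handoff separately. The sketch below assumes the pipeline can be reduced to three callables; `fake_stt`, `fake_llm`, and `fake_tts` are stand-ins that only sleep, and their durations are not measurements from this build.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, Tuple


@contextmanager
def stage(timings: Dict[str, float], name: str) -> Iterator[None]:
    """Record how long the wrapped block takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def run_pipeline(
    audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Tuple[bytes, Dict[str, float]]:
    """Run speech-to-text, LLM, and text-to-speech, timing each stage."""
    timings: Dict[str, float] = {}
    with stage(timings, "speech_to_text"):
        text = stt(audio)
    with stage(timings, "llm"):
        reply = llm(text)
    with stage(timings, "text_to_speech"):
        speech = tts(reply)
    return speech, timings


# Stand-in stages that only sleep; swap in real calls to measure a real setup.
def fake_stt(audio: bytes) -> str:
    time.sleep(0.01)
    return "what is the weather"


def fake_llm(text: str) -> str:
    time.sleep(0.05)
    return "reply to: " + text


def fake_tts(reply: str) -> bytes:
    time.sleep(0.01)
    return reply.encode("utf-8")


speech, timings = run_pipeline(b"\x00\x00", fake_stt, fake_llm, fake_tts)
slowest = max(timings, key=timings.get)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
print("slowest stage:", slowest)
```

Per-stage numbers show whether the pause sits in the model itself or in the steps around it, which a single end-to-end stopwatch cannot.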

The hardware choice was overkill, but it worked

What happened
The conference speakerphone was more hardware than the project really needed, but it solved the room pickup and speaker clarity problem.
Fix
Mounted it with a 3D printed Raspberry Pi stand and reused old hardware already sitting around the house.
Lesson
Overkill is not always bad if it turns spare gear into a reliable build.

Build Notes

This project is a fully local, privacy-first alternative to cloud-based voice assistants. Because the intelligence is hosted on-site, voice data never leaves the home network.

Infrastructure is distributed for performance and reliability: a Proxmox VM running Home Assistant acts as the central command center; a Mac mini (M1, 16GB RAM) provides local LLM compute; and a Raspberry Pi serves as the physical interface in a custom 3D-printed mount.

Core technologies include local LLM integration for natural language understanding, openWakeWord for on-device 'Hey Jarvis' wake detection, and a Jabra Speak 4 for microphone pickup and response output.

Key benefits: no cloud round trip for voice commands, full data sovereignty for voice and smart-home telemetry, and sustainable reuse of existing hardware. Latency remains the open issue: the LLM path still took 20-30 seconds through the full voice pipeline.