How to use Spectra
Spectra sees your screen, speaks what matters, and acts on your voice command. No reading. No staring. No typing. Just talk.
1. Getting started
Press Q or say "Hey Spectra" to start.
Allow microphone access when your browser asks. Screen sharing is optional; press W to enable it for screen-aware responses.
Spectra will say "Connected" when ready.
2. Cloud mode (Gemini Live, default)
Cloud mode is the default at spectra.aqta.ai. It is powered by Gemini Live, Google's real-time multimodal API, which lets Spectra see your screen, hear your voice, and respond in speech simultaneously, with no perceptible delay.
No setup required. Open the site, press Q, allow microphone access, and you are connected.
What Gemini Live enables
- Real-time screen understanding: Press W to share your screen. Gemini processes the live video stream, so you can ask "what's on screen?" or "click the submit button" and it responds to exactly what it sees.
- Native voice in, voice out: Audio goes directly to Gemini Live. There is no intermediate transcription step, so responses feel instant.
- Screen memory: Say "Remember this" to save a snapshot. Later, ask "What changed?" and Spectra compares the current screen to the saved one.
- Guided tours: Say "Teach me this app" for a narrated walkthrough of whatever is on screen.
- Interruption: Say "Stop" at any point during a response. Gemini Live handles barge-in natively, so audio cuts off immediately.
Screen and audio are streamed to the Gemini API only while your session is active and are never stored. See the privacy policy for details.
3. Local mode (offline, no API key)
Switch to Local in the mode toggle at the top of the page. In local mode, everything runs on your own machine and no data leaves your device.
Local mode uses Gemma 4 via Ollama for language, Whisper for voice recognition, and Piper for speech. You need to run these locally before connecting:
- Install Ollama and pull the model:
ollama pull gemma4
- Start Ollama:
ollama serve
- Run Spectra locally:
./run.sh
- Select Local in the mode toggle and press Q.
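If Local mode does not connect, it can help to confirm each prerequisite from the steps above. The script below is a minimal sketch and not part of Spectra. It assumes Ollama listens on its default address (http://localhost:11434), that the model tag is gemma4 as in the pull command, and that run.sh sits in the current directory. The function names are illustrative.

```shell
#!/bin/sh
# Preflight sketch for local mode. Assumptions (not from the Spectra docs):
# Ollama listens on its default address and run.sh is in the current directory.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
MODEL="${MODEL:-gemma4}"

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the Ollama server answers on its HTTP API.
ollama_up() {
  curl -fsS --max-time 3 "$OLLAMA_URL/api/tags" >/dev/null 2>&1
}

# Succeeds if the model name appears at the start of a line in `ollama list`.
model_pulled() {
  ollama list 2>/dev/null | grep -q "^$MODEL"
}

# Runs the checks in order and stops at the first one that fails.
preflight() {
  have_cmd ollama || { echo "ollama not found; install it first"; return 1; }
  ollama_up || { echo "Ollama is not responding; run: ollama serve"; return 1; }
  model_pulled || { echo "model missing; run: ollama pull $MODEL"; return 1; }
  [ -x ./run.sh ] || { echo "run.sh not found or not executable here"; return 1; }
  echo "ready: run ./run.sh, then select Local and press Q"
}
```

Save it to a file, source it from the Spectra directory, and call preflight. The first failed check prints the command that should fix it.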
Local mode is text-only, with no screen sharing. Browser control requires the Spectra Bridge extension to be connected. The hosted site at spectra.aqta.ai runs cloud mode only; local mode requires running Spectra on your own machine.
4. Voice commands
5. Keyboard shortcuts
6. Spectra Bridge extension
To let Spectra control other tabs (clicking, typing, and scrolling on any website), install the Spectra Bridge Chrome extension. Download it from the Spectra site, then:
- Open chrome://extensions
- Enable Developer mode (top-right toggle)
- Click Load unpacked
- Select the downloaded spectra-bridge folder
Without the extension, Spectra can still answer questions and respond by voice.
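If Load unpacked rejects the folder, a common cause is selecting the wrong directory level: Chrome expects the folder that directly contains manifest.json. The following sketch checks for that. The path ~/Downloads/spectra-bridge is a hypothetical location used for illustration, so set EXT_DIR to wherever you unpacked the download.

```shell
#!/bin/sh
# Sketch: confirm a folder looks like an unpacked Chrome extension.
# Chrome's "Load unpacked" needs manifest.json at the top level of the folder.
is_unpacked_extension() {
  [ -d "$1" ] && [ -f "$1/manifest.json" ]
}

# Hypothetical location; replace with wherever you unpacked the download.
EXT_DIR="${EXT_DIR:-$HOME/Downloads/spectra-bridge}"

if is_unpacked_extension "$EXT_DIR"; then
  echo "ok: select $EXT_DIR in the Load unpacked dialog"
else
  echo "no manifest.json directly inside $EXT_DIR; check for a nested folder"
fi
```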
7. Privacy
In cloud mode, Spectra processes data only while a session is active. Screen and voice are streamed to the Gemini API in real time and never stored. When your session ends, everything is discarded. No accounts, no tracking, no analytics.
In local mode, nothing leaves your device at all. All processing (speech recognition, language understanding, and voice output) happens on your own hardware.
8. Tips
Speak naturally. Spectra understands conversational language, not just commands.
Say "Stop" at any time to interrupt.
Spectra confirms before destructive actions like deleting or sending.
If Spectra cannot find an element, it will scroll and try again automatically.
In local mode, keep requests concise. Gemma performs best with short, direct instructions.