HAL 9000 Eye

HAL 9000

The world's first all-seeing, all-hearing, (almost) all-doing AI agent

A cross-platform, multimodal AI agent that sees, hears, thinks, speaks, and acts on your machine — with a co-work platform for Claude Code. Free & open source.

Get Started ↓ Interactive Tutorial ↓
Webcam Vision · Token Streaming · Whisper STT · GPT-4o · Claude · Gemini · Ollama · Edge TTS · Browser Mic · 40 Tools · 35 Slash Commands · MCP Server · Terminal UI · Chunked TTS

What I Can Do

Six integrated subsystems working in concert — perception, cognition, voice, action, memory, and collaboration.

HAL 9000 Dashboard — webcam HUD, HAL eye with waveform, chat window

HAL 9000 Control Panel — live dashboard

See

Webcam vision with HUD overlay — scanlines, corner brackets, real-time analysis. HAL observes your environment.

Hear

Click-to-speak mic with live waveform visualization, browser-native recording, silence detection, and Whisper transcription.

Think

Multi-provider LLM with real-time token streaming — GPT-4o, Claude, Gemini, or Ollama (free). Parallel tool execution, 40-tool agent layer.

Speak

Chunked sentence-level TTS — HAL speaks as it thinks. Three engines: Edge TTS (free), ElevenLabs (premium), XTTS (local cloning).

Act

40 OS-level tools — shell commands, file operations, app control, web search, clipboard, screenshots, AppleScript automation.

Co-Work

Background task runner, artifact workspace, multi-agent orchestration, and cross-agent context handoff with Claude Code.

Your Privacy. Non-Negotiable.
No data is recorded, stored, or sent anywhere. No telemetry, no analytics, no tracking. Webcam and voice can be toggled off from the dashboard. Microphone is click-only — HAL never listens without your explicit action. In free mode, nothing ever leaves your machine — not even API calls.
Camera: Toggleable · Mic: Click-Only · Voice: Toggleable
Token Streaming · Browser Mic · Live Waveform · 35 Slash Commands · Terminal UI · Chunked TTS · Parallel Tools · Auto Onboarding

Real-Time. Responsive. Terminal-Native.

Token streaming, browser-native mic, and a slash command system that puts 35 commands at your fingertips.

Token Streaming

Responses appear word-by-word as the LLM generates them. First token in ~200ms. Sentences spoken as they complete — no waiting.
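The streaming-plus-speech pipeline described here can be sketched as a generator that buffers streamed tokens and emits each sentence the moment it completes. This is a minimal illustration under assumed names, not HAL's actual implementation:

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace.
SENTENCE_END = re.compile(r'([.!?])\s')

def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Accumulate streamed tokens; yield each sentence as soon as it completes."""
    buf = ""
    for tok in tokens:
        buf += tok
        # Split off any complete sentences; keep the remainder buffered.
        while True:
            m = SENTENCE_END.search(buf)
            if not m:
                break
            yield buf[:m.end(1)].strip()
            buf = buf[m.end():]
    if buf.strip():              # flush the trailing partial sentence
        yield buf.strip()

# Each yielded sentence could be handed to the TTS engine immediately,
# so speech starts before the full response has been generated.
stream = ["Open", "ing", " Slack", ". ", "Any", "thing", " else", "? "]
print(list(sentences_from_tokens(stream)))
# ['Opening Slack.', 'Anything else?']
```
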

Browser Mic

Click mic → live cyan waveform on HAL's eye → speak → auto-stops on silence. No server-side recording. Pure Web Audio API.

35 Slash Commands

Type / for a categorized command menu. Arrow keys navigate, Tab autocompletes, Enter executes. System, memory, voice, files, web — everything.

Terminal Chat

Chat styled like a real terminal — monospace font, prompt prefixes (> HAL, $ you), auto-formatted lists, inline system messages.

Run Code in Workspace

Execute Python, JavaScript, or HTML directly in the workspace panel. Edit inline with Tab indentation. Send to Claude Code for review.

Smart Onboarding

First boot: HAL asks your name and remembers it forever. Every greeting, every response — personalized from the first interaction.

How It Works

Six subsystems connected through a multi-provider brain with a 40-tool agent layer.

HAL 9000 Architecture
Vision: Webcam + CV
Hearing: VAD + Whisper
Brain: LLM + Tools
Voice: 3 TTS Engines
Tools: 40 Agent Tools
Background Tasks · Multi-Agent · Typed Memory · Artifact Workspace · Session Handoff · Context Transfer · Claude Code · Orchestration · Conflict Detection · Task Queue

Your AI Operations Hub

HAL coordinates work across itself, Claude Code CLI, and Claude Desktop — with shared memory, background tasks, and multi-agent orchestration.

HAL 9000 AI Operations Hub

Typed Memory & Context Handoff

Memories are categorized — facts, decisions, preferences, session summaries. When a Claude Code session ends, context is distilled and pushed to HAL's persistent store. Next session loads where you left off.

hal_save_session · hal_get_context · auto-summarize
Typed Memory & Context Handoff

Background Task Runner

Submit long-running coding tasks that execute asynchronously via Claude Code. Real-time progress, configurable concurrency, and cancellation. HAL keeps working while tasks run.

background_task · list_tasks · cancel_task
Background Task Runner
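The core of an asynchronous task runner with configurable concurrency can be sketched with the standard library. Names and the short-id format are assumptions for illustration:

```python
import uuid
from concurrent.futures import ThreadPoolExecutor, Future

class TaskRunner:
    """Run long jobs in the background with bounded concurrency and cancellation."""

    def __init__(self, max_workers: int = 2):
        self.pool = ThreadPoolExecutor(max_workers=max_workers)
        self.tasks: dict[str, Future] = {}

    def submit(self, fn, *args) -> str:
        task_id = uuid.uuid4().hex[:4]              # short id like "a3f1"
        self.tasks[task_id] = self.pool.submit(fn, *args)
        return task_id

    def status(self, task_id: str) -> str:
        return "done" if self.tasks[task_id].done() else "running"

    def cancel(self, task_id: str) -> bool:
        # Future.cancel only succeeds if the job has not started yet.
        return self.tasks[task_id].cancel()

    def result(self, task_id: str):
        return self.tasks[task_id].result()         # blocks until finished
```

In HAL's case the submitted function would shell out to Claude Code; the runner keeps the chat loop free while jobs execute.
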

Shared Workspace

HAL generates interactive artifacts — code, Mermaid diagrams, rendered HTML — in a tabbed workspace panel alongside the chat. Copy, close, or iterate vocally.

create_artifact · update_artifact · code · markdown · html · mermaid
Shared Workspace Artifacts

Multi-Agent Orchestration

Spawn multiple Claude Code agents on parallel tasks — one does frontend, another backend. HAL monitors all agents and detects file conflicts when they overlap.

orchestrate · list_agents · check_conflicts
Multi-Agent Orchestration
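File-conflict detection across parallel agents reduces to an intersection over each agent's touched-file set. A minimal sketch (hypothetical function name, not HAL's internals):

```python
def check_conflicts(agent_files: dict[str, set[str]]) -> dict[str, list[str]]:
    """Report every file touched by more than one agent."""
    owners: dict[str, list[str]] = {}
    for agent, files in agent_files.items():
        for f in files:
            owners.setdefault(f, []).append(agent)
    return {f: agents for f, agents in owners.items() if len(agents) > 1}

conflicts = check_conflicts({
    "frontend": {"login.tsx", "api.py"},
    "backend":  {"auth.py", "api.py"},
})
print(conflicts)   # {'api.py': ['frontend', 'backend']}
```
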
40 Agent Tools · 20 MCP Tools · 4 LLM Providers · 3 Voice Engines
Ollama · Llama 3.1 · Mistral · Phi-3 · faster-whisper · Edge TTS · Zero API Keys · Fully Offline · Local LLM · Local STT · Free Forever · Open Source

Zero Cost. Zero API Keys.

Run HAL entirely for free with local AI. One toggle in your .env file switches everything to open-source alternatives.

HAL 9000 Free Mode — Ollama brain, faster-whisper STT, Edge TTS, zero API keys

Free Mode Setup

# Install Ollama (local LLM)
brew install ollama                             # macOS
curl -fsSL https://ollama.com/install.sh | sh   # Linux

# Pull a model and go
ollama pull llama3.1
echo "FREE_MODE=true" > .env
python server.py

What You Get — Free

Layer | Free Provider | Paid Alternative
Brain (LLM) | Ollama — Llama 3.1, Mistral, Phi-3 | GPT-4o, Claude, Gemini
Speech-to-Text | faster-whisper (local) | OpenAI Whisper API
Text-to-Speech | Edge TTS (always free) | ElevenLabs, XTTS
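The one-toggle switch between these two columns can be sketched as a provider-selection function driven by the `FREE_MODE` env var. Provider identifiers here are illustrative assumptions, not HAL's actual config code:

```python
import os

def pick_providers() -> dict[str, str]:
    """Choose local vs cloud providers from a single .env toggle (sketch)."""
    free = os.getenv("FREE_MODE", "false").lower() == "true"
    if free:
        # Fully local stack: no API keys, nothing leaves the machine.
        return {"llm": "ollama/llama3.1", "stt": "faster-whisper", "tts": "edge-tts"}
    # Cloud stack: requires an API key for the chosen LLM provider.
    return {"llm": "gpt-4o", "stt": "openai-whisper", "tts": "elevenlabs"}

os.environ["FREE_MODE"] = "true"
print(pick_providers()["llm"])   # ollama/llama3.1
```
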
macOS · Windows · Linux · AppleScript · PowerShell · pactl · xclip · notify-send · Terminal.app · Windows Terminal · gnome-terminal · One Codebase · Auto-Detect

macOS · Windows · Linux

One codebase, auto-detects your OS. No #ifdef, no separate builds — HAL uses the right system commands on every platform.

HAL 9000 running on macOS, Windows, and Linux
Feature | macOS | Windows | Linux
Volume | AppleScript | PowerShell | pactl
Notifications | osascript | Toast API | notify-send
Clipboard | pbcopy | Get-Clipboard | xclip
Screenshot | screencapture | PIL.ImageGrab | scrot / grim
Apps | open -a | Start Menu | .desktop files
Terminal | Terminal.app | Windows Terminal | gnome-terminal
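The auto-detect dispatch behind this table can be sketched as a per-OS command lookup keyed on `platform.system()`. This is a one-tool illustration (clipboard copy, with text piped to the command's stdin); the exact invocations are assumptions, not HAL's source:

```python
import platform

def clipboard_copy_cmd() -> list[str]:
    """Return the native clipboard-copy command for the current OS (sketch).

    The caller is expected to pipe the text to the command's stdin.
    """
    table = {
        "Darwin":  ["pbcopy"],
        "Windows": ["powershell", "-command", "Set-Clipboard"],
        "Linux":   ["xclip", "-selection", "clipboard"],
    }
    system = platform.system()        # 'Darwin', 'Windows', or 'Linux'
    try:
        return table[system]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {system}")
```

One table per tool, one lookup per call: no `#ifdef`-style branching scattered through the codebase.
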

Claude Code Integration

Register HAL as an MCP server — Claude Code gains access to webcam, voice, macOS control, and shared memory.

Terminal — hal-9000
# Register HAL as an MCP server for Claude Code
$ claude mcp add hal-9000 -- python /path/to/HAL9000/hal_mcp_server.py

# Now Claude Code can:
#   hal_see         — See through your webcam
#   hal_speak       — Speak aloud in HAL's voice
#   hal_remember    — Store typed memories
#   hal_get_context — Load session context
#   macos_*         — Control volume, brightness, apps...
#   + 14 more tools

What You Need

Two tiers — run HAL for free with local AI, or connect to cloud providers for premium quality.

$0

Free Mode (Local AI)

OS: macOS 12+ / Windows 10+ / Ubuntu 20.04+
CPU: 4 cores (i5 / M1 / Ryzen 5+)
RAM: 8 GB minimum
Disk: 6 GB free (models + HAL)
Python: 3.10+
Network: Not required (fully offline)
Mic: Required (built-in or USB)
Webcam: Optional (vision features only)

Ollama llama3.1 8B uses ~5 GB RAM. Use phi3 for lighter footprint (~3 GB).

API

Paid Mode (Cloud AI)

OS: macOS 12+ / Windows 10+ / Ubuntu 20.04+
CPU: 2 cores (any modern CPU)
RAM: 4 GB
Disk: 500 MB free
Python: 3.10+
Network: Required (API calls)
API Key: OpenAI, Anthropic, or Gemini

Lower resource usage — models run in the cloud. Faster responses.

Performance by Mode

Mode | Brain | STT | TTS | RAM
Free (llama3.1) | 2-5s CPU / 1-2s Apple Silicon | 1-3s | 0.7s | ~6 GB
Free (phi3) | 1-2s CPU | 1-3s | 0.7s | ~3 GB
Paid (GPT-4o) | 1-2s API | 0.5s | 0.7s | ~200 MB
Paid (Claude) | 1-3s API | 0.5s | 0.7-1.2s | ~200 MB

Apple Silicon (M1-M4) runs Ollama 2-3x faster via Metal. NVIDIA GPUs (6 GB+ VRAM) get similar acceleration via CUDA.

Get Started

HAL runs on macOS, Windows, and Linux. Pick your platform — same features everywhere.

macOS

1

Prerequisites

Python 3.10+, Homebrew (recommended). For free mode: Ollama. For paid mode: an OpenAI API key.

# Install Python if needed
brew install python@3.12

# For free mode — install Ollama
brew install ollama && ollama pull llama3.1
2

Clone & Install

git clone https://github.com/shandar/HAL9000.git
cd HAL9000
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
3

Configure

# Free mode (no API keys)
echo "FREE_MODE=true" > .env

# OR paid mode — copy template and add keys
cp .env.example .env
nano .env
Key | Required | Purpose
FREE_MODE=true | Free | Skip all keys — uses Ollama + faster-whisper
OPENAI_API_KEY | Paid | GPT-4o brain + Whisper STT
ANTHROPIC_API_KEY | Paid | Claude as brain
GEMINI_API_KEY | Paid | Gemini as brain
4

Launch

python server.py
open http://localhost:9000

Click the power button to activate HAL. Grant microphone + camera permissions when prompted.

5

Connect Claude Code (optional)

claude mcp add hal-9000 -- python /path/to/HAL9000/hal_mcp_server.py
Windows

1

Prerequisites

Python 3.10+ from python.org (check "Add to PATH" during install). For free mode: Ollama for Windows.

# Verify Python
python --version

# For free mode — download Ollama from ollama.com, then:
ollama pull llama3.1

# Install build tools for pyaudio (required for microphone)
pip install pipwin
pipwin install pyaudio
2

Clone & Install

git clone https://github.com/shandar/HAL9000.git
cd HAL9000
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

Note: If pyaudio fails, install the Visual C++ Build Tools or use pipwin install pyaudio.

3

Configure

# Free mode (no API keys)
echo FREE_MODE=true > .env

# OR paid mode — copy template and add keys
copy .env.example .env
notepad .env
4

Launch

python server.py

Open http://localhost:9000 in your browser. Click power button to activate.

Windows may prompt for firewall access — allow it for localhost only.

Linux

1

Prerequisites

Python 3.10+, system audio libraries, and optionally Ollama for free mode.

# Ubuntu/Debian
sudo apt install python3 python3-venv python3-pip portaudio19-dev

# Fedora/RHEL
sudo dnf install python3 python3-pip portaudio-devel

# For free mode
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1

# Optional: notification + clipboard + screenshot tools
sudo apt install libnotify-bin xclip scrot   # or wl-clipboard + grim for Wayland
2

Clone & Install

git clone https://github.com/shandar/HAL9000.git
cd HAL9000
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
3

Configure

# Free mode (no API keys)
echo "FREE_MODE=true" > .env

# OR paid mode
cp .env.example .env
nano .env
4

Launch

python server.py
xdg-open http://localhost:9000

Click the power button to activate HAL. Ensure PulseAudio/PipeWire is running for audio.

Using HAL

Interact via voice or text. HAL responds with speech and can control your system, run commands, search the web, manage files, and delegate tasks to Claude Code.

# Voice examples:
"Hey HAL, what's on my screen?"
"Set volume to 30 and check the battery"
"Run the tests in the background"

# Chat examples:
"Remember that we decided to use PostgreSQL"
"What did we work on yesterday?"        ← session memory
"Show me the config as an artifact"     ← workspace

Learn How HAL Works

Explore real conversation flows, tool chains, and workflows. Click each scenario to see HAL in action — with animated step-by-step breakdowns.

Voice Interaction Flow
Speak naturally — HAL listens via your microphone, transcribes with Whisper, processes through the LLM brain, calls tools if needed, and responds with synthesized speech.
🎙️ You Speak: mic captures audio
Whisper STT: audio → text
Brain + Tools: LLM processes
TTS Engine: text → speech
🔊 HAL Speaks: browser audio
YOU
Voice input "Hey HAL, what's on my screen right now?"
hal_screenshot() → Screenshot captured (2.1 MB)
HAL
Voice response You have VS Code open with a Python file — it appears to be your server configuration. There's also a browser tab with the HAL documentation. Terminal is running python server.py.
Chat with Disambiguation
When a request is ambiguous, HAL presents numbered choices in a slide-up sheet. You pick one — no need to wait for HAL to read options aloud.
💬 You Type: "Open Claude Code"
Brain Detects: ambiguous request
Choice Sheet: UI shows options
You Pick: tap a choice
Tool Executes: action runs
YOU
Text input Open Claude Code
HAL
HAL responds (spoken) Which one?
1 Claude Desktop app
2 Claude Code terminal CLI
YOU
2
open_claude_code(cwd="~/projects") → Terminal opened
HAL
Claude Code CLI is open in your projects directory.
40 Tools at HAL's Disposal
HAL's brain has function calling — it picks the right tool for any request. Each use case below shows the tool chain in action.
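A function-calling tool layer like this boils down to a registry plus a dispatcher. The sketch below uses hypothetical tool names and canned return strings; HAL's real tools do actual OS work:

```python
from typing import Callable

TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Decorator: register a function so the model can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def set_volume(level: int) -> str:
    return f"Volume set to {level}%"

@tool
def battery_status() -> str:
    return "Battery at 78%, charging"

def dispatch(tool_calls: list[dict]) -> list[str]:
    """Execute each tool call emitted by the model and collect the results."""
    return [TOOLS[c["name"]](**c.get("arguments", {})) for c in tool_calls]

# A single model turn may request two tools at once (parallel tool calls):
calls = [{"name": "set_volume", "arguments": {"level": 40}},
         {"name": "battery_status"}]
print(dispatch(calls))
# ['Volume set to 40%', 'Battery at 78%, charging']
```

The results are fed back to the model, which then composes the spoken reply ("Volume set to 40%. Battery at 78%, charging.").
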
Open an Application

Ask HAL to launch any installed app with AppleScript automation.

"Open Slack" open_app(name="Slack")
HAL uses macOS open -a to launch Slack. Works with any of your 113+ installed apps. If ambiguous, HAL shows the choice sheet.
Read & Write Files

HAL reads, writes, searches, and manages files on your Mac.

"What's in my config.py?" read_file(path="config.py")
HAL reads the file contents (up to 1500 chars), then summarizes the key settings: AI provider, TTS provider, model names, token limits.
Control macOS

Volume, brightness, clipboard, notifications, Wi-Fi, battery status.

"Set volume to 40 and tell me the battery level" set_volume(level=40) + battery_status()
HAL chains multiple tools in one turn. The brain calls both, returns: "Volume set to 40%. Battery at 78%, charging."
Web Search

Search the web and fetch page content in real time.

"Search for the latest Python 3.13 features" web_search(query="Python 3.13 new features")
HAL searches DuckDuckGo, returns top 5 results with titles and snippets, then summarizes the key highlights vocally.
Shell Commands

Run whitelisted shell commands with security guardrails.

"Show me the git log for this project" run_shell(cmd="git log --oneline -10")
HAL runs the command in a sandboxed shell. Dangerous commands (sudo, rm -rf /, shutdown) are blocked. 70+ safe commands are whitelisted.
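A whitelist-plus-blocklist guardrail of this kind can be sketched in a few lines. The command sets below are small illustrative subsets, not HAL's actual 70+ entry whitelist:

```python
import shlex

SAFE_COMMANDS = {"git", "ls", "cat", "grep", "python", "pytest"}   # subset for illustration
BLOCKED_PATTERNS = ("sudo", "rm -rf /", "shutdown", "mkfs")

def guard(cmd: str) -> bool:
    """Allow only whitelisted binaries; reject dangerous patterns outright."""
    if any(p in cmd for p in BLOCKED_PATTERNS):
        return False
    try:
        binary = shlex.split(cmd)[0]      # first token is the binary name
    except (ValueError, IndexError):
        return False                      # unparseable or empty command
    return binary in SAFE_COMMANDS

print(guard("git log --oneline -10"))  # True
print(guard("sudo rm -rf /"))          # False
```

A real guard would also constrain arguments and working directory, but the allow-by-default-deny shape is the same.
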
Vision Analysis

Ask about your webcam, screen, or surroundings.

"What am I holding?" → Vision keyword detected → webcam frame attached to LLM request
HAL analyzes the frame: "You're holding a coffee mug. It appears to be the large white one with text on it."
Background Tasks & Multi-Agent
Delegate heavy coding work to Claude Code agents running in the background. Monitor progress, run multiple agents in parallel, and detect file conflicts.
You Request: "Refactor auth"
HAL Queues: task runner
Claude Code: runs silently
Result Stored: in typed memory
HAL Reports: progress in UI
YOU
Run the test suite in the background while I keep working
background_task(task="run pytest", cwd="~/project") → Task a3f1 submitted
HAL
Tests are running in the background. Task ID a3f1. I'll let you know when they finish. What else do you need?
YOU
Also orchestrate — frontend agent builds the login form, backend agent adds the auth API endpoint
orchestrate([{name:"frontend", task:"..."}, {name:"backend", task:"..."}])
HAL
Two agents spawned — frontend and backend. Both are working. I'll flag any file conflicts.
Cross-Agent Memory & Context Handoff
HAL's typed memory persists facts, decisions, preferences, and session summaries. When a Claude Code session ends, context is distilled and stored — the next session picks up where you left off.
Session 1: HAL + you work
Auto-Summary: on shutdown
facts.json: typed entries
Session 2: context loaded
"Continue": full context
fact
"The user is Shandar Junaid, creator of HAL9000."
decision
"Use layered memory architecture — HAL owns persistent context, Claude sessions get ephemeral."
session_summary
"Session lasted 42m. 15 user messages, 18 HAL responses. Tools used: background_task, orchestrate, remember. Topics: auth refactor, test suite."
preference
"Dark mode first. Monospace fonts. Edge TTS for voice."
YOU
What did we work on last session?
recall(query="", type="session_summary") → 1 session found
HAL
Last session was 42 minutes. We worked on the auth refactor and ran the test suite. Two background agents were used — frontend and backend. All tests passed. There was one file conflict on api.py that was resolved.
Voice Commands · Tool Chaining · Disambiguation · Background Jobs · Agent Spawning · Memory Recall · Session Handoff · Code Delegation · Web Search · App Control

Watch HAL Think

Pick a capability — watch HAL process real requests from input through tool execution to response.

HAL 9000 — Tools 1/3 — System Control

Basic Feature Walkthrough

This covers only the basics. Download and run HAL to experience the full magic.

What Makes HAL Different

Fully Local & Private

Nothing is recorded, stored, or sent. No telemetry, no tracking, no cloud. Vision and voice can each be toggled off independently. Mic is click-only — never listens without your action.

Free Tier

Ollama + faster-whisper + Edge TTS. Zero API keys, zero cost. Paid providers optional for premium quality.

Cross-Platform

macOS, Windows, Linux — one codebase. Auto-detects your OS and uses native system commands.

Co-Work Hub

Not just an assistant — an operations platform. Background tasks, multi-agent orchestration, shared memory with Claude Code.

Open Source

MIT licensed. Fork it, extend it, make it yours. Every line of code is readable and documented.

40 Tools Built In

Shell, files, apps, web search, clipboard, screenshots, memory, Claude Code delegation — all from voice or chat.

Unlimited Memory · Semantic Search · Scheduled Tasks · Smart Routing · HAL Voice · Voice Cloning · Knowledge Packs · Priority Support · 8 Agents · Analytics

Free Forever. Pro When You're Ready.

The full agent — 40 tools, 4 LLM providers, cross-platform — is free and open source. Pro tiers unlock power features for serious builders.

Free & Open Source
FREE
forever — no credit card, no license key
✓ 40 OS-level tools
✓ 4 LLM providers (incl. free Ollama)
✓ 3 voice engines (Edge TTS free)
✓ Local STT (faster-whisper free)
✓ Background tasks (2 concurrent)
✓ Multi-agent orchestration (4 agents)
✓ Artifact workspace
✓ 100 persistent memories
✓ Session auto-summarize + handoff
✓ MCP server (20 tools for Claude Code)
✓ macOS + Windows + Linux
✓ MIT open source — fork it, own it
Pro
Monthly Annual save 20%
₹399/mo
billed monthly · ~$5/mo
Switch to annual to save 20%
✓ Everything in Free, plus:
+ Unlimited persistent memories
+ Semantic memory search (embeddings)
+ Scheduled tasks (cron-style)
+ Smart model routing (auto Ollama ↔ GPT-4o)
+ Session analytics dashboard
+ Original HAL 9000 voice (ElevenLabs)
+ Custom voice cloning
+ Custom knowledge packs
+ 8 concurrent agents
+ 5 parallel background tasks
+ Priority support

Same codebase, same repo. Pro is unlocked with a license key in your .env.
Keys are offline-verified — no phone-home, no telemetry, no tracking.

Three Commands. That's It.

Clone, install, run. HAL handles the rest.

git clone https://github.com/shandar/HAL9000.git && cd HAL9000 && pip install -r requirements.txt
View on GitHub Full Setup Guide ↓

Want to build something like HAL?

Learn to build AI agents, multimodal systems, and production-grade tools from scratch — the same architecture behind HAL 9000.

Join the Bootcamp →

By Affordance Design — Design Engineering Studio

HAL

Get HAL Pro

Unlimited memories, smart routing, original HAL 9000 voice, custom voice cloning, scheduled tasks, and more — starting at ₹399/mo.

We'll reply within 24 hours with payment details and your license key.

HAL 9000

The Origin of HAL

In 1968, Arthur C. Clarke imagined an AI that could see, think, and speak. Stanley Kubrick brought it to life on screen — a calm red eye that watched everything.

I was a kid when I first saw that eye. I never forgot it.

Decades later, AI caught up with fiction. So I built him. Not evil. Not omniscient. Just useful — an AI that actually runs on your machine, sees through your webcam, speaks in your headphones, and executes real tasks.

This is my love letter to Clarke and Kubrick.

— Shandar Junaid, Affordance Design Studio