
Vox Documentation

Fast, accurate voice transcription with AI enhancement for macOS and Windows.

What is Vox?

Vox is a desktop application for macOS and Windows that transcribes your speech into text. It uses Whisper, OpenAI's open-source speech recognition model, which runs entirely on your device. Because recognition happens locally, your audio stays on your device while you still get fast, accurate transcriptions.

Key Features

  • Local Processing - Speech recognition runs entirely on your device
  • AI Enhancement - Optional post-processing with AWS Bedrock, DeepSeek, or custom LLM providers
  • Always Available - System tray integration with customizable keyboard shortcuts
  • Smart Dictionary - Teach Vox custom words and phrases for better accuracy
  • Multi-Language - Support for English, Portuguese, Spanish, French, German, and more
  • Privacy First - Your audio never leaves your device
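This page does not describe how the Smart Dictionary works internally. As a rough illustration of the general idea only, here is a minimal Python sketch that assumes custom words are applied as a text replacement step after transcription. The function name apply_dictionary and the example entries are hypothetical and are not part of Vox's API.

```python
import re


def apply_dictionary(text: str, dictionary: dict[str, str]) -> str:
    """Replace misrecognized phrases with preferred spellings.

    Hypothetical helper for illustration; not Vox's actual implementation.
    """
    if not dictionary:
        return text
    # Try longer phrases first so multi-word entries win over shorter ones.
    keys = sorted(dictionary, key=len, reverse=True)
    # Match whole words or phrases only, ignoring case.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in dictionary.items()}
    # Fall back to the matched text if the lowercase key is not found.
    return pattern.sub(
        lambda m: lookup.get(m.group(1).lower(), m.group(1)), text
    )


# Assumed example entries: what the recognizer heard -> what the user wants.
custom_terms = {"box": "Vox", "deep seek": "DeepSeek", "bed rock": "Bedrock"}
print(apply_dictionary("box sends text to deep seek or Bed Rock.", custom_terms))
```

Matching on word boundaries keeps an entry such as "box" from altering unrelated words like "inbox". A real implementation might instead bias the recognizer itself toward the custom vocabulary.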

Quick Start

  1. Download Vox from the official website
  2. Install and launch the application
  3. Grant required permissions (Microphone, Accessibility, Keychain)
  4. Download a speech model (Accurate recommended for most users)
  5. Start recording: hold ⌘ + Space to record while the keys are held, or press ⌘ + ⌥ + Space to toggle recording on and off

New to Vox?

Follow our Getting Started guide for detailed setup instructions.

Built with 💜 by the open-source community & core contributors