How to Use Local AI During a Proctored Exam on Mac
Local AI means: no cloud, no API call, no traffic to OpenAI or Anthropic. The model runs on your Mac via Ollama, llama.cpp, or LM Studio. The LDBypass overlay carries the chat UI for whichever you choose. The combination - invisible-to-capture overlay plus on-device LLM - is the most private setup possible.
Three paths to local: Ollama (one-click install, simplest), LM Studio (Mac app with model browser), llama.cpp (raw, command-line, fastest). All three expose an HTTP API on localhost that the LDBypass overlay can point at. Mistral 7B and Llama 3.1 8B are the standard picks for routine homework; DeepSeek Coder for code. Pre-warm the model by running one prompt before the exam (cold-start can be slow). During the exam, the overlay carries the chat interface; the model runs on your Mac's Apple Silicon GPU. Nothing leaves the device.
Key points
- Three paths: Ollama (easiest), LM Studio (prettiest), llama.cpp (rawest).
- Mistral 7B / Llama 3.1 8B / DeepSeek Coder are the practical picks.
- Pre-warm the model before the exam to avoid cold-start lag.
- Plug in - inference is GPU-heavy, drains battery.
- Zero outbound traffic = no proctor network detection.
How it works
Pre-exam (one-time): brew install ollama # or download from ollama.com ollama pull mistral ollama serve Configure overlay: Settings -> AI provider -> Ollama (or URL: http://localhost:11434/) During exam: Ctrl+Cmd+L -> overlay shows local AI Type. Localhost only.
Common questions
Is local AI as good as cloud?
7B local ~ GPT-3.5; cloud frontier ~ GPT-4. For routine homework explanation, local is fine. For hard problems, cloud is stronger.
Will the proctor app see Ollama running?
It can enumerate processes and find Ollama by name. Use llama.cpp with a renamed binary if you want zero process-list footprint.
Can I run multiple models at once?
Yes if you have RAM. 16 GB Mac comfortably runs Mistral 7B + DeepSeek Coder 6.7B simultaneously.