Private AI for Your Notes: Local Ollama vs Bring-Your-Own Key

AI assistance for notes sounds useful until you remember that your notes are private. Meeting notes, health research, half-formed ideas you would never share publicly: they all live in your notes app. The question worth asking before turning on any AI feature is: where do my words actually go?

NoteLace takes a simple position on this. The AI assistant is optional and off by default. When you do enable it, the connection is between your app and the provider you chose: NoteLace never runs a model, never proxies your prompts, and never sees your notes. Two approaches are supported, and which one fits you depends on how you weigh privacy, capability, cost, and whether you need to work offline.

Option 1: Local Model via Ollama

Ollama lets you run open-weight language models entirely on your own machine. Once installed, you point NoteLace at your local Ollama endpoint and the model runs there; nothing leaves your computer.

What this looks like in practice:

# Pull a model once
ollama pull llama3

# Ollama runs a local server at http://localhost:11434
# Point NoteLace at that endpoint in Settings > AI Assistant

Every prompt goes from the app to Ollama on your own machine and back. No network call leaves your device. You can work offline, air-gapped, or on a flight; the model is always available as long as Ollama is running.

Trade-offs to know:

Capability: The models that fit comfortably on a laptop are smaller than frontier hosted models. They are genuinely useful for summarising, rephrasing, and continuing a thought, but they may struggle with complex reasoning or very long documents.
Hardware: A dedicated GPU (or Apple Silicon with unified memory) makes a real difference. A good consumer machine can run a 7B or 13B parameter model comfortably; older or lower-memory machines will be slower.
Cost: Free to run once you have the hardware. No subscriptions, no per-token billing.

If your notes are sensitive (medical, legal, personal), a local model is the right default. The privacy guarantee is absolute: the model runs on your hardware and your data never travels.

See private AI notes for a deeper look at why local inference matters for note-taking specifically.

Option 2: Bring Your Own API Key

If you already pay for OpenAI, Anthropic, Mistral, or another hosted provider, you can configure NoteLace to use that account directly. You paste your API key into Settings; from that point on the app sends your selected text to that provider's API and returns the result to you.

What this means for privacy:

Your selected text goes to the hosted provider under your account and their terms of service. The models at these providers are generally more capable than what runs locally, especially for longer documents or tasks that need broader reasoning. But your text does leave your device.

This is the right choice when:

You already use and trust a particular provider
You need the best possible output quality: long-form editing, complex summarisation
Your notes are not highly sensitive, or you are comfortable with the provider's data policies
You want the best results without investing in local GPU hardware

Costs are between you and your provider. NoteLace does not charge anything for AI usage: there is no markup, no proxy, and no usage tracked on our end.

How the Assistant Actually Works

Whether you use Ollama or a hosted key, the workflow is the same:

Select some text in the editor.
Open the command palette or right-click and choose an action: Improve writing, Summarise, or Continue.
The request goes directly from your app to the endpoint you configured.
The result comes back and you choose whether to accept it.

The assistant only acts on text you explicitly select and only when you explicitly invoke it. It does not read your whole vault, does not run in the background, and does not index your notes. It is a deliberate, scoped action, not a background agent.

Which Should You Choose?

	Local Ollama	Bring your own key
Privacy	Maximum: nothing leaves your device	Good: goes to your provider, not NoteLace
Capability	Solid for everyday tasks	Best-in-class models available
Cost	Free to run	Pay-per-token to your provider
Offline	Yes	No
Setup	Install Ollama + pull a model	Paste an API key

A reasonable starting point: use Ollama for anything sensitive or offline, and bring a hosted key when you want the best quality on less sensitive material.

Notes Stay in Your Local Database

This is worth saying clearly: your notes live in a local-first database on your device. They are not uploaded to NoteLace servers. You can export any note to .md at any time. The AI assistant (whichever approach you choose) works on text you hand it, not on a copy of your database that lives in the cloud.

Optional end-to-end encrypted sync is available if you want your notes on multiple devices, but it is off by default, and even then the sync server cannot read your notes.

Getting Started

The AI assistant is in Settings under AI Assistant. Select your mode (Ollama or API key), fill in the endpoint or key, and it is ready. If you are new to Ollama, their documentation gets a model running in a few minutes.

NoteLace is available for macOS, Windows, and Linux with a 14-day card-free trial.