Model Providers
FileMind uses an AI model for three things: metadata extraction fallback when DOI/heuristic parsing fails, the Ask My Library RAG feature, and optional query rewriting. You can run that model entirely on your own machine with Ollama, or call out to a cloud provider like Anthropic or OpenAI.
This page walks through each option end to end — assuming you don't yet have an account. Pick whichever provider fits your privacy, cost, and quality preferences. You can change providers at any time from Settings → AI Model or by editing config.toml.
Choosing a Provider
| Provider | Where it runs | Cost | Privacy | Best for |
|---|---|---|---|---|
| Ollama | Your machine | Free | Nothing leaves your computer | Local-first users, sensitive libraries, offline use |
| Anthropic (Claude) | Cloud API | Pay per token | Snippets sent to API; not used for training | Highest extraction and Q&A quality |
| OpenAI (GPT) | Cloud API | Pay per token | Snippets sent to API; API traffic is not used for training by default | Strong all-rounder, broad model selection |
When using a cloud provider, FileMind only sends short text snippets (titles, abstracts, or the chunks needed to answer your question) — never the full PDF. Embeddings and OCR always run locally regardless of which LLM provider you choose.
Setting Up Ollama (Local)
Ollama is a free, open-source runtime that lets you run large language models on your own machine. It is the default provider FileMind suggests during the Setup Wizard. Everything stays on your computer — no account, no API key, no internet connection required after the model is downloaded.
Step 1 — Check your hardware
Local models use RAM (and a GPU, if you have one). Use this as a rough guide:
| RAM | Recommended model | Disk |
|---|---|---|
| 8 GB | llama3.2:3b | ~2 GB |
| 16 GB | mistral | ~4 GB |
| 32 GB+ | llama3.1:8b or qwen2.5:14b | ~5–9 GB |
A dedicated GPU (NVIDIA CUDA on Windows/Linux, Apple Silicon on Mac) makes responses noticeably faster, but it is not required. CPU-only systems work fine, just slower.
Step 2 — Install Ollama
The FileMind Setup Wizard can install Ollama for you on first run. If you'd rather do it yourself, or the wizard couldn't reach the network:
- Visit ollama.com/download
-
Download the installer for your platform:
- Windows — run
OllamaSetup.exeand follow the prompts - macOS — open the
.dmgand drag Ollama to Applications - Linux — run the one-line install command shown on the download page
- Windows — run
- Launch Ollama once. It runs as a background service and listens on
http://localhost:11434.
Verify the install by opening a terminal and running:
ollama --version Step 3 — Download a model
Pull a model that matches your hardware (see Step 1). In a terminal:
# Recommended default for 16 GB RAM
ollama pull mistral
# Smaller, for 8 GB RAM
ollama pull llama3.2:3b
# Larger, for 32 GB+ RAM
ollama pull llama3.1:8b The first pull downloads several gigabytes; subsequent runs are instant. Confirm the model is available:
ollama list Step 4 — Point FileMind at Ollama
Open Settings → AI Model in FileMind and choose Ollama (local),
then pick the model you pulled. Or edit config.toml directly:
[llm]
provider = "ollama"
model = "mistral"
base_url = "http://localhost:11434" # default — only set if you customized Ollama's port
temperature = 0.1
timeout_seconds = 120 Step 5 — Verify
- Click Test connection in Settings → AI Model
- Or re-run the wizard from Settings → Setup Wizard — the final step issues a real test query
If the test fails, see the Ollama troubleshooting section.
The most common issue is that the Ollama service isn't running — open a terminal and run
ollama serve to start it manually.
Setting Up Anthropic (Claude)
Anthropic's Claude models tend to produce the highest-quality metadata extraction and citations in Ask My Library. You'll need to create an account, add a payment method, and generate an API key — total time, about five minutes.
Step 1 — Create an Anthropic account
- Go to console.anthropic.com
- Click Sign up and register with email or Google
- Verify your email address from the confirmation message
- Complete the short onboarding form (name, organization, intended use)
Note: this is the API console, not the consumer Claude.ai chat product. They use separate accounts and separate billing — a Claude.ai Pro subscription does not include API access.
Step 2 — Add billing and credits
New Anthropic API accounts ship with no credits, so the first API call will fail until you add a payment method.
- In the console, open Settings → Billing (or Plans & Billing)
- Add a credit or debit card
- Purchase a starting credit balance — $5–$10 is plenty to evaluate FileMind on a typical library
- Optional: enable auto-reload so usage doesn't stop mid-scan
Anthropic also requires you to be on a paid usage tier before unlocking higher rate limits. Tier 1 is the default once you've added a payment method.
Step 3 — Generate an API key
- In the console, open Settings → API Keys
- Click Create Key
- Give the key a descriptive name (e.g.
filemind-laptop) and create it - Copy the key immediately — it begins with
sk-ant-and is only shown once. If you lose it, you'll need to revoke it and create a new one.
Treat the key like a password. Anyone with it can spend your credits.
Step 4 — Point FileMind at Anthropic
In Settings → AI Model, choose Anthropic, paste the key,
and pick a model. Or edit config.toml:
[llm]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
anthropic_api_key = "sk-ant-..."
temperature = 0.1
timeout_seconds = 120 Suggested models:
claude-sonnet-4-...— best balance of quality and price; the recommended defaultclaude-haiku-4-5-...— cheapest and fastest; great for high-volume libraries where Sonnet would be overkillclaude-opus-4-...— highest quality; reserve for tricky extraction cases
Use the latest model identifiers shown on the
Anthropic models page.
If you'd rather not paste the key into a config file, set
FILEMIND_LLM__ANTHROPIC_API_KEY as an environment variable instead — see
Environment Variable Overrides.
Step 5 — Verify and watch costs
- Click Test connection in Settings → AI Model to issue a one-token probe
- Run a small scan (a folder with ~10 PDFs) and check the Usage tab in the Anthropic console
- Set a monthly spend limit in the console under Billing → Limits to cap unexpected usage
Typical FileMind usage is small — a few cents per hundred PDFs for metadata extraction. The bulk of cost (if any) comes from heavy Ask My Library use.
Setting Up OpenAI (GPT)
OpenAI's GPT models are a strong all-rounder and offer a wide selection from very cheap
(gpt-4o-mini) to top-tier (gpt-4o and successors). Setup takes about
five minutes.
Step 1 — Create an OpenAI Platform account
- Go to platform.openai.com/signup
- Sign up with email, Google, Microsoft, or Apple
- Verify your email address
- Verify a phone number — required to unlock API access
- Complete the onboarding (name, organization name, intended use)
As with Anthropic, this is the developer Platform, separate from a ChatGPT Plus subscription. ChatGPT Plus does not include API credits.
Step 2 — Add billing and prepay for credits
OpenAI's API uses prepaid credits. You must add a payment method and buy a starting balance before any API call will succeed.
- Open Settings → Billing
- Click Add payment method and enter a card
- Click Add to credit balance and purchase $5–$10 to start
- Optional: enable Auto recharge so scans don't pause when the balance runs low
New accounts start at Usage Tier 1, which has modest rate limits. After a few days of paid usage you'll automatically move to higher tiers with looser limits.
Step 3 — Generate an API key
- Go to platform.openai.com/api-keys
- Click Create new secret key
- Name it (e.g.
filemind-laptop) and, optionally, restrict permissions to Read & write → Model capabilities only - Copy the key immediately — it begins with
sk-(orsk-proj-) and is only shown once. Lost keys must be revoked and recreated.
If you belong to multiple OpenAI organizations, double-check the org dropdown in the top-left of the Platform — keys are scoped per org, and using the wrong org will return billing or permission errors.
Step 4 — Point FileMind at OpenAI
In Settings → AI Model, choose OpenAI, paste the key,
and pick a model. Or edit config.toml:
[llm]
provider = "openai"
model = "gpt-4o"
api_key = "sk-..."
temperature = 0.1
timeout_seconds = 120 Suggested models:
gpt-4o-mini— cheapest; ideal for metadata extraction at scalegpt-4o— strong default for both extraction and Ask My Library- Newer flagship models (e.g.
gpt-4.1,o4-minireasoning models) are supported as soon as your key has access
To use an Azure OpenAI deployment or a self-hosted compatible endpoint, set
base_url to the OpenAI-compatible URL.
Prefer not to write the key into config.toml? Set the environment variable
FILEMIND_LLM__API_KEY instead.
Step 5 — Verify and cap spend
- Click Test connection in Settings → AI Model
- Run a small scan and watch the Usage dashboard at platform.openai.com/usage
- Set a hard monthly budget at Settings → Limits — once reached, the API stops responding instead of charging more
Switching Providers Later
You can switch providers at any time. FileMind only stores extracted metadata, embeddings, and chunk indices — none of which are tied to the LLM you used. After switching:
- Existing rename proposals are unaffected (they were already generated)
- New scans use the new provider
- Ask My Library answers immediately use the new provider
To run side-by-side comparisons, change the provider, ask the same question, and check the cited sources. Embeddings, search ranking, and reranking are provider-independent.
Privacy Reference
- Ollama — no network calls; everything is on-device.
- Anthropic — only the snippets needed for the current request leave your machine. Anthropic's API terms state that API inputs are not used to train models by default.
- OpenAI — same as above; API inputs are not used for training by default. Review the relevant provider's data-use page if your library is sensitive.
- OCR text extraction, embedding generation, hybrid search, and reranking are local in every configuration.
Next Steps
- Re-run the Setup Wizard to switch providers interactively
- All language-model configuration options
- Troubleshooting common provider issues