FileMind

Model Providers

FileMind uses an AI model for three things: metadata extraction fallback when DOI/heuristic parsing fails, the Ask My Library RAG feature, and optional query rewriting. You can run that model entirely on your own machine with Ollama, or call out to a cloud provider like Anthropic or OpenAI.

This page walks through each option end to end — assuming you don't yet have an account. Pick whichever provider fits your privacy, cost, and quality preferences. You can change providers at any time from Settings → AI Model or by editing config.toml.

Choosing a Provider

Provider Where it runs Cost Privacy Best for
Ollama Your machine Free Nothing leaves your computer Local-first users, sensitive libraries, offline use
Anthropic (Claude) Cloud API Pay per token Snippets sent to API; not used for training Highest extraction and Q&A quality
OpenAI (GPT) Cloud API Pay per token Snippets sent to API; API traffic is not used for training by default Strong all-rounder, broad model selection

When using a cloud provider, FileMind only sends short text snippets (titles, abstracts, or the chunks needed to answer your question) — never the full PDF. Embeddings and OCR always run locally regardless of which LLM provider you choose.

Setting Up Ollama (Local)

Ollama is a free, open-source runtime that lets you run large language models on your own machine. It is the default provider FileMind suggests during the Setup Wizard. Everything stays on your computer — no account, no API key, no internet connection required after the model is downloaded.

Step 1 — Check your hardware

Local models use RAM (and a GPU, if you have one). Use this as a rough guide:

RAMRecommended modelDisk
8 GBllama3.2:3b~2 GB
16 GBmistral~4 GB
32 GB+llama3.1:8b or qwen2.5:14b~5–9 GB

A dedicated GPU (NVIDIA CUDA on Windows/Linux, Apple Silicon on Mac) makes responses noticeably faster, but it is not required. CPU-only systems work fine, just slower.

Step 2 — Install Ollama

The FileMind Setup Wizard can install Ollama for you on first run. If you'd rather do it yourself, or the wizard couldn't reach the network:

  1. Visit ollama.com/download
  2. Download the installer for your platform:
    • Windows — run OllamaSetup.exe and follow the prompts
    • macOS — open the .dmg and drag Ollama to Applications
    • Linux — run the one-line install command shown on the download page
  3. Launch Ollama once. It runs as a background service and listens on http://localhost:11434.

Verify the install by opening a terminal and running:

ollama --version

Step 3 — Download a model

Pull a model that matches your hardware (see Step 1). In a terminal:

# Recommended default for 16 GB RAM
ollama pull mistral

# Smaller, for 8 GB RAM
ollama pull llama3.2:3b

# Larger, for 32 GB+ RAM
ollama pull llama3.1:8b

The first pull downloads several gigabytes; subsequent runs are instant. Confirm the model is available:

ollama list

Step 4 — Point FileMind at Ollama

Open Settings → AI Model in FileMind and choose Ollama (local), then pick the model you pulled. Or edit config.toml directly:

[llm]
provider = "ollama"
model = "mistral"
base_url = "http://localhost:11434"  # default — only set if you customized Ollama's port
temperature = 0.1
timeout_seconds = 120

Step 5 — Verify

  1. Click Test connection in Settings → AI Model
  2. Or re-run the wizard from Settings → Setup Wizard — the final step issues a real test query

If the test fails, see the Ollama troubleshooting section. The most common issue is that the Ollama service isn't running — open a terminal and run ollama serve to start it manually.

Setting Up Anthropic (Claude)

Anthropic's Claude models tend to produce the highest-quality metadata extraction and citations in Ask My Library. You'll need to create an account, add a payment method, and generate an API key — total time, about five minutes.

Step 1 — Create an Anthropic account

  1. Go to console.anthropic.com
  2. Click Sign up and register with email or Google
  3. Verify your email address from the confirmation message
  4. Complete the short onboarding form (name, organization, intended use)

Note: this is the API console, not the consumer Claude.ai chat product. They use separate accounts and separate billing — a Claude.ai Pro subscription does not include API access.

Step 2 — Add billing and credits

New Anthropic API accounts ship with no credits, so the first API call will fail until you add a payment method.

  1. In the console, open Settings → Billing (or Plans & Billing)
  2. Add a credit or debit card
  3. Purchase a starting credit balance — $5–$10 is plenty to evaluate FileMind on a typical library
  4. Optional: enable auto-reload so usage doesn't stop mid-scan

Anthropic also requires you to be on a paid usage tier before unlocking higher rate limits. Tier 1 is the default once you've added a payment method.

Step 3 — Generate an API key

  1. In the console, open Settings → API Keys
  2. Click Create Key
  3. Give the key a descriptive name (e.g. filemind-laptop) and create it
  4. Copy the key immediately — it begins with sk-ant- and is only shown once. If you lose it, you'll need to revoke it and create a new one.

Treat the key like a password. Anyone with it can spend your credits.

Step 4 — Point FileMind at Anthropic

In Settings → AI Model, choose Anthropic, paste the key, and pick a model. Or edit config.toml:

[llm]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
anthropic_api_key = "sk-ant-..."
temperature = 0.1
timeout_seconds = 120

Suggested models:

  • claude-sonnet-4-... — best balance of quality and price; the recommended default
  • claude-haiku-4-5-... — cheapest and fastest; great for high-volume libraries where Sonnet would be overkill
  • claude-opus-4-... — highest quality; reserve for tricky extraction cases

Use the latest model identifiers shown on the Anthropic models page. If you'd rather not paste the key into a config file, set FILEMIND_LLM__ANTHROPIC_API_KEY as an environment variable instead — see Environment Variable Overrides.

Step 5 — Verify and watch costs

  1. Click Test connection in Settings → AI Model to issue a one-token probe
  2. Run a small scan (a folder with ~10 PDFs) and check the Usage tab in the Anthropic console
  3. Set a monthly spend limit in the console under Billing → Limits to cap unexpected usage

Typical FileMind usage is small — a few cents per hundred PDFs for metadata extraction. The bulk of cost (if any) comes from heavy Ask My Library use.

Setting Up OpenAI (GPT)

OpenAI's GPT models are a strong all-rounder and offer a wide selection from very cheap (gpt-4o-mini) to top-tier (gpt-4o and successors). Setup takes about five minutes.

Step 1 — Create an OpenAI Platform account

  1. Go to platform.openai.com/signup
  2. Sign up with email, Google, Microsoft, or Apple
  3. Verify your email address
  4. Verify a phone number — required to unlock API access
  5. Complete the onboarding (name, organization name, intended use)

As with Anthropic, this is the developer Platform, separate from a ChatGPT Plus subscription. ChatGPT Plus does not include API credits.

Step 2 — Add billing and prepay for credits

OpenAI's API uses prepaid credits. You must add a payment method and buy a starting balance before any API call will succeed.

  1. Open Settings → Billing
  2. Click Add payment method and enter a card
  3. Click Add to credit balance and purchase $5–$10 to start
  4. Optional: enable Auto recharge so scans don't pause when the balance runs low

New accounts start at Usage Tier 1, which has modest rate limits. After a few days of paid usage you'll automatically move to higher tiers with looser limits.

Step 3 — Generate an API key

  1. Go to platform.openai.com/api-keys
  2. Click Create new secret key
  3. Name it (e.g. filemind-laptop) and, optionally, restrict permissions to Read & write → Model capabilities only
  4. Copy the key immediately — it begins with sk- (or sk-proj-) and is only shown once. Lost keys must be revoked and recreated.

If you belong to multiple OpenAI organizations, double-check the org dropdown in the top-left of the Platform — keys are scoped per org, and using the wrong org will return billing or permission errors.

Step 4 — Point FileMind at OpenAI

In Settings → AI Model, choose OpenAI, paste the key, and pick a model. Or edit config.toml:

[llm]
provider = "openai"
model = "gpt-4o"
api_key = "sk-..."
temperature = 0.1
timeout_seconds = 120

Suggested models:

  • gpt-4o-mini — cheapest; ideal for metadata extraction at scale
  • gpt-4o — strong default for both extraction and Ask My Library
  • Newer flagship models (e.g. gpt-4.1, o4-mini reasoning models) are supported as soon as your key has access

To use an Azure OpenAI deployment or a self-hosted compatible endpoint, set base_url to the OpenAI-compatible URL.

Prefer not to write the key into config.toml? Set the environment variable FILEMIND_LLM__API_KEY instead.

Step 5 — Verify and cap spend

  1. Click Test connection in Settings → AI Model
  2. Run a small scan and watch the Usage dashboard at platform.openai.com/usage
  3. Set a hard monthly budget at Settings → Limits — once reached, the API stops responding instead of charging more

Switching Providers Later

You can switch providers at any time. FileMind only stores extracted metadata, embeddings, and chunk indices — none of which are tied to the LLM you used. After switching:

  • Existing rename proposals are unaffected (they were already generated)
  • New scans use the new provider
  • Ask My Library answers immediately use the new provider

To run side-by-side comparisons, change the provider, ask the same question, and check the cited sources. Embeddings, search ranking, and reranking are provider-independent.

Privacy Reference

  • Ollama — no network calls; everything is on-device.
  • Anthropic — only the snippets needed for the current request leave your machine. Anthropic's API terms state that API inputs are not used to train models by default.
  • OpenAI — same as above; API inputs are not used for training by default. Review the relevant provider's data-use page if your library is sensitive.
  • OCR text extraction, embedding generation, hybrid search, and reranking are local in every configuration.

Next Steps