
Setting Up Ollama with FileMind for Private Q&A

FileMind Team · 2 min read

FileMind’s “Ask My Library” feature uses a local AI model to answer questions about your research papers — with citations. The setup wizard handles most of this automatically, but it helps to understand what’s happening under the hood.

What is Ollama?

Ollama is a tool for running large language models locally. It manages model downloads, provides a simple REST API, and handles GPU/CPU inference. FileMind uses Ollama to run the language model that powers semantic search and Q&A.
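Ollama's REST API is simple enough to call directly. As a minimal sketch using only the Python standard library (the prompt text and model name here are illustrative), a single non-streaming question to a local Ollama server looks like this:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address

def build_generate_request(prompt, model="llama3.2:3b"):
    """Build the JSON body for a non-streaming /api/generate call."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt, model="llama3.2:3b"):
    """Send one question to the local Ollama server and return the answer text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

FileMind layers citation handling and semantic search on top of calls like this, but the request/response shape is the same.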

Automatic setup

When you first launch FileMind, the guided setup wizard:

  1. Detects whether Ollama is already installed
  2. Downloads and runs the Ollama installer if needed
  3. Recommends a model based on your available RAM and GPU
  4. Downloads the model (~2–5 GB depending on model size)
  5. Runs a test query to verify everything works

Most users won't need to do anything manually.
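Step 1 of the wizard, detecting an existing install, amounts to checking whether the `ollama` executable is on your PATH. An illustrative sketch (not FileMind's actual code):

```python
import shutil

def ollama_installed():
    """Return True if the `ollama` executable is found on the PATH."""
    return shutil.which("ollama") is not None
```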

Manual setup

If you prefer to install Ollama yourself, download it from ollama.ai and run:

ollama pull llama3.2:3b

Then launch FileMind and it will automatically detect the running Ollama instance.
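If you want to confirm the server is up before launching FileMind, you can probe Ollama's default port (11434). A minimal check, assuming the default address:

```python
import urllib.request
import urllib.error

def ollama_running(url="http://localhost:11434", timeout=2):
    """Return True if an HTTP server answers at the given Ollama address."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # A running Ollama server answers its root path with HTTP 200.
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```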

Choosing a model

FileMind recommends models based on your hardware. For most researchers, llama3.2:3b (2 GB) works well on 8 GB RAM machines. If you have 16 GB+ RAM or a GPU, llama3.1:8b gives noticeably better Q&A quality.
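The hardware-based recommendation can be sketched as a simple threshold rule. The thresholds below are drawn from the RAM figures above and are illustrative, not FileMind's actual selection logic:

```python
def recommend_model(ram_gb, has_gpu=False):
    """Pick an Ollama model tag from available RAM, per the guidance above."""
    if ram_gb >= 16 or has_gpu:
        return "llama3.1:8b"   # noticeably better Q&A quality
    return "llama3.2:3b"       # ~2 GB model, comfortable on 8 GB RAM
```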