FileMind

Search & Ask

FileMind gives you three ways to find information in your library, plus an AI-powered Q&A feature that answers questions using your own papers — with citations.

Search Modes

Keyword Search (FTS)

Full-text keyword search powered by SQLite FTS5. Finds exact words and phrases across every page of every paper. Best when you know the specific terms you're looking for.
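An FTS5-backed page index can be sketched with Python's built-in sqlite3 module. The table name, columns, and sample rows below are illustrative only, not FileMind's actual schema:

```python
import sqlite3

# In-memory sketch of an FTS5 page index (illustrative schema, not FileMind's).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(file, page, body)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("smith2020.pdf", "3", "Sintering temperature was held at 1200 C."),
        ("lee2021.pdf", "7", "We apply self-attention over token embeddings."),
    ],
)

# MATCH supports quoted phrases and boolean operators (AND / OR / NOT);
# snippet() returns the matching passage with the hit marked up.
rows = conn.execute(
    "SELECT file, page, snippet(pages, 2, '[', ']', '...', 8) "
    "FROM pages WHERE pages MATCH ? ORDER BY rank",
    ('"sintering temperature"',),
).fetchall()
print(rows)
```

The `snippet()` call is what produces the highlighted text excerpt shown in each search result.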

Semantic Search

Meaning-based search using sentence-transformer embeddings (all-MiniLM-L6-v2 by default). Finds papers about a concept even if they use different terminology — searching for "attention mechanisms" will find papers that discuss "self-attention" or "transformer architecture."
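The ranking mechanics can be sketched without downloading a model. The `embed` function below is a toy bag-of-words stand-in for all-MiniLM-L6-v2 (which produces 384-dimensional dense vectors and does capture synonymy); only the cosine-similarity ranking step carries over to the real system:

```python
import math
from collections import Counter

# Toy stand-in for a sentence-transformer: a bag-of-words count vector.
# The real all-MiniLM-L6-v2 model maps text to dense 384-dim vectors; only
# the cosine-similarity ranking below works the same way.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = {
    "lee2021.pdf p7": "self attention layers in the transformer architecture",
    "smith2020.pdf p3": "sintering temperature of the ceramic sample",
}
query = embed("attention mechanisms in transformers")
ranked = sorted(chunks, key=lambda k: cosine(query, embed(chunks[k])), reverse=True)
print(ranked[0])  # the transformer paper ranks first
```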

Hybrid Search (Default)

Combines keyword and semantic search, merging and deduplicating results for the best coverage. This is the default mode and works well for most queries.
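One common way to merge and deduplicate two ranked result lists is reciprocal-rank fusion; whether FileMind uses RRF or another scheme is not documented here, so treat this as an illustrative sketch:

```python
# Reciprocal-rank fusion sketch: each list contributes 1/(k + rank + 1) per
# document, so a document found by both keyword and semantic search scores
# higher, and duplicates collapse into one entry. (Illustrative; FileMind's
# exact merge strategy is not documented.)
def hybrid_merge(keyword_hits, semantic_hits, k=60):
    scores = {}
    for hits in (keyword_hits, semantic_hits):
        for rank, doc in enumerate(hits):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

merged = hybrid_merge(
    ["a.pdf", "b.pdf", "c.pdf"],   # keyword results
    ["b.pdf", "d.pdf", "a.pdf"],   # semantic results
)
print(merged)  # ['b.pdf', 'a.pdf', 'd.pdf', 'c.pdf']
```

`b.pdf` wins because it ranks well in both lists, which is exactly the coverage benefit hybrid mode aims for.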

Search Filters

Narrow results using faceted filters:

  • Year range — filter by publication year
  • Authors — multi-select author filter
  • Venue / Journal — filter by publication venue
  • Index status — show only fully indexed, partially OCR'd, or errored files
  • Has DOI / arXiv — filter by identifier availability
  • Confidence threshold — filter by metadata confidence score
  • Date added — filter by when files were added to FileMind
  • File size — small, medium, large, extra-large buckets
  • Exclude duplicates — hide duplicate entries

Results can be sorted by relevance, year, date added, or title. You can also group results by file to collapse multiple matches from the same PDF.
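Faceted filters like the ones above typically compose into SQL predicates. The schema and column names below are hypothetical, chosen only to show how a year range and a has-DOI filter might combine with sorting:

```python
import sqlite3

# Hypothetical metadata table; FileMind's real tables and columns are unknown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, year INT, doi TEXT, added TEXT)")
conn.executemany("INSERT INTO papers VALUES (?, ?, ?, ?)", [
    ("A", 2019, None,       "2024-01-02"),
    ("B", 2022, "10.1/xyz", "2024-03-05"),
    ("C", 2023, "10.1/abc", "2023-11-20"),
])

# Year range 2020-2023, has a DOI, sorted by year (newest first).
rows = conn.execute(
    "SELECT title FROM papers "
    "WHERE year BETWEEN ? AND ? AND doi IS NOT NULL "
    "ORDER BY year DESC",
    (2020, 2023),
).fetchall()
titles = [t for (t,) in rows]
print(titles)  # ['C', 'B']
```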

Search Results

Each result shows the paper title, filename, matching page range, and a text snippet with the relevant passage. Click any result to open the PDF at the matching page.

Ask My Library

Type a question in plain English and get an answer drawn from your own papers — with citations backing every claim. FileMind will not answer without a source.

How It Works

  1. Query rewriting — your question is optionally expanded by the LLM for better retrieval
  2. Hybrid retrieval — the top chunks are fetched from both keyword and semantic search (default: 12 chunks)
  3. Optional reranking — a cross-encoder reranker (ms-marco-MiniLM-L-6-v2) re-orders chunks by relevance
  4. Answer generation — the local LLM generates an answer using the retrieved chunks
  5. Citation validation — every claim must cite a provided source, or the answer is rejected and regenerated
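The five steps above can be sketched as a retrieve-generate-validate loop. The `retrieve` and `local_llm` functions below are placeholders standing in for the hybrid index and the Ollama-served model; only the control flow mirrors the pipeline:

```python
import re

def retrieve(question, n=12):
    # Placeholder: the real step merges FTS5 and embedding hits,
    # optionally reranked by a cross-encoder.
    return [("S1", "smith2020.pdf p3", "Sintering was done at 1200 C.")]

def local_llm(prompt):
    # Placeholder for the Ollama-served model.
    return "The sintering temperature was 1200 C [S1]."

def ask(question, max_retries=2):
    chunks = retrieve(question)
    sources = {sid for sid, _, _ in chunks}
    context = "\n".join(f"[{sid}] {text}" for sid, _, text in chunks)
    for _ in range(max_retries):
        answer = local_llm(f"Sources:\n{context}\n\nQ: {question}\nCite every claim.")
        cited = set(re.findall(r"\[(S\d+)\]", answer))
        # Step 5: every citation must resolve to a provided source,
        # otherwise the answer is rejected and regenerated.
        if cited and cited <= sources:
            return answer
    return "insufficient evidence"

print(ask("What temperature was used in sintering?"))
```

The retry loop is what makes the citation requirement enforceable: an answer with missing or dangling citations is never surfaced.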

Citations

Answers include inline citations like [S1], [S2] that reference specific papers and page ranges. The source panel shows the full context for each citation — click any source to open the PDF at that page.

If FileMind can't find enough evidence to answer part of your question, it explicitly says "insufficient evidence" rather than guessing. Claims without valid citations are never shown.
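Citation checking can be sketched at sentence granularity; whether FileMind validates per sentence or per paragraph is not specified here, so the splitting rule below is an assumption:

```python
import re

# Flags sentences that lack a citation or cite an unknown source ID.
# Sentence-level granularity is an assumption for illustration.
def uncited_sentences(answer, valid_ids):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    bad = []
    for s in sentences:
        ids = re.findall(r"\[(S\d+)\]", s)
        if not ids or any(i not in valid_ids for i in ids):
            bad.append(s)
    return bad

flagged = uncited_sentences(
    "Sintering ran at 1200 C [S1]. Cooling took two hours.",
    {"S1"},
)
print(flagged)  # ['Cooling took two hours.']
```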

Tips for Better Results

  • Ask specific questions — "What temperature was used in the sintering process?" works better than "tell me about sintering"
  • Use hybrid search mode for the broadest recall
  • If results seem thin, check that your papers have been fully indexed (look for OCR errors in the dashboard)
  • The AI runs locally via Ollama — larger models generally give better answers but are slower