# Configuration
FileMind stores its configuration in a TOML file. All settings have sensible defaults — most users won't need to change anything beyond what the setup wizard configures.
## Config File Location
| Platform | Path |
|---|---|
| Windows | `%APPDATA%\FileMind\config.toml` |
| macOS | `~/Library/Application Support/FileMind/config.toml` |
You can also specify a custom path with the `--config` flag.
## Environment Variable Overrides

Any setting can be overridden with an environment variable using the `FILEMIND_` prefix and double underscores for nesting:

```shell
FILEMIND_LLM__PROVIDER=anthropic
FILEMIND_LLM__API_KEY=sk-ant-...
FILEMIND_RENAME__AUTO_APPROVE_THRESHOLD=0.8
```
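The mapping from variable names to nested TOML keys can be sketched as follows. This is an illustrative helper, not part of FileMind's API, and it leaves values as strings (a real loader would also coerce types):

```python
import os

def env_overrides(environ=None, prefix="FILEMIND_"):
    """Fold FILEMIND_* variables into a nested dict mirroring the TOML layout.
    A double underscore separates the table name from the key name."""
    overrides = {}
    for name, value in (environ or os.environ).items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return overrides

env_overrides({"FILEMIND_LLM__PROVIDER": "anthropic"})
# -> {'llm': {'provider': 'anthropic'}}
```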
## Database

```toml
[db]
path = "" # Empty = auto-detect app data directory
```
## PDF Ingestion

```toml
[ingestion]
metadata_pages = 2 # Pages to extract for metadata (1-10)
text_quality_min_chars = 400 # Min chars for good text quality (50-5000)
text_quality_min_alpha = 0.55 # Min alphabetic character ratio (0.1-0.95)
text_quality_max_replacement = 0.01 # Max replacement character ratio
```
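How the three text-quality thresholds might combine into a single accept/reject decision — an illustrative sketch; the function name and exact rule are assumptions, not FileMind's actual code:

```python
def text_quality_ok(text, min_chars=400, min_alpha=0.55, max_replacement=0.01):
    """Accept extracted PDF text only if it is long enough, mostly alphabetic,
    and nearly free of U+FFFD replacement characters (a symptom of broken
    text encoding in the source PDF)."""
    if len(text) < min_chars:
        return False
    alpha_ratio = sum(c.isalpha() for c in text) / len(text)
    replacement_ratio = text.count("\ufffd") / len(text)
    return alpha_ratio >= min_alpha and replacement_ratio <= max_replacement
```

Pages whose extracted text fails checks like these are the natural candidates for OCR.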
## OCR

```toml
[ocr]
provider = "paddleocr" # "paddleocr", "null", or "external"
render_dpi = 300 # Page render DPI for OCR (72-600)
paddleocr_lang = "en" # Language code
paddleocr_use_gpu = false # Enable GPU acceleration
paddleocr_use_angle_cls = false # Enable angle classification
paddleocr_preprocess = true # Enable image preprocessing
```
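To see what `render_dpi` means in practice: pixel dimensions scale linearly with DPI, so pixel count (and OCR cost) grows quadratically. A worked example, not FileMind code:

```python
def render_size_px(width_in, height_in, render_dpi=300):
    """Pixel dimensions a page is rasterized to before OCR. Higher DPI gives
    the OCR engine more detail at a quadratic cost in pixels."""
    return round(width_in * render_dpi), round(height_in * render_dpi)

render_size_px(8.5, 11)  # US Letter at 300 DPI -> (2550, 3300)
```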
## Embeddings

```toml
[embeddings]
provider = "sentence-transformers"
model = "all-MiniLM-L6-v2" # Downloaded automatically on first use
dims = 384 # Must match model output dimensions
batch_size = 32 # Chunks per embedding batch
```
## Language Model

```toml
[llm]
provider = "ollama" # "ollama", "llamacpp", "openai", "anthropic", "gemini"
model = "mistral" # Model name or path
base_url = "" # API base URL override (optional)
api_key = "" # API key for cloud providers
temperature = 0.1 # Sampling temperature (0.0-2.0)
max_retries = 2 # Retry count on parse failure
timeout_seconds = 120 # Per-request timeout (5-600)
```

For cloud providers, use the provider-specific key fields:
```toml
[llm]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
anthropic_api_key = "sk-ant-..."
```

```toml
# Or for OpenAI:
[llm]
provider = "openai"
model = "gpt-4o"
api_key = "sk-..."
```

```toml
# Or for Gemini:
[llm]
provider = "gemini"
model = "gemini-2.0-flash"
gemini_api_key = "..."
```
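The `max_retries` setting applies when a model reply fails to parse. The loop below is a minimal sketch of that behavior, assuming a JSON reply format; `call_model` is a hypothetical callable that returns raw model text:

```python
import json

def parse_with_retries(call_model, max_retries=2):
    """Invoke the model up to max_retries + 1 times, returning the first reply
    that parses as JSON; re-raise the last parse error if all attempts fail."""
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_model()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err
    raise last_error
```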
## OpenAlex Integration

```toml
[openalex]
enabled = true # Enable title matching against OpenAlex
api_key = "" # Optional API key
mailto = "" # Contact email (recommended for higher rate limits)
timeout_seconds = 10 # API timeout (1-60)
title_match_threshold = 0.85 # Minimum match confidence (0.5-1.0)
```
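An illustrative way to read `title_match_threshold`: match confidence is a similarity score in [0, 1], so a 0.85 cutoff tolerates case and spacing differences but rejects different titles. FileMind's actual scorer may differ; this sketch uses Python's `difflib`:

```python
from difflib import SequenceMatcher

def title_similarity(local_title, candidate_title):
    """Compare whitespace- and case-normalized titles; 1.0 is an exact match."""
    a = " ".join(local_title.lower().split())
    b = " ".join(candidate_title.lower().split())
    return SequenceMatcher(None, a, b).ratio()

title_similarity("Attention Is All  You Need", "attention is all you need")  # 1.0
```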
## Rename Settings

```toml
[rename]
template = "default" # Template style
title_max_chars = 50 # Max title portion in filename (20-200)
filename_max_chars = 160 # Max total filename length (60-255)
auto_approve_threshold = 0.75 # Confidence for auto-approve
propose_threshold = 0.55 # Minimum confidence to propose
```

`auto_approve_threshold` must be greater than `propose_threshold`.
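The two thresholds partition confidence scores into three outcomes. A sketch of that logic (the exact boundary handling is an assumption):

```python
def rename_action(confidence, auto_approve_threshold=0.75, propose_threshold=0.55):
    """Map a rename-confidence score to an action: apply the rename
    automatically, propose it for manual review, or skip the file."""
    if confidence >= auto_approve_threshold:
        return "auto-approve"
    if confidence >= propose_threshold:
        return "propose"
    return "skip"
```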
## Search & RAG
```toml
[search]
default_mode = "hybrid" # "hybrid", "fts", or "semantic"
default_limit = 40 # Default results per query
semantic_min_score = 0.15 # Minimum embedding similarity
rag_top_k = 12 # Chunks to retrieve for RAG
rag_min_score = 0.3 # Minimum chunk score for RAG
reranker_enabled = true # Enable cross-encoder reranking
reranker_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
```
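How `rag_top_k` and `rag_min_score` might interact when assembling context — a sketch, where `scored_chunks` is a hypothetical list of `(score, text)` pairs:

```python
def select_rag_context(scored_chunks, rag_top_k=12, rag_min_score=0.3):
    """Drop chunks below the score floor, then keep the best rag_top_k of the
    survivors; fewer than rag_top_k chunks may remain if the floor bites."""
    kept = [pair for pair in scored_chunks if pair[0] >= rag_min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:rag_top_k]
```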
## Background Jobs

```toml
[jobs]
max_concurrent = 2 # Max parallel jobs (1-8)
scan_batch_size = 50 # Files per scan batch
embed_batch_size = 32 # Chunks per embedding batch
```
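Batching as in `scan_batch_size` can be sketched as a simple chunking generator (illustrative, not FileMind's implementation):

```python
def batches(items, batch_size=50):
    """Yield consecutive batch_size-sized slices; the final batch may be short."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

list(batches(list(range(5)), batch_size=2))  # [[0, 1], [2, 3], [4]]
```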
## Service

```toml
[service]
host = "127.0.0.1" # Always localhost for security
port = 0 # 0 = auto-select free port
workers = 1 # Uvicorn workers
log_level = "info" # "debug", "info", "warning", "error"
```
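`port = 0` relies on standard socket behavior: binding to port 0 asks the operating system for any unused port. A quick demonstration:

```python
import socket

def pick_free_port(host="127.0.0.1"):
    """Bind to port 0 and report the port the OS actually assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```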
## Zotero

See Export & Integration for Zotero configuration.