FileMind

Rename Workflow

FileMind transforms chaotic PDF filenames into clean, consistent names like Vaswani_2017_AttentionIsAllYouNeed.pdf. Every proposal is backed by evidence — the exact page and text snippet that justified it. You always have the final say.

How Proposals Are Generated

FileMind uses a cascading extraction strategy — it tries the most reliable method first and only falls back to less certain methods when needed:

  1. DOI / arXiv lookup — if a DOI or arXiv ID is found in the PDF, metadata is fetched from CrossRef or the arXiv API. This is the most reliable source.
  2. Heuristic parsing — extracts title, authors, and year from reading-order text using position and formatting heuristics.
  3. OpenAlex matching — optionally matches the extracted title against the OpenAlex academic database for validation.
  4. LLM fallback — if all deterministic methods fail, the local AI extracts structured metadata from the PDF text. This is a last resort.

The LLM never generates filenames directly. Filenames are always composed from a deterministic template using validated metadata fields.
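
The cascade above can be sketched as a loop that returns the first successful result. This is only an illustration, not FileMind's actual code: the Metadata record, the extract_with_cascade function, and the three stub extractors are hypothetical names, and the optional OpenAlex validation step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Metadata:
    author: str
    year: int
    title: str
    source: str  # which step of the cascade produced this record


# An extractor takes the PDF text and returns Metadata, or None if it cannot help.
Extractor = Callable[[str], Optional[Metadata]]


def extract_with_cascade(pdf_text: str, extractors: Sequence[Extractor]) -> Optional[Metadata]:
    """Try each extractor in priority order and return the first result found."""
    for extractor in extractors:
        result = extractor(pdf_text)
        if result is not None:
            return result
    return None


# Hypothetical stand-ins so the sketch runs without network access or a model.
def doi_lookup_stub(pdf_text: str) -> Optional[Metadata]:
    return None  # pretend no DOI or arXiv ID was found in the text


def heuristic_stub(pdf_text: str) -> Optional[Metadata]:
    return Metadata("Vaswani", 2017, "Attention Is All You Need", source="heuristic")


def llm_stub(pdf_text: str) -> Optional[Metadata]:
    return Metadata("Vaswani", 2017, "Attention Is All You Need", source="llm")


result = extract_with_cascade("sample text", [doi_lookup_stub, heuristic_stub, llm_stub])
print(result.source)
```

Because the heuristic stub succeeds, the LLM stub is never called. The cascade returns metadata fields only; the filename is composed from them in a separate, deterministic step.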

Confidence Tiers

Every proposal gets a confidence score from 0 to 1 based on the quality of the extracted metadata. The score accounts for whether a DOI was found, whether the year is plausible, and whether the title passes quality checks, among other signals.

Tier          Score        Behavior
High          ≥ 0.75       Eligible for bulk approve. Can be auto-applied if you choose.
Medium        0.55 – 0.74  Queued for individual review. Requires explicit approval per item.
Needs Review  < 0.55       Routed to manual review. Never auto-proposed. May need manual metadata correction.

Both thresholds are configurable — see Configuration.
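
As a rough sketch of how a score maps to a tier, assuming the default thresholds of 0.75 and 0.55 (the function name classify_tier and its parameters are hypothetical, not FileMind's API):

```python
def classify_tier(score: float, high: float = 0.75, medium: float = 0.55) -> str:
    """Map a confidence score in [0, 1] to a tier name."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Needs Review"


for s in (0.92, 0.60, 0.30):
    print(s, classify_tier(s))
```

Passing different values for high and medium mirrors changing the two thresholds in the configuration.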

Filename Template

The default template produces filenames like:

  • Vaswani_2017_AttentionIsAllYouNeed.pdf (single author)
  • Vaswani_et_al_2017_AttentionIsAllYouNeed.pdf (multiple authors)

Processing rules:

  • Title: leading articles stripped, special characters removed, title-cased and joined without spaces (as in AttentionIsAllYouNeed), truncated at a word boundary (50 characters by default)
  • Author: last name extracted, diacritics normalized to ASCII, common prefixes handled (van, von, de, etc.)
  • Year: 4-digit, validated against plausible range (1900 to current year + 1)
  • Collision handling: appends _(2), _(3) if a filename already exists
  • Max total length: 160 characters including .pdf
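
The processing rules above could be approximated as follows. This is a minimal sketch under several assumptions: all function names are hypothetical, the prefix list is a small sample, a name prefix is kept attached to the last name (so "van Rossum" becomes "VanRossum"), and the collision suffix is placed before the .pdf extension. FileMind's real behavior may differ on each of these points.

```python
import re
import unicodedata
from datetime import date
from typing import Collection, Sequence

LEADING_ARTICLES = {"a", "an", "the"}
NAME_PREFIXES = {"van", "von", "de", "der", "den", "del", "la", "le", "di", "da"}  # sample only
MAX_TOTAL = 160  # maximum filename length, including ".pdf"


def to_ascii(text: str) -> str:
    """Strip diacritics via NFKD decomposition; letters with no decomposition are dropped."""
    return unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")


def cap(word: str) -> str:
    """Uppercase the first letter and leave the rest untouched."""
    return word[:1].upper() + word[1:]


def author_part(full_name: str) -> str:
    words = [re.sub(r"[^A-Za-z]", "", w) for w in to_ascii(full_name).split()]
    words = [w for w in words if w]
    if not words:
        raise ValueError("author name is empty")
    last = [words[-1]]
    # Walk backwards and keep prefix words such as "van" with the last name.
    for word in reversed(words[:-1]):
        if word.lower() not in NAME_PREFIXES:
            break
        last.insert(0, word)
    return "".join(cap(w) for w in last)


def title_part(title: str, max_chars: int = 50) -> str:
    words = re.sub(r"[^A-Za-z0-9\s]", "", to_ascii(title)).split()
    if words and words[0].lower() in LEADING_ARTICLES:
        words = words[1:]
    out = ""
    for word in words:
        piece = cap(word)
        if out and len(out) + len(piece) > max_chars:
            break  # stop at a word boundary instead of cutting a word in half
        out += piece
    return out[:max_chars]  # hard cut only if the first word alone is too long


def valid_year(year: int) -> bool:
    return 1900 <= year <= date.today().year + 1


def build_filename(authors: Sequence[str], year: int, title: str,
                   existing: Collection[str] = ()) -> str:
    if not authors:
        raise ValueError("at least one author is required")
    if not valid_year(year):
        raise ValueError(f"implausible year: {year}")
    author = author_part(authors[0]) + ("_et_al" if len(authors) > 1 else "")
    stem = f"{author}_{year}_{title_part(title)}"[: MAX_TOTAL - len(".pdf")]
    name, n = f"{stem}.pdf", 2
    while name in existing:  # collision: try _(2), _(3), ...
        suffix = f"_({n})"
        name = f"{stem[: MAX_TOTAL - len('.pdf') - len(suffix)]}{suffix}.pdf"
        n += 1
    return name


print(build_filename(["Ashish Vaswani"], 2017, "Attention Is All You Need"))
```

Note that the template only consumes already-extracted fields (authors, year, title), which is what keeps filename generation deterministic.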

Reviewing Proposals

The Rename Queue shows all pending proposals with the original filename, proposed filename, confidence badge, and a quick metadata preview. For each proposal you can:

  • Approve — accept the proposed rename
  • Reject — decline the proposal (the file keeps its current name)
  • Inspect evidence — open the evidence drawer to see the exact page, text snippet, and confidence breakdown

Use Bulk Approve to approve all high-confidence proposals at once.

Applying Renames

Click Apply Batch to rename all approved proposals as an atomic batch. Every rename is recorded in the action journal before the file is moved, so the operation is fully recoverable.
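
The journal-before-move ordering can be illustrated with a small write-ahead sketch. The JSON-lines journal format, the field names, and the apply_batch function are assumptions made for this example; FileMind's actual journal format is not documented here.

```python
import json
import os
import tempfile
from pathlib import Path


def apply_batch(renames, journal_path):
    """Rename each (old, new) pair, recording the intent in the journal first."""
    with open(journal_path, "a", encoding="utf-8") as journal:
        for old, new in renames:
            if Path(new).exists():
                raise FileExistsError(f"target already exists: {new}")
            entry = {"op": "rename", "from": str(old), "to": str(new)}
            journal.write(json.dumps(entry) + "\n")
            journal.flush()
            os.fsync(journal.fileno())  # make the record durable before touching the file
            os.rename(old, new)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "1706.03762v5.pdf"
    new = Path(tmp) / "Vaswani_2017_AttentionIsAllYouNeed.pdf"
    journal_file = Path(tmp) / "journal.jsonl"
    old.write_bytes(b"%PDF-")
    apply_batch([(old, new)], journal_file)
    renamed_ok = new.exists() and not old.exists()
    journal_lines = journal_file.read_text(encoding="utf-8").splitlines()

print(renamed_ok, len(journal_lines))
```

Because each record reaches disk before its rename runs, a crash can leave at most one journal entry whose rename did not happen, and that case is detectable by checking which of the two paths exists.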

Rolling Back

Click Undo Batch to reverse any applied batch. Rollback restores every file to its previous name. The action journal keeps a complete audit trail of all operations, so you can always see what was changed and when.

If FileMind is interrupted during a batch (e.g., power loss), it recovers automatically on next launch using the journal.
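
One way a journal makes rollback possible is to replay its entries in reverse. The sketch below assumes a hypothetical JSON-lines journal with "from" and "to" fields and a hypothetical undo_batch function; it skips entries whose rename never took place, which is one possible way to tolerate an interrupted batch, not necessarily how FileMind recovers.

```python
import json
import os
import tempfile
from pathlib import Path


def undo_batch(journal_path):
    """Reverse journaled renames, newest first, and return the restored paths."""
    text = Path(journal_path).read_text(encoding="utf-8")
    entries = [json.loads(line) for line in text.splitlines() if line.strip()]
    restored = []
    for entry in reversed(entries):
        current, previous = Path(entry["to"]), Path(entry["from"])
        # Skip entries that were journaled but never applied (e.g. after a crash).
        if current.exists() and not previous.exists():
            os.rename(current, previous)
            restored.append(previous.name)
    return restored


# Demonstration: one applied rename, plus one entry whose rename never happened.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Vaswani_2017_AttentionIsAllYouNeed.pdf").write_bytes(b"%PDF-")
    (root / "scan0042.pdf").write_bytes(b"%PDF-")  # still has its original name
    journal_file = root / "journal.jsonl"
    records = [
        {"from": str(root / "1706.03762v5.pdf"),
         "to": str(root / "Vaswani_2017_AttentionIsAllYouNeed.pdf")},
        {"from": str(root / "scan0042.pdf"),
         "to": str(root / "Devlin_et_al_2019_BERT.pdf")},
    ]
    journal_file.write_text("".join(json.dumps(r) + "\n" for r in records), encoding="utf-8")
    restored = undo_batch(journal_file)
    original_back = (root / "1706.03762v5.pdf").exists()
    untouched = (root / "scan0042.pdf").exists()

print(restored)
```

Processing entries newest first matters when a file was renamed more than once, since each step must be undone from the name the file currently has.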