About this project
ACL 2026 Paper Explorer is a fast way to explore the 4,872 papers presented at the 64th Annual Meeting of the Association for Computational Linguistics (July 5–7, 2026). Conference programs this large are impossible to skim — the idea here is simple: type what you care about in plain language, get the most relevant papers instantly, star the ones you want to see, and walk out with a session-by-session itinerary in your calendar.
Coverage note: this tool includes Main Conference and Findings papers only — workshop papers are not part of the dataset.
How search works
- Hybrid retrieval. Every query runs two searches in parallel: classic keyword matching (BM25) and semantic similarity over OpenAI text-embedding-3-small vectors of each paper's title and abstract. The two rankings are merged with Reciprocal Rank Fusion (RRF, k=60), so exact-term matches and "means the same thing" matches both surface.
- Precomputed embeddings. All paper embeddings are generated once, offline, and cached. A search only embeds your query (one small API call), which keeps results fast and cheap.
- Find similar. From any paper you can jump to its nearest neighbors by embedding similarity — handy for building a themed session plan.
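To make the fusion and "find similar" steps concrete, here is a minimal sketch in plain Python. It is an illustration and not the project's actual code: the function names (`reciprocal_rank_fusion`, `find_similar`), the paper IDs, and the tiny two-dimensional vectors are all hypothetical, and the real embeddings would be much longer vectors loaded from the cache. The RRF score used here is the standard one: each paper gets 1 / (k + rank) from every ranking it appears in, with k=60.

```python
import math
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of paper IDs into one list.

    Each paper scores 1 / (k + rank) per list it appears in (rank starts at 1).
    Ties are broken by paper ID so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, paper_id in enumerate(ranking, start=1):
            scores[paper_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_similar(paper_id, embeddings, top_n=5):
    """Return the IDs of the top_n papers closest to paper_id by cosine similarity."""
    query = embeddings[paper_id]
    scored = [
        (cosine(query, vec), pid)
        for pid, vec in embeddings.items()
        if pid != paper_id  # a paper is not its own neighbor
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:top_n]]


# Hypothetical rankings from the keyword and semantic searches.
bm25_ranking = ["p1", "p2", "p3"]
semantic_ranking = ["p3", "p1", "p4"]
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])

# Hypothetical toy embeddings; real ones come from the precomputed cache.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
neighbors = find_similar("a", toy_embeddings, top_n=2)
```

In this toy run, `p1` and `p3` rise to the top of `fused` because both searches returned them, which is the behavior the hybrid approach is after. A brute-force scan like `find_similar` is a reasonable assumption for a few thousand papers held in memory, though a vectorized version would likely be faster.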
The AI assistant
"Ask AI" is retrieval-augmented generation: the top matching papers are handed to an OpenAI model as context, and it answers with numbered citations. Every citation links straight to the paper, and the raw retrieved results always appear alongside the answer — so you can verify or dig deeper yourself. All searches and completions are traced with Opik for observability.
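The context-building half of that flow can be sketched as follows. This is a hedged illustration only: the function name `build_rag_prompt`, the instruction wording, and the `title`/`abstract` field names are assumptions, and the call to the OpenAI model and the Opik tracing are left out. The point it shows is that numbering the retrieved papers in the prompt is what lets the model's citations map back to specific papers.

```python
def build_rag_prompt(question, papers):
    """Format retrieved papers as a numbered context block followed by the question.

    `papers` is a list of dicts with hypothetical "title" and "abstract" keys,
    already ordered by relevance. Paper N in the list becomes citation [N].
    """
    blocks = []
    for number, paper in enumerate(papers, start=1):
        blocks.append(f"[{number}] {paper['title']}\n{paper['abstract']}")
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the papers below. "
        "Cite papers by their bracketed number.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved results, for illustration only.
retrieved = [
    {"title": "Paper A", "abstract": "An abstract about retrieval."},
    {"title": "Paper B", "abstract": "An abstract about generation."},
]
prompt = build_rag_prompt("What is new in retrieval?", retrieved)
```

Because citation numbers are just list positions, the interface can turn `[2]` in an answer into a link to the second retrieved paper without any extra bookkeeping.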
Planning your conference
- Full Schedule — browse all sessions by day, filter by presentation mode or keywords.
- My Schedule — your starred papers organized by day and session, exportable as ICS files or added to Google Calendar directly. Stars are stored only in your browser; nothing is tracked.
- Paper links — every paper links out to the ACL Anthology and Google Scholar.
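As a rough idea of what an ICS export involves, the sketch below builds a minimal iCalendar (RFC 5545) file from a list of starred sessions. It is an assumption-laden illustration, not the tool's export code: the function names, the session dict fields (`uid`, `title`, `start`, `end`), the `PRODID` value, and the example session are all hypothetical. It expects timezone-aware datetimes and skips the optional folding of long lines.

```python
from datetime import datetime, timedelta, timezone

ICS_TIME = "%Y%m%dT%H%M%SZ"  # UTC "Z" form of an iCalendar DATE-TIME


def escape_text(value):
    """Escape the characters RFC 5545 reserves in TEXT values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_ics(sessions, now=None):
    """Return an iCalendar document with one VEVENT per session.

    Each session is a dict with hypothetical keys: "uid", "title",
    "start", and "end" (the last two as timezone-aware datetimes).
    """
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//paper-explorer-sketch//EN",  # placeholder ID
    ]
    for session in sessions:
        start = session["start"].astimezone(timezone.utc)
        end = session["end"].astimezone(timezone.utc)
        lines += [
            "BEGIN:VEVENT",
            f"UID:{session['uid']}",
            f"DTSTAMP:{stamp.strftime(ICS_TIME)}",
            f"DTSTART:{start.strftime(ICS_TIME)}",
            f"DTEND:{end.strftime(ICS_TIME)}",
            f"SUMMARY:{escape_text(session['title'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical starred session, for illustration only.
session_start = datetime(2026, 7, 5, 9, 0, tzinfo=timezone.utc)
starred = [
    {
        "uid": "demo-1@example.invalid",
        "title": "Oral: Retrieval, Part 1",
        "start": session_start,
        "end": session_start + timedelta(hours=1),
    }
]
ics_text = build_ics(starred, now=session_start)
```

Saving `ics_text` to a file with an `.ics` extension should produce something most calendar apps can import. Since stars live only in the browser, an export like this could equally be generated client-side; the Python version is just the shortest way to show the file format.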
The stack
Deliberately simple: a single Flask app with vanilla JavaScript and CSS — no build step, no database. Papers, embeddings, and the BM25 index all live in memory, loaded from a preprocessed cache at startup. It deploys as one gunicorn process behind nginx.