Your terminal to the Marathon universe.
LOCUS is a self-hosted Discord bot built for the Marathon community. It aggregates intelligence from Bungie's news feed, the Marathon Fandom wiki, a community knowledge base, Bungie's help centre, and Reddit — then makes all of it instantly queryable from within any Discord server.
At its core, LOCUS is a hybrid BM25 + semantic retrieval system backed by a local LLM. Ask it anything about Marathon: lore, gameplay mechanics, patch notes, service outages. It answers from indexed sources, cites them inline, and never fabricates beyond what the corpus contains.
/ask <question>
Ask any question about Marathon. BM25 keyword search is fused with vector cosine similarity via Reciprocal Rank Fusion to surface the most relevant context. A local Qwen 2.5 model synthesizes a cited answer — no external API calls, no fabrication beyond the indexed corpus.

/news latest · /news subscribe
Monitors the Bungie Platform news API on a 15-minute poll and auto-posts Marathon-tagged articles to subscribed channels the moment new content drops. Fetch the latest manually at any time without a subscription.

/alerts current · /alerts subscribe
Polls Bungie's GlobalAlerts endpoint every 5 minutes. When a new alert appears or an existing one resolves, subscribed channels receive an immediate notification — no manual status page checking required.

/kb search · /kb show · /kb propose
Full search and retrieval against a BookStack-backed knowledge base. Community members can propose new entries directly from Discord via a modal — submissions land in a proposals book for admin review before publishing.

/wiki search · /wiki show
Searches and renders pages from the Marathon Fandom wiki directly in Discord. Handles template-heavy pages by server-rendering HTML and stripping MediaWiki artifacts, producing clean, readable prose every time.

[[term]] inline trigger
Wrap any term in double brackets and the bot resolves it against the knowledge base and wiki, replying with the first match. No slash command required — works naturally inside any channel message.

/ask <q>
Hybrid BM25 + vector retrieval over all indexed sources. Answer synthesized by qwen2.5:3b with inline [N] citations mapped to source embeds.
/corpus
Show the last reindex timestamp and per-source chunk counts.
/kb search / show / propose
BookStack-backed knowledge base. propose opens a Discord modal to draft a new entry for admin review.
/kb proposals subscribe / unsubscribe
Admin: route new KB proposal drafts to a specific channel for review.
/wiki search / show
Search the Marathon Fandom wiki and render article content inline in Discord.
/news latest / subscribe / unsubscribe
Bungie News API feed. Subscribed channels receive automatic posts on a 15-minute poll interval.
/alerts current / subscribe / unsubscribe
Bungie GlobalAlerts. 5-minute poll — notifies on new and resolved alerts within minutes of a change.
/help
Full in-Discord command guide with source descriptions and usage notes.
[[term]]
Glossary inline trigger — replies with the first KB or wiki match for the wrapped term. Works in any channel message.
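The inline trigger amounts to scanning each message for double-bracketed spans before falling through to normal message handling. A minimal sketch of that extraction step — the helper name and regex here are illustrative, not LOCUS's actual implementation:

```python
import re

# Match [[term]] spans; brackets are excluded from the captured term.
INLINE_TERM = re.compile(r"\[\[([^\[\]]+)\]\]")

def extract_terms(message: str) -> list[str]:
    """Return every [[term]] found in a channel message, in order."""
    return [m.strip() for m in INLINE_TERM.findall(message)]
```

Each extracted term would then be looked up against the KB first, then the wiki, with the first match posted as the reply.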
Retrieval without hallucination.
Every answer traces back to a real source. The pipeline indexes five independent data streams into a hybrid FTS5 full-text + float32 embedding store, then fuses keyword and semantic rankings at query time before passing the top context to a quantized local LLM — all running on-premises, no third-party AI services involved.
Documents are split into 500-character sentence-aware chunks with 80-character overlap. Each chunk is embedded via nomic-embed-text (768-dimensional, L2-normalized) and stored alongside a SQLite FTS5 full-text index. The corpus is rebuilt twice daily and at startup if the index is empty.
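The chunking step described above — sentence-aware packing to 500 characters with an 80-character overlap carried into the next chunk — can be sketched roughly as follows. This is a minimal illustration, not the project's actual splitter; the sentence-boundary regex and packing logic are assumptions:

```python
import re

CHUNK_SIZE = 500  # max characters per chunk
OVERLAP = 80      # tail characters carried into the next chunk

def chunk_text(text: str) -> list[str]:
    """Pack whole sentences up to CHUNK_SIZE; seed each new chunk
    with the last OVERLAP characters of the previous one."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sent in sentences:
        if current and len(current) + 1 + len(sent) > CHUNK_SIZE:
            chunks.append(current)
            current = current[-OVERLAP:]  # overlap tail
        current = (current + " " + sent).strip() if current else sent
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than CHUNK_SIZE would overflow a chunk in this sketch; a production splitter would also need a hard character-level fallback for that case.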
Queries run BM25 keyword search (top-30) and cosine similarity over all embeddings (top-30, brute-force numpy) in parallel. Results are fused via Reciprocal Rank Fusion (k=60), producing a single ranked list that rewards both lexical and semantic relevance. Falls back to BM25-only if the embedding model is unreachable.
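The two retrieval legs and the fusion step above can be sketched as follows. With L2-normalized embeddings, cosine similarity reduces to a dot product, and Reciprocal Rank Fusion scores each document as the sum of 1/(k + rank) across the ranked lists. Function names and shapes here are illustrative assumptions, not LOCUS's actual API:

```python
import numpy as np

def top_k_cosine(query_vec: np.ndarray, embeddings: np.ndarray, k: int = 30) -> list[int]:
    """Brute-force cosine ranking: for L2-normalized vectors,
    cosine similarity is a plain dot product."""
    sims = embeddings @ query_vec
    return np.argsort(-sims)[:k].tolist()

def rrf_fuse(bm25_ids: list[int], vector_ids: list[int], k: int = 60) -> list[int]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank),
    with rank starting at 1. Rewards documents ranked well by either leg."""
    scores: dict[int, float] = {}
    for ranked in (bm25_ids, vector_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document appearing in both lists accumulates two reciprocal-rank terms, which is why lexically and semantically relevant chunks rise to the top of the fused list.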
The top-5 fused chunks are assembled into a numbered context block and sent to qwen2.5:3b-instruct-q4_K_M with a strict citation system prompt (temp 0.2). The response includes inline [N] citations mapped back to the original source URLs as Discord embeds — making every claim verifiable.
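Assembling the numbered context block is the part that makes the [N] citations resolvable: each fused chunk gets an index, the model is instructed to cite by that index, and the same indices map back to source URLs for the embeds. A hedged sketch of that assembly step — the prompt wording and function shape are assumptions, not the bot's actual prompt:

```python
SYSTEM_PROMPT = (
    "Answer only from the numbered context below. "
    "Cite every claim with its [N] marker. "
    "If the context does not contain the answer, say so."
)

def build_prompt(question: str, chunks: list[str]) -> tuple[str, str]:
    """Number the top fused chunks [1]..[N] and wrap them with the question."""
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(chunks, start=1))
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return SYSTEM_PROMPT, user
```

After generation, any [N] token in the model's answer can be mapped back to chunk N's source URL to build the citation embeds.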