Archi · LLM Copilot for CMS Ops

Partnering with MIT to build an LLM-powered RAG copilot for CMS computing operations.

Archi is a collaboration between MIT and CMS that helps operators search the large body of documentation and logs behind CERN’s systems. I lead development for CMS computing operations, covering the data pipelines, backend, and frontend.

Source: github.com/archi-physics/archi

What I Did

  • Data Ingestion: Built a crawler that authenticates through CERN SSO to index internal portals, JIRA tickets, and logbooks.
  • Processing Pipeline: Created a pipeline to clean and normalize data (Markdown, HTML, ticket metadata) for the retrieval system.
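
As a rough illustration of the cleaning and normalization step, the sketch below reduces HTML and Markdown inputs to a common record of plain text plus metadata. It is not taken from the Archi codebase: the names `Document`, `html_to_text`, and `normalize`, as well as the `kind` values, are hypothetical and chosen only for this example, and crawling and SSO authentication are left out entirely.

```python
import re
from dataclasses import dataclass, field
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


@dataclass
class Document:
    # Hypothetical record shape: one entry per crawled page or ticket.
    source: str
    text: str
    metadata: dict = field(default_factory=dict)


def html_to_text(html: str) -> str:
    """Strip tags and return the text content of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def normalize(source: str, raw: str, kind: str, metadata=None) -> Document:
    """Turn raw HTML or Markdown into a whitespace-normalized Document.

    `kind` is assumed to be "html" or "markdown"; Markdown is passed
    through unchanged apart from whitespace cleanup.
    """
    text = html_to_text(raw) if kind == "html" else raw
    text = re.sub(r"\s+", " ", text).strip()
    return Document(source=source, text=text, metadata=dict(metadata or {}))


doc = normalize(
    "ticket-1",
    "<p>Job <b>failed</b></p><script>x = 1</script>",
    "html",
    {"status": "open"},
)
print(doc.text)
```

Keeping the ticket metadata next to the cleaned text means a retrieval layer can filter on fields such as status without re-parsing the original page. A real pipeline would likely need format-specific handling beyond this, for example preserving code blocks or tables.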

Why It’s Useful

  • Faster Debugging: Helps operators find recurring issues quickly without searching through multiple tools.
  • Shared Knowledge: Makes it easier for shifters and coordinators to access the same information.
  • Future-Proofing: Lays the groundwork for automated fixes that draw on the indexed knowledge base.