A2rchi · LLM Copilot for CMS Ops
Partnering with MIT to build an LLM-powered RAG tool for CMS computing operations.
A2rchi is a collaboration between MIT and CMS that helps operators search the large body of documentation and logs behind CERN’s systems. I lead development for CMS computing operations, covering the data pipelines, backend, and frontend.
What I Did
- Data Ingestion: Built a crawler that authenticates through CERN SSO to index internal portals, JIRA tickets, and logbooks.
- Processing Pipeline: Created a pipeline to clean and normalize data (Markdown, HTML, ticket metadata) for the retrieval system.
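To make the normalization step concrete, here is a minimal sketch of how HTML pages, Markdown notes, and ticket metadata could be reduced to one common record before indexing. It is not A2rchi's actual code: `Document`, `normalize`, `html_to_text`, and `markdown_to_text` are hypothetical names, the Markdown handling is deliberately rough, and only the Python standard library is used.

```python
import re
from dataclasses import dataclass, field
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script>/<style> bodies."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return " ".join(self._chunks)


def collapse_whitespace(text):
    """Replace any run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text).strip()


def html_to_text(html):
    """Strip tags from an HTML fragment and return its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return collapse_whitespace(parser.text())


def markdown_to_text(md):
    """Rough Markdown cleanup (assumption: good enough for retrieval text)."""
    md = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", md)       # keep link text only
    md = re.sub(r"^#{1,6}\s*", "", md, flags=re.MULTILINE)  # drop heading markers
    md = re.sub(r"[*`]", "", md)                            # drop emphasis/code marks
    return collapse_whitespace(md)


@dataclass
class Document:
    """Hypothetical common record handed to the retrieval system."""
    source: str
    text: str
    metadata: dict = field(default_factory=dict)


def normalize(source, raw, fmt, metadata=None):
    """Convert raw content in the given format into a Document."""
    if fmt == "html":
        text = html_to_text(raw)
    elif fmt == "markdown":
        text = markdown_to_text(raw)
    else:
        text = collapse_whitespace(raw)
    return Document(source=source, text=text, metadata=dict(metadata or {}))
```

A ticket could then be passed through as `normalize("jira", body_html, "html", {"status": "Open"})`, where the metadata keys are placeholders for whatever fields the ticket system actually exposes. Keeping every source in the same `Document` shape means the retrieval layer does not need to know where a piece of text came from.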
Why It’s Useful
- Faster Debugging: Helps operators find recurring issues quickly without searching through multiple tools.
- Shared Knowledge: Makes it easier for shifters and coordinators to access the same information.
- Future-Proofing: Lays the groundwork for automated fixes based on the indexed knowledge base.
