Bullaki
GraphRAG Developer Challenge
Job Location
Italy
Job Description
GraphRAG Developer Challenge: Legal Document Processing (Prototype)

Role: Senior RAG Systems Developer (Contract/Freelance)
Compensation: $600, paid only if you pass (>95% benchmark)
Timeline: 3–5 days from materials receipt to live demo
Purpose: Technical evaluation for potential long-term hire
Frontend/UI: None (backend prototype only)
Contact: s.lilliu@revortis.com

Objective
We’re seeking an expert in graph-based retrieval (GraphRAG) to build a high-accuracy prototype for legal document reasoning. This is a paid technical test that may lead to a long-term position. The goal is a true GraphRAG system featuring explicit knowledge-graph construction and traversal, multi-hop reasoning, agentic orchestration, and a strong focus on retrieval accuracy and explainability.

Materials Provided
- /docs/ → Pre-processed Markdown legal documents with metadata
- /sample_questions.json → Example question format
- /sample_answers_rag.json → Example answer format
Download materials: https://drive.google.com/drive/folders/19ZQ6cZDIe3stu0DXbXoG6p-AEQwZWdCt?usp=drive_link
(The benchmark uses unseen questions.)

Deliverable
Implement two functions in Python 3.12 (Poetry project):

    def ingest(document_paths: List[str]) -> None:
        """Ingest Markdown docs and build the knowledge graph."""

    def query(questions: List[str]) -> List[str]:
        """Return answers with Vancouver-style citations grounded in retrieved sources."""

Requirements:
- No UI, no API keys provided. Any stack may be used.
- query() must support parallel execution (~400 questions in ≤60 min) and show a progress indicator.
- Test thoroughly for correctness and performance before the demo.

Live Demo
In a 60-minute live session you will:
1. Receive ~400 unseen questions.
2. Run query() to produce /answers.json.
3. Explain your architecture: how the graph is built, traversed, and used to generate grounded answers.
Only the developer(s) who wrote the code may present.
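
To make the parallel-execution and progress-indicator requirement concrete, here is a minimal sketch using only the Python standard library. It is not the expected solution. The names answer_one, answer_fn, max_workers, and per_question_budget are hypothetical and do not come from the posting; answer_one stands in for the real graph traversal and answer generation. The extra keyword arguments on query() have defaults, so the required call query(questions) still works.

```python
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List


def answer_one(question: str) -> str:
    """Hypothetical placeholder: a real system would traverse the
    knowledge graph and generate a grounded, cited answer here."""
    return f"(no answer yet) {question}"


def query(
    questions: List[str],
    answer_fn: Callable[[str], str] = answer_one,  # assumption: injectable for testing
    max_workers: int = 16,  # assumption: tune to your backend's rate limits
) -> List[str]:
    """Answer questions in parallel and return answers in input order."""
    answers: List[str] = [""] * len(questions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of its question.
        futures = {pool.submit(answer_fn, q): i for i, q in enumerate(questions)}
        for done, future in enumerate(as_completed(futures), start=1):
            answers[futures[future]] = future.result()
            # Simple progress indicator on stderr so stdout stays clean.
            print(f"\rAnswered {done}/{len(questions)}", end="", file=sys.stderr)
    if questions:
        print(file=sys.stderr)  # finish the progress line
    return answers


def per_question_budget(total_seconds: float, n_questions: int, workers: int) -> float:
    """Average wall-clock seconds each question may take when `workers`
    questions run concurrently and the whole batch must fit in total_seconds."""
    if n_questions <= 0 or workers <= 0:
        raise ValueError("n_questions and workers must be positive")
    return total_seconds / n_questions * workers
```

With the stated limits (about 400 questions in 60 minutes), a sequential run leaves 3600 / 400 = 9 seconds per question; with 16 concurrent workers, each question can average up to 144 seconds. Threads suit I/O-bound work such as calls to a model or a graph database. CPU-bound pipelines would likely need a process pool instead.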

Evaluation
Passing requires an overall score above 95%, as scored by an LLM judge on four criteria:
- Faithfulness (grounded, no hallucinations)
- Relevance (retrieval matches intent)
- Completeness (covers key legal points)
- Clarity (structured, legally coherent writing)

Payment & Next Steps
If you pass, you’ll receive $600 USD after verification of reproducibility and hand-over of the repo (codebase, Poetry lock, run instructions, brief tech note). Top performers may be invited to interview for a long-term paid role. Failure to pass, or to complete within 60 minutes, means no payment (you keep your code).

Key Priorities
Parallelization, graph-based reasoning, correctness, and explainability.

Note for Agencies
We will not engage in pre-contract discussions with agencies. If an agency wishes to propose a developer, communication will proceed only after that developer passes the benchmark test. This saves time and ensures direct technical validation.
Location: Italy, IT
Posted Date: 11/7/2025
Contact Information
Contact: Human Resources Bullaki