A generation prompt is typically three parts: 1. System instruction — "answer only from the context; cite sources; say you don't know if it's not there" 2. The retrieved chunks — each tagged with an id 3. The user question Tag chunks so the model can cite them: [#1], [source: report > BRCA1].
Instructing the model to ground strictly in context and to abstain when coverage is poor is the single biggest lever against hallucination. Keep the most relevant chunk near the question — models weight recent/edge context.