Enhance RAG Context Recall by 95% Using Adaptive Embedding Model | Vignesh Baskaran | Oct, 2024


Step-by-step model adaptation code and results attached

Retrieval-augmented generation (RAG) is a prominent technique for integrating LLMs into business use cases, allowing proprietary knowledge to be infused into the model. This post assumes you are already familiar with RAG and are here to improve your RAG accuracy.

Let’s review the process briefly. A RAG pipeline consists of two main stages: retrieval and generation. The retrieval stage involves several sub-steps: converting context text into vectors, indexing those vectors, retrieving the contexts most relevant to the user query, and reranking the retrieved contexts. Once the contexts for the query are retrieved, we move on to the generation stage, where the contexts are combined with a prompt and sent to the LLM to generate a response. Before reaching the LLM, the context-infused prompt may pass through caching and routing steps to optimize efficiency.
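The retrieval sub-steps above (embed, index, retrieve, then assemble the prompt) can be sketched minimally as follows. This is an illustrative toy, not the author's implementation: `embed` is a hash-seeded stand-in for a real embedding model, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic embedding; stands in for a real embedding model."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)  # unit-norm so dot product = cosine similarity

# Step 1 & 2: convert context texts to vectors and build a flat index
contexts = [
    "Our refund policy allows returns within 30 days.",
    "Support is available 24/7 via chat.",
    "Premium plans include priority onboarding.",
]
index = np.stack([embed(c) for c in contexts])

def retrieve(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Step 3: score every indexed context against the query, keep top-k."""
    q = embed(query)
    scores = index @ q
    top = np.argsort(scores)[::-1][:k]  # highest similarity first
    return [(contexts[i], float(scores[i])) for i in top]

def build_prompt(query: str, retrieved: list[tuple[str, float]]) -> str:
    """Generation stage: combine retrieved contexts with the user query."""
    ctx = "\n".join(c for c, _ in retrieved)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"
```

A production pipeline would swap the toy `embed` for a trained embedding model, the brute-force dot product for an approximate-nearest-neighbor index, and would insert a reranking step between `retrieve` and `build_prompt`.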

For each of these pipeline steps, we will conduct numerous experiments that collectively enhance RAG accuracy. The image below lists (but is not limited to) the experiments performed at each step.

[Image: experiments performed at each pipeline step]
