Embedding Models in RAG Systems

THE GEN AI SERIES

Rahul S
3 min read · Jan 29, 2024


The choice of embedding model has a significant impact on the relevance and usability of a RAG application. Making it well requires a nuanced understanding of each model’s capabilities and an analysis of how those capabilities align with the application’s requirements.

My Approach to Choosing a General Embedding Model

  1. Similarity Performance: I assess how well a model’s embeddings align with my task requirements, relying on benchmarks like MTEB (Massive Text Embedding Benchmark). MTEB evaluates text embedding models across a broad range of tasks, which helps me judge whether a model’s retrieval performance is likely to meet the specific needs of my use case.
  2. Model Size: I consider the operational costs associated with storing and loading the model into memory. There’s a trade-off between a larger, more complex model offering enhanced performance and a smaller model with reduced storage and computational requirements. My decision hinges on finding the right balance that aligns with my performance goals and resource constraints.
  3. Sequence Length (maximum input tokens): Determining the maximum number of input tokens the model supports is crucial, because anything beyond that limit is truncated and effectively invisible to retrieval. I aim for a sequence length that comfortably covers my typical chunk size without paying for capacity I never use. The sketch after this list shows how I inspect these properties for a candidate model.
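
A minimal sketch of how these checks can look in practice, assuming the `sentence-transformers` library; the model name and the sample texts are illustrative placeholders, not a recommendation:

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical candidate picked from the MTEB leaderboard; swap in any model under evaluation.
model_name = "BAAI/bge-small-en-v1.5"
model = SentenceTransformer(model_name)

# (3) Sequence length: tokens beyond this limit are truncated before embedding.
print("Max sequence length:", model.max_seq_length)

# (2) A rough size/cost proxy: the embedding dimension drives vector-store storage.
print("Embedding dimension:", model.get_sentence_embedding_dimension())

# (1) Similarity sanity check on domain-representative text.
query = "How do I reset my account password?"
docs = [
    "To reset your password, open Settings and choose 'Reset password'.",
    "Our quarterly revenue grew by 12% year over year.",
]
query_emb = model.encode(query, normalize_embeddings=True)
doc_embs = model.encode(docs, normalize_embeddings=True)
print("Cosine similarities:", util.cos_sim(query_emb, doc_embs).tolist())
```

Running the same script over a handful of candidate models gives a quick, like-for-like comparison of sequence length, embedding size, and retrieval behavior on my own data before I commit to one.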
