Retrieval Augmented Generation

Retrieval augmented generation (RAG) is a technique in which you retrieve relevant information, e.g., from a vector index, and add it to the prompt of a language model. Additional instructions in the prompt can help reduce hallucination and keep the model's answers focused on the provided information.

Retrieval augmented generation consists of two steps:

1. Retrieving relevant information
2. Generating a response using a prompt that contains the retrieved information

Example

Retrieve related information from a vector index:

const chunks = await retrieve(
  new VectorIndexRetriever({
    // some vector index that contains the information:
    vectorIndex,
    // use the same embedding model that was used when adding information:
    embeddingModel: openai.TextEmbedder({
      model: "text-embedding-ada-002",
    }),
    // you need to experiment with these settings for your use case:
    maxResults: 3,
    similarityThreshold: 0.8,
  }),
  question
);
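
To make the effect of `maxResults` and `similarityThreshold` more concrete, here is a minimal sketch of a similarity search over an in-memory list. It assumes cosine similarity as the metric; a real vector index may use a different metric or an approximate search. The names `Entry`, `cosineSimilarity`, and `topMatches` are hypothetical and not part of any library.

```typescript
// Hypothetical in-memory entry: a text chunk together with its embedding vector.
type Entry = { text: string; embedding: number[] };

// Cosine similarity of two equal-length vectors (assumed metric).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only entries at or above the threshold, best first, at most maxResults.
function topMatches(
  entries: Entry[],
  queryEmbedding: number[],
  options: { maxResults: number; similarityThreshold: number }
): Array<{ text: string; similarity: number }> {
  return entries
    .map((entry) => ({
      text: entry.text,
      similarity: cosineSimilarity(entry.embedding, queryEmbedding),
    }))
    .filter((match) => match.similarity >= options.similarityThreshold)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, options.maxResults);
}

// Toy two-dimensional embeddings, for illustration only.
const sampleEntries: Entry[] = [
  { text: "A", embedding: [1, 0] },
  { text: "B", embedding: [0.9, 0.1] },
  { text: "C", embedding: [0, 1] },
];

const sampleMatches = topMatches(sampleEntries, [1, 0], {
  maxResults: 3,
  similarityThreshold: 0.8,
});
```

In this toy data, "C" is orthogonal to the query and falls below the threshold, so fewer than `maxResults` chunks are returned. A higher threshold trades recall for precision, which is why both settings need tuning per use case.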

Generate an answer from the retrieved information:

const answer = await generateText({
  model: openai.ChatTextGenerator({
    model: "gpt-4",
    temperature: 0, // remove randomness as much as possible
    maxGenerationTokens: 500,
  }),

  prompt: [
    openai.ChatMessage.system(
      [
        // Instruct the model on how to answer:
        `Answer the user's question using only the provided information.`,
        // To reduce hallucination, it is important to give the model an answer
        // that it can use when the information is not sufficient:
        `If the user's question cannot be answered using the provided information, ` +
          `respond with "I don't know".`,
      ].join("\n")
    ),
    openai.ChatMessage.user(`## QUESTION\n${question}`),
    openai.ChatMessage.user(`## INFORMATION\n${JSON.stringify(chunks)}`),
  ],
});
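
Because the system prompt asks for a fixed fallback answer, the calling code can check for it and treat unanswered questions separately. The following is a library-independent sketch of that idea, together with a prompt builder that mirrors the message layout above. The `Message` type, `buildRagMessages`, and `isFallbackAnswer` are hypothetical names, and the sketch numbers the chunks as plain text instead of using `JSON.stringify`, which is an illustrative choice. It also assumes the model follows the instruction closely, which is not guaranteed.

```typescript
// Hypothetical, library-independent message shape.
type Message = { role: "system" | "user"; content: string };

const FALLBACK_ANSWER = "I don't know";

// Build the system and user messages for a question and its retrieved chunks.
function buildRagMessages(question: string, chunks: string[]): Message[] {
  const system = [
    "Answer the user's question using only the provided information.",
    "If the user's question cannot be answered using the provided information, " +
      `respond with "${FALLBACK_ANSWER}".`,
  ].join("\n");

  // Number the chunks so that each one is clearly delimited in the prompt.
  const information = chunks
    .map((chunk, index) => `[${index + 1}] ${chunk}`)
    .join("\n");

  return [
    { role: "system", content: system },
    { role: "user", content: `## QUESTION\n${question}` },
    { role: "user", content: `## INFORMATION\n${information}` },
  ];
}

// Detect the fallback answer, tolerating surrounding quotes, trailing
// punctuation, a curly apostrophe, and differences in letter case.
function isFallbackAnswer(answer: string): boolean {
  const normalized = answer
    .trim()
    .replace(/^["']+|["'.!]+$/g, "")
    .replace(/’/g, "'")
    .toLowerCase();
  return normalized === FALLBACK_ANSWER.toLowerCase();
}

const exampleMessages = buildRagMessages("What is X?", [
  "X is a letter.",
  "Y follows X.",
]);
```

A caller could use `isFallbackAnswer(answer)` to decide whether to show the answer, retry with a lower similarity threshold, or report that the indexed information does not cover the question.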
