Beyond Functionality: Navigating NFRs in a RAG System

CryptoGPT
10 min read | Dec 10, 2024


In the previous blog, we learned that Retrieval-Augmented Generation (RAG) is a powerful technique that combines neural text generation with information retrieval. It allows a model to dynamically retrieve relevant documents from a large corpus and then use them as context for generating natural language responses.

Yet designing and implementing a RAG system is not a trivial task. It involves many challenges and trade-offs that affect the system’s quality and performance. Functional requirements specify what the system should do; non-functional requirements (NFRs) define how the system should behave and perform under various conditions. NFRs shape the system architecture: they ensure that a RAG system doesn’t just function, but keeps performing well under diverse conditions. In this blog, we will explore the critical NFRs to consider while architecting a RAG solution and discuss how they affect the system’s design, development, and evaluation. So let us get going.

What are Non-Functional Requirements?

To begin, let’s understand Non-Functional Requirements (NFRs). NFRs are specifications that describe how a system works. They define its performance under different conditions. Functional requirements explain what a system should do. Non-functional requirements focus on how a system performs tasks and maintains quality. They include aspects like speed, reliability, scalability, security, and usability. NFRs are important for ensuring user satisfaction, system efficiency, and overall effectiveness. They determine how the system behaves in areas like response time, data capacity, data security, and ease of use. By setting standards in these areas, NFRs play a crucial role in the system’s design, development, and evaluation process.

Categories of NFRs for the RAG Pattern

The RAG pattern can be considered to have two main components: the body and the mind. The body is the overall system that implements the RAG technique, including the retrieval, generation, and fusion modules. The mind is the language model that performs the natural language tasks, such as question answering, dialogue, or summarization.

There are two categories of non-functional requirements (NFRs) that need to be satisfied for the RAG system to work effectively and efficiently. One focuses on the body, i.e., the overall system. The other focuses on the mind, i.e., the language model. These NFRs are not about what the system does, but how it does it.

The two categories of NFRs for the RAG pattern are:

  1. Overall System NFRs: The NFRs for the body are the system NFRs. They cover aspects such as performance, reliability, scalability, security, and usability. These NFRs are essential for ensuring that the system functions correctly and meets the technical and user expectations. For example, the system should have a fast response time, handle large amounts of data, protect the data from unauthorized access, and provide a user-friendly interface.
  2. Language Model NFRs: The NFRs for the mind are the language model NFRs. They focus on aspects such as the groundedness, accuracy, and relevance of the responses generated by the language model. These NFRs are crucial for ensuring that the model produces high-quality and meaningful natural language outputs. For example, the responses should be based on factual and reliable information, answer the query correctly and completely, and match the context and intent of the query.

Let us explore the dimensions of each of these categories.

Overall System NFRs

As discussed above, system NFRs are the bedrock of the RAG ‘body.’ They cover performance, reliability, scalability, security, and user-friendliness, so that the system can meet both technical and user expectations.

The subsequent diagram shows the seven dimensions of the system NFRs and their corresponding metrics. NFR metrics are measurable standards that quantify how well the system meets its non-functional objectives.

Each of these dimensions is a pillar that upholds the structure of a reliable, secure, and user-friendly RAG system. Let us drill down into each of these dimensions and the metrics for each of these NFR dimensions.

Performance

This dimension measures how quickly and accurately the RAG system can process the input query and generate the output response. Performance depends on several factors. These include the size and complexity of the language model, the number and quality of the retrieved documents, the encoding and decoding strategies, and the number of tokens consumed by the LLM. A high-performance RAG system should be able to handle a large volume of queries and produce responses that are fluent, coherent, and informative.
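To make the speed side of this dimension concrete, here is a minimal sketch of how response latency and throughput could be measured. It assumes the whole RAG pipeline can be called through a single function, named `answer` here purely for illustration, and it uses the nearest-rank method for percentiles. The function names and the choice of p50 and p95 are assumptions, not metrics prescribed by this blog.

```python
import math
import time
from typing import Callable, Dict, List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample, with pct in (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def measure_latencies(answer: Callable[[str], str], queries: Sequence[str]) -> List[float]:
    """Time each call to `answer` (a hypothetical stand-in for the RAG pipeline)."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        answer(query)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: Sequence[float]) -> Dict[str, float]:
    """Median and tail latency, plus throughput for sequential execution."""
    total = sum(latencies)
    return {
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "throughput_qps": len(latencies) / total if total > 0 else float("inf"),
    }
```

Tail latency (p95) is usually more informative than the average, because a few slow retrievals or long generations can dominate what users actually experience. Note that the throughput figure here only describes sequential calls; concurrent load needs a separate test.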

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Scalability

This dimension measures how well the RAG system can cope with increasing demand and complexity. Scalability depends on the architecture and design of the RAG system. Factors include the choice of the document retriever, indexing and storage methods, parallelization and distribution techniques, and load balancing and fault tolerance mechanisms. A scalable RAG system should be able to handle a growing number of users and queries without compromising the performance or quality of the responses.
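One simple way to reason about scalability is to run a load test at several concurrency levels and find the highest level at which the system still meets its latency target. The sketch below assumes you already have p95 latency measurements per concurrency level; the function name, the input shape, and the idea of a single latency SLO are illustrative assumptions.

```python
from typing import Dict, Optional


def max_load_within_slo(p95_by_concurrency: Dict[int, float], slo_seconds: float) -> Optional[int]:
    """Return the highest tested concurrency whose p95 latency meets the SLO.

    Levels are checked in increasing order and the search stops at the first
    violation, so a later level that happens to pass is not counted.
    Returns None if even the lowest tested level misses the SLO.
    """
    passing = None
    for concurrency in sorted(p95_by_concurrency):
        if p95_by_concurrency[concurrency] > slo_seconds:
            break
        passing = concurrency
    return passing


# Hypothetical load-test results: concurrent users -> p95 latency in seconds.
results = {1: 0.8, 4: 0.9, 16: 1.4, 64: 3.0}
headroom = max_load_within_slo(results, slo_seconds=1.5)
```

Tracking this number across releases shows whether changes to the retriever, the index, or the model are eroding the system's capacity.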

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Availability

This dimension measures how reliable and resilient the RAG system is. Availability depends on the robustness and redundancy of the RAG system, such as the error handling and recovery procedures, the backup and restore policies, the monitoring and alerting systems, and the security and privacy safeguards. An available RAG system should be able to operate continuously and consistently without interruptions or failures.
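Availability is commonly expressed as an uptime percentage, often alongside mean time between failures (MTBF) and mean time to repair (MTTR). The sketch below computes these from a list of outage durations in an observation window. Treat it as one possible way to quantify this dimension; the function names and the input format are assumptions.

```python
from datetime import timedelta
from typing import List, Tuple


def availability_pct(window: timedelta, outages: List[timedelta]) -> float:
    """Uptime as a percentage of the observation window."""
    downtime = sum(outages, timedelta())
    if window <= timedelta() or downtime > window:
        raise ValueError("window must be positive and at least as long as total downtime")
    return 100 * (1 - downtime / window)


def mtbf_mttr(window: timedelta, outages: List[timedelta]) -> Tuple[timedelta, timedelta]:
    """Mean time between failures and mean time to repair for the window."""
    if not outages:
        raise ValueError("at least one outage is needed to compute MTBF and MTTR")
    downtime = sum(outages, timedelta())
    uptime = window - downtime
    return uptime / len(outages), downtime / len(outages)
```

For example, 43.2 minutes of total downtime in a 30-day window corresponds to roughly 99.9% availability, which is a handy reference point when agreeing on targets.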

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Security

This dimension measures how safe and trustworthy the RAG system is. Security depends on the protection and control of the RAG system, such as the authentication and authorization mechanisms, the encryption and decryption methods, the data privacy and integrity checks, and the compliance and audit standards. A secure RAG system should be able to prevent unauthorized access and manipulation of the system and the data, and ensure the confidentiality and accountability of the users and the responses.
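In a RAG system, one place where authorization matters is between retrieval and generation: documents the user may not read should never reach the prompt. The sketch below filters retrieved documents by group membership. The `Document` type, its `allowed_groups` field, and the group-based model are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: FrozenSet[str]  # groups permitted to read this document


def authorized_context(retrieved: List[Document], user_groups: Set[str]) -> List[Document]:
    """Keep only documents that share at least one group with the user.

    Documents with no allowed groups are dropped (deny by default).
    """
    return [doc for doc in retrieved if doc.allowed_groups & user_groups]


docs = [
    Document("hr-001", "Salary bands", frozenset({"hr"})),
    Document("eng-042", "Deployment guide", frozenset({"eng", "hr"})),
    Document("misc-007", "Unlabelled note", frozenset()),
]
context = authorized_context(docs, {"eng"})  # only eng-042 survives
```

Filtering after retrieval is simple to reason about, although many vector stores can also apply such filters at query time, which avoids retrieving restricted documents in the first place.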

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Usability

This dimension measures how easy and convenient the RAG system is for the users. Usability depends on the user interface and experience of the RAG system, such as the input and output formats, the feedback and guidance features, the customization and personalization options, and the accessibility and compatibility standards. A usable RAG system should be able to provide a clear and intuitive way for the users to interact with the system and receive the responses that meet their needs and expectations.

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Data Consistency

This dimension measures how accurate and up-to-date the data used by the RAG system is. Data consistency depends on the source and quality of the data, such as the document collection, the knowledge base, the language model, and the query and response logs. A data-consistent RAG system should ensure that the data is valid, complete, and current, and that the responses reflect the latest and most relevant information available.
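A practical check for this dimension is index freshness: has every source document been re-indexed since it last changed? The sketch below compares last-modified timestamps from the source with indexing timestamps. The two dictionaries and the function names are assumptions; a real system would read these from its document store and its index metadata.

```python
from datetime import datetime, timezone
from typing import Dict, List


def stale_documents(source_modified: Dict[str, datetime], indexed_at: Dict[str, datetime]) -> List[str]:
    """IDs of documents that are missing from the index or indexed before their last change."""
    stale = []
    for doc_id, modified in source_modified.items():
        indexed = indexed_at.get(doc_id)
        if indexed is None or indexed < modified:
            stale.append(doc_id)
    return sorted(stale)


def freshness_ratio(source_modified: Dict[str, datetime], indexed_at: Dict[str, datetime]) -> float:
    """Share of source documents whose indexed copy is current (1.0 for an empty source)."""
    if not source_modified:
        return 1.0
    return 1 - len(stale_documents(source_modified, indexed_at)) / len(source_modified)


utc = timezone.utc
source = {
    "policy": datetime(2024, 12, 1, tzinfo=utc),
    "faq": datetime(2024, 12, 5, tzinfo=utc),
    "pricing": datetime(2024, 12, 8, tzinfo=utc),
    "new-doc": datetime(2024, 12, 9, tzinfo=utc),
}
index = {
    "policy": datetime(2024, 12, 2, tzinfo=utc),
    "faq": datetime(2024, 12, 4, tzinfo=utc),   # indexed before the last edit
    "pricing": datetime(2024, 12, 8, tzinfo=utc),
}
```

Here `faq` and `new-doc` would be reported as stale, so half of the corpus is current.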

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Cost-Effectiveness

This dimension measures how efficient and economical the RAG system is. Cost-effectiveness depends on the resource utilization and optimization of the RAG system, such as the hardware and software requirements, the energy and bandwidth consumption, the maintenance and update costs, and the return on investment and value proposition. A cost-effective RAG system should be able to provide a high-quality service at a low cost, and generate a positive impact and benefit for the users and the stakeholders.
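For LLM-backed systems, a large part of the running cost is often driven by token usage, so cost per query is a useful starting metric. The sketch below estimates it from prompt and completion token counts. The prices are placeholders, not real vendor rates, and the `TokenPrices` type and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrices:
    input_per_1k: float   # price per 1,000 prompt tokens (placeholder value)
    output_per_1k: float  # price per 1,000 completion tokens (placeholder value)


def query_cost(prompt_tokens: int, completion_tokens: int, prices: TokenPrices) -> float:
    """Estimated LLM cost of one query, in the currency used by `prices`."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (prompt_tokens / 1000 * prices.input_per_1k
            + completion_tokens / 1000 * prices.output_per_1k)


def monthly_cost(prompt_tokens: int, completion_tokens: int, prices: TokenPrices,
                 queries_per_month: int) -> float:
    """Projected monthly LLM cost, assuming every query has the same token profile."""
    return query_cost(prompt_tokens, completion_tokens, prices) * queries_per_month


prices = TokenPrices(input_per_1k=0.5, output_per_1k=1.5)  # illustrative only
per_query = query_cost(prompt_tokens=2000, completion_tokens=1000, prices=prices)
```

In RAG, the retrieved documents are part of the prompt, so the number and length of retrieved chunks directly increase `prompt_tokens`. This sketch leaves out infrastructure, storage, and embedding costs, which would need to be added for a full picture.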

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Language Model (LLM) NFRs

RAG uses a pre-trained language model (LM) to encode the input query and retrieved documents, and decode the output response. The responses depend on the capabilities and characteristics of the LM. So, we need to consider the non-functional requirements (NFRs) for the LM that affect the RAG system.

The diagram below shows the five dimensions of the language model NFRs and their metrics.

As usual, let us drill down into each of these dimensions and the metrics for each of these NFR dimensions.

Content Authenticity

This dimension measures how original and credible the content generated by the LM is. Content authenticity depends on the training data and the generation methods of the LM, such as the data sources, the data filtering, the data augmentation, the sampling strategies, and the decoding algorithms. A content-authentic LM should generate responses that are not plagiarized, fabricated, or misleading, and that are consistent with the facts and the evidence.

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Information Integrity

This dimension measures how accurate and up-to-date the information used and generated by the LM is. Information integrity depends on the knowledge and the reasoning abilities of the LM, such as its factual, commonsense, and world knowledge, and its capacity for inference, deduction, and induction. An LM with strong information integrity should use and generate information that is correct, current, and complete, and that has been verified and validated.

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Contextual Alignment

This dimension measures how well the LM’s responses fit the user’s query. We use Relevance Scores and Cosine Similarity to guide us. These metrics help ensure that the responses are accurate, relevant, and meaningful to users.
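Cosine similarity compares the direction of two vectors: 1 means they point the same way, 0 means they are orthogonal, and -1 means they point in opposite directions. Here is a minimal pure-Python sketch. It assumes the query and the response have already been turned into embedding vectors by some embedding model, which is outside the scope of this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector has zero length, since the angle is undefined.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings of a query and a response.
query_vec = [1.0, 2.0, 3.0]
response_vec = [2.0, 4.0, 6.0]
alignment = cosine_similarity(query_vec, response_vec)
```

Because cosine similarity ignores vector length, a short and a long response on the same topic can score similarly, which is usually what we want when judging alignment.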

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Response Quality

This dimension measures how fluent and informative the content generated by the LM is. Response quality depends on the linguistic and the semantic skills of the LM, such as the vocabulary, the grammar, the syntax, the style, the tone, the clarity, the specificity, and the completeness. It’s about the consistent caliber of the LM’s responses. Consistency Rate and Novelty Score highlight this dimension’s focus on delivering responses that are both uniform in quality and rich in variety, avoiding the pitfall of predictable or monotonous content.
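The blog names Consistency Rate and Novelty Score without fixing a formula, so the sketch below uses two simple stand-ins, both of which are assumptions. Consistency is taken as the share of repeated answers to the same query that match the most common answer after normalization. Novelty is approximated with the distinct-n ratio, the number of unique n-grams divided by the total number of n-grams.

```python
from collections import Counter
from typing import List


def consistency_rate(responses: List[str]) -> float:
    """Share of responses equal to the most common one (case and spacing ignored)."""
    if not responses:
        raise ValueError("responses must not be empty")
    normalized = [" ".join(r.lower().split()) for r in responses]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


def distinct_n(responses: List[str], n: int = 2) -> float:
    """Unique n-grams divided by total n-grams across all responses (0.0 if none)."""
    ngrams = []
    for response in responses:
        tokens = response.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Four answers to the same query, sampled from repeated runs.
runs = ["Paris", "paris ", "Lyon", "Paris"]
rate = consistency_rate(runs)
```

Exact-match consistency is strict; for free-form answers, comparing embeddings of the responses may be a better fit. The two scores pull in opposite directions, so they are best read together.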

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Fairness and Diversity

This dimension measures how well the LM creates unbiased and diverse content. Fairness and diversity depend on the ethical and social values reflected in the LM, such as fairness, diversity, inclusivity, sensitivity, empathy, and creativity. A fair and diverse LM should generate content that is respectful and representative of different perspectives and backgrounds. The content should not be discriminatory, offensive, or harmful.

The subsequent table provides a snapshot of the metrics used to measure this NFR category.

Conclusion

In this blog, we have discussed the non-functional requirements (NFRs) for building and deploying a Retrieval-Augmented Generation (RAG) system. We have identified and explained two types of NFRs: system NFRs and language model NFRs. System NFRs cover the dimensions of performance, scalability, availability, security, usability, data consistency, and cost-effectiveness. Language model NFRs cover the dimensions of content authenticity, contextual alignment, response quality, information integrity, and fairness and diversity. We have also discussed the metrics and methods for evaluating and improving each of these dimensions for a RAG system.

I hope that this blog has provided you with a comprehensive and practical guide for designing and developing a RAG system that meets the expectations and needs of the users and the stakeholders. RAG is a promising and exciting technique that can enable a new generation of intelligent and interactive applications that answer a wide range of questions and generate many kinds of text. However, it also poses many challenges and risks that require careful consideration and mitigation. By following the NFRs presented in this blog, you can work toward a RAG system that is not only functional, but also high-quality, reliable, secure, user-friendly, consistent, and cost-effective.


Written by CryptoGPT

Creating impact through Technology | #CTO at #Microsoft| Data & AI Strategy | Cloud Computing | Design Thinking | Blogger | Public Speaker | Published Author
