Main stages of Tovie Data Agent operation
Tovie Data Agent is built on Retrieval-Augmented Generation (RAG), an approach that combines search with response generation for user queries. RAG integrates the internal knowledge of an LLM with extensive and continually updated external data sources, such as databases, websites, and other information resources.
RAG improves the accuracy and reliability of response generation, especially for tasks requiring deep knowledge. The approach also enables continual updating of knowledge and integration of domain-specific information.
Depending on the project settings, one of the following pipelines is executed:
- Semantic pipeline
- Agentic pipeline
Indexing the knowledge base
The knowledge base is indexed from various data sources: your files in DOCX, PDF, TXT, and other formats, as well as external knowledge storage services such as Confluence. The source data is pre-processed into Markdown text files, which are then submitted for chunking and vectorisation.
Chunking
Chunking is the division of text into small fragments (chunks) to improve search quality. When searching and preparing an answer, the system uses these chunks instead of, or alongside, entire text documents, depending on the settings.
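A minimal sketch of one common chunking strategy, fixed-size chunks with overlap, is shown below. The word-based splitting, chunk size, and overlap values are illustrative assumptions, not Tovie Data Agent's actual settings.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size words.

    The sizes are illustrative defaults. Overlapping chunks help preserve
    context that would otherwise be lost at chunk boundaries.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(words), 1), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
    return chunks
```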
Vectorisation
Vectorisation is the transformation of text into a vector representation. Each chunk is converted into a vector, which is then used to find the most relevant answer to the user query. Vectorisation enables the use of geometric and algebraic operations for data comparison and analysis.
Vectorised chunks (embeddings) are stored in a vector storage optimised for fast search and data access.
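The sketch below illustrates this step: each chunk is converted into a vector and stored. The embed function is a toy stand-in (a hashed bag-of-words), since a real deployment would call an embedding model; the 8-dimensional vectors and the list-based storage are assumptions made only to keep the example self-contained.

```python
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy embedding: hashed bag-of-words, L2-normalised.

    A real system would call an embedding model here; this stand-in
    exists only so the example runs on its own.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# A minimal in-memory "vector storage": (chunk, embedding) pairs.
chunks = [
    "Tovie Data Agent indexes files and external sources.",
    "Chunks are vectorised and stored for fast retrieval.",
]
vector_store = [(chunk, embed(chunk)) for chunk in chunks]
```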
Retrieval-Augmented Generation (RAG)
The RAG stage differs depending on the pipeline selected for the project.
Semantic pipeline
Retrieving
The user query is rephrased to take the chat history into account. The rephrased query, like the data chunks, is converted into a vector representation. Retrieval is then performed by comparing the vector representations of the query and the data: the vector storage automatically selects the most relevant chunks based on vector similarity.
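Reusing embed and vector_store from the sketch above, retrieval can be illustrated as a cosine-similarity search. The linear scan below is only for illustration; a real vector storage uses optimised index structures.

```python
def retrieve(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k chunks most similar to the (rephrased) query."""
    query_vec = embed(query)
    scored = [
        # Vectors are L2-normalised, so the dot product equals cosine similarity.
        (chunk, sum(q * v for q, v in zip(query_vec, vec)))
        for chunk, vec in vector_store
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```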
Re-ranking
Once the vector storage has found the most relevant chunks, they are re-ranked by a special model called a re-ranker. As a result, the relevance score of each chunk may change compared to the score provided by the vector storage.
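The sketch below re-scores the retrieved candidates. The word-overlap scorer is a toy stand-in for a real re-ranker, which is a model that reads the query and each chunk jointly.

```python
def rerank_score(query: str, chunk: str) -> float:
    """Toy stand-in for a re-ranker: fraction of query words found in the chunk.

    A real re-ranker model scores the query and chunk together.
    """
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / max(len(query_words), 1)

def rerank(query: str, candidates: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Re-score candidates; their order and scores may change."""
    rescored = [(chunk, rerank_score(query, chunk)) for chunk, _ in candidates]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```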
Response generation
At the final stage, the system generates a response to the user’s query. The system submits the query and the chunks selected in the previous stages to the LLM. The LLM then generates a response using the information provided as well as its own internal knowledge.
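As an illustration, the final prompt might be assembled as below. The template wording is an assumption, not the product's actual prompt, and the call to the LLM itself is left out.

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble a generation prompt from the query and the selected chunks."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using the context below and your own knowledge.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

# The resulting prompt would then be sent to the LLM for generation.
```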
Agentic pipeline
The AI agent sends the user’s query to the LLM, along with descriptions of the knowledge base functions that the agent can use.
The LLM generates a specialised response that identifies the function to call and provides the arguments. For instance, it might specify a function for chunk retrieval.
The AI agent then executes the function and feeds the result back to the LLM. The LLM analyses the result and may request another function call if needed. This iterative process, known as the agent loop, can repeat multiple times.
If the LLM’s response does not specify a function to call, it is considered the final response, which the AI agent delivers to the user.
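A minimal sketch of such an agent loop is shown below. The llm callable, the shape of its reply, and the single retrieve_chunks function are all illustrative assumptions; Tovie Data Agent's actual interfaces may differ.

```python
def retrieve_chunks(query: str) -> str:
    """Hypothetical knowledge base function the LLM can ask the agent to call."""
    return "chunks relevant to: " + query

TOOLS = {"retrieve_chunks": retrieve_chunks}

def agent_loop(user_query: str, llm, max_steps: int = 5) -> str:
    """Run the agent loop until the LLM returns a plain (final) response."""
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        # `llm` is an assumed callable that returns either a final answer or
        # a {"tool_call": {"name": ..., "arguments": ...}} request.
        reply = llm(messages, tools=TOOLS)
        call = reply.get("tool_call")
        if call is None:
            # No function requested: deliver the final response to the user.
            return reply["content"]
        # Execute the requested function and feed the result back to the LLM.
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "name": call["name"], "content": result})
    return "Agent loop stopped after reaching the step limit."
```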