Indexing settings

Indexing the data loaded into the knowledge base consists of several stages:

  1. Data processing: converting the text into Markdown (MD) format, which is used for training.
  2. Chunking: dividing text into fragments (chunks).
  3. Vectorisation: converting the chunks into vector representations (embeddings).
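
The whole pipeline can be pictured as a short script. The sketch below is a minimal illustration of the three stages, not the product's implementation; every function in it is an illustrative stand-in.

```python
# A minimal sketch of the three indexing stages. All function names are
# illustrative stand-ins, not part of the product API.

def convert_to_markdown(raw_text: str) -> str:
    # Stage 1, data processing: the real pipeline converts source files
    # (PDF, DOCX, HTML, ...) into Markdown; here the text passes through.
    return raw_text

def split_into_chunks(markdown: str, max_size: int = 70) -> list[str]:
    # Stage 2, chunking: a naive fixed-length split (the actual methods
    # are described in the Chunking section below).
    return [markdown[i:i + max_size] for i in range(0, len(markdown), max_size)]

def vectorise(chunk: str) -> list[float]:
    # Stage 3, vectorisation: a toy stand-in for an embedding model.
    return [float(ord(c)) for c in chunk[:8]]

document = "Indexing turns raw documents into searchable vectors. " * 3
chunks = split_into_chunks(convert_to_markdown(document))
embeddings = [vectorise(c) for c in chunks]
print(f"{len(chunks)} chunks, {len(embeddings)} embeddings")
```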

In the Settings → Indexing tab, you can change the parameters for chunking and vectorisation.

Vectorisation

The Vectoriser model parameter determines the language model for text vectorisation. This model will vectorise both your data and user queries:

  • text-embedding-3-large: a model by OpenAI. When you use this model, your data is sent to OpenAI’s servers.
  • intfloat/multilingual-e5-large: a model hosted on Tovie AI’s servers.
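
To illustrate how an embedding model handles both data and queries, here is a hedged sketch that runs intfloat/multilingual-e5-large locally via the open-source sentence-transformers library. In the product, the model runs on Tovie AI’s servers; this local setup is an assumption for illustration only.

```python
# Illustrative only: the product hosts intfloat/multilingual-e5-large on
# Tovie AI's servers; this sketch runs the same open model locally.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("intfloat/multilingual-e5-large")

# E5 models expect "passage: " and "query: " prefixes on the input text.
chunk_vectors = model.encode([
    "passage: Indexing consists of data processing, chunking, and vectorisation.",
    "passage: The vectoriser model embeds both your data and user queries.",
])
query_vector = model.encode(["query: What does the vectoriser model do?"])

# Relevant chunks are typically retrieved by cosine similarity.
print(cos_sim(query_vector, chunk_vectors))
```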

Chunking

Chunking method

The Chunking method parameter determines how the text will be split into chunks:

  • By length: the text will be split into chunks of up to a fixed length, respecting word boundaries.
  • Using LLM: the text will be chunked by a language model. In this case, chunking takes the text hierarchy into account: headings, paragraphs, and section and document titles.

The list of settings depends on the selected chunking method.

  • Max chunk size in characters: the maximum length of one chunk. A code sketch of this splitting logic appears after this list.

    How the text will be chunked: suppose the Max chunk size in characters setting is 70 and you have a text consisting of 2 sentences, 100 characters each.

    The text will be divided into 3 chunks:

    1. The first 70 characters of the first sentence.
    2. The remaining 30 characters of the first sentence and the first 40 of the second.
    3. The remaining 60 characters of the second sentence.
  • Language: the language of the source documents. This setting helps chunk the text correctly. If your sources are in several languages, select the one most used in queries to the knowledge base.
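
The sketch below shows one plausible reading of the By length method, assuming a greedy fill that packs each chunk up to the limit without splitting words; the product’s exact algorithm may differ.

```python
# A sketch of length-based chunking under the assumption of a greedy fill
# that respects word boundaries; the product's exact algorithm may differ.
def chunk_by_length(text: str, max_chars: int) -> list[str]:
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word  # a word longer than max_chars still becomes its own chunk
    if current:
        chunks.append(current)
    return chunks

# Two 100-character "sentences" with a 70-character limit, as in the
# example above, yield three chunks of roughly 70, 70, and 60 characters.
text = ("word " * 20).strip() + ". " + ("word " * 20).strip() + "."
for chunk in chunk_by_length(text, max_chars=70):
    print(len(chunk), chunk[:30] + "...")
```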

LLM settings

The LLM settings are used to obtain image descriptions and, in the case of LLM-based chunking, to generate chunks as well.

  • Model: select one of the available language models.
  • Max tokens in request: limits the number of tokens that can be sent to the LLM.
  • Max tokens in response: limits the number of tokens that the LLM can generate in one iteration.
  • Temperature: adjusts the creativity level of responses. Higher temperature values produce more creative and less predictable results.
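
As an illustration, the sketch below shows how these three settings typically map onto a chat-completion call, using the OpenAI Python client as a stand-in. The knowledge base applies the settings internally; the client, model name, and token estimate here are assumptions.

```python
# Illustrative only: the knowledge base applies these settings internally.
# The client, model name, and token estimate below are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

MAX_TOKENS_IN_REQUEST = 4000   # input budget for the prompt
MAX_TOKENS_IN_RESPONSE = 500   # output budget, passed to the model
TEMPERATURE = 0.7              # higher values -> less predictable output

prompt = "Describe the image on this page in one sentence."

# The input budget is enforced before sending; here we approximate the
# token count as one token per four characters.
if len(prompt) // 4 > MAX_TOKENS_IN_REQUEST:
    raise ValueError("Prompt exceeds the request token budget")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in for the model selected in the settings
    messages=[{"role": "user", "content": prompt}],
    max_tokens=MAX_TOKENS_IN_RESPONSE,
    temperature=TEMPERATURE,
)
print(response.choices[0].message.content)
```
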
Tip

To see how your source is chunked, download the archive with chunks:

  1. Go to the Sources section and hover over the desired source.
  2. Click Chunk archive.

When testing the knowledge base, you can also see which chunks are selected to generate the response.