To manage the settings of a knowledge base, select it, then open the [Settings] tab. There you can adjust the name, description, indexing method, embedding model, and retrieval settings to tune how the knowledge base performs and how easily it can be found.
Knowledge Base Name
Assign a unique name so that this knowledge base is easy to tell apart from others when you browse and manage them.
Knowledge Description
Provide a clear description of the content and purpose of the documents housed within the knowledge base, ensuring users understand what information is available.
Indexing Methods
- High-Quality Mode: Uses a configurable embedding model to convert text chunks into numerical vectors. This allows efficient compression and storage, and improves the accuracy of LLM responses to user queries.
- Economical Mode: Uses an offline vector engine with keyword indexing. It lowers operating costs because it consumes no additional tokens, at the expense of reduced search precision. This mode supports inverted indexing only.
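To make the inverted indexing used by Economical Mode more concrete, here is a minimal sketch of the general technique. It is not the product's actual implementation: the function names `build_inverted_index` and `keyword_lookup`, the whitespace tokenizer, and the sample chunks are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(chunks):
    """Map each lowercase token to the set of chunk ids that contain it."""
    index = defaultdict(set)
    for chunk_id, text in enumerate(chunks):
        # Naive whitespace tokenizer (an assumption; real engines do more).
        for token in text.lower().split():
            index[token].add(chunk_id)
    return index


def keyword_lookup(index, query):
    """Return the ids of chunks that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample data.
chunks = [
    "reset your password from the login page",
    "billing invoices are sent monthly",
    "contact support to reset billing details",
]
index = build_inverted_index(chunks)
print(keyword_lookup(index, "reset billing"))
```

Because lookups match literal tokens, no embedding model is called and no tokens are spent, but a query phrased with different words than the documents finds nothing. That is the precision trade-off described above.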
Embedding Model
You can change the embedding model used by the knowledge base. Switching models triggers re-embedding of all documents, and the previous embeddings are erased to maintain data integrity.
Retrieval Settings
Retrieval runs both a full-text search and a vector search, then reorders the combined results to surface those that best match the user's query.
Weight Settings
Customize the balance between semantic and keyword priorities to match your organization's search needs. Full-text (keyword) search ensures precision when specific terms are known, while semantic search ranks results by vector distance, which is particularly useful in multilingual contexts.
- Semantic Value of 1: Use semantic search only, which retrieves relevant content even when the query does not exactly match the terms in the knowledge base.
- Keyword Value of 1: Use keyword search only for precise, fast lookups, ideal for large databases or when specific terms are known.
- Custom Weights: Besides the single-mode options, set custom weights to find the balance between keyword and semantic search that best serves your business objectives.
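The weight options above can be pictured as a weighted blend of two scores per chunk. The sketch below is one plausible way to combine them, not the product's confirmed formula. It assumes both score sets are already normalized to the range 0 to 1 and that the keyword weight is `1 - semantic_weight`. The names `combine_scores` and `rank` and the sample scores are hypothetical.

```python
def combine_scores(semantic, keyword, semantic_weight):
    """Blend per-chunk scores; the keyword weight is 1 - semantic_weight."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must be between 0 and 1")
    keyword_weight = 1.0 - semantic_weight
    chunk_ids = set(semantic) | set(keyword)
    # A chunk missing from one result set contributes 0 for that search.
    return {
        cid: semantic_weight * semantic.get(cid, 0.0)
        + keyword_weight * keyword.get(cid, 0.0)
        for cid in chunk_ids
    }


def rank(scores):
    """Order chunk ids from highest to lowest score, ties broken by id."""
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical normalized scores from the two searches.
semantic = {"a": 0.9, "b": 0.4}
keyword = {"b": 1.0, "c": 0.6}

print(rank(combine_scores(semantic, keyword, 1.0)))  # semantic only
print(rank(combine_scores(semantic, keyword, 0.0)))  # keyword only
print(rank(combine_scores(semantic, keyword, 0.5)))  # custom weights
```

With a semantic weight of 1, chunk `a` ranks first even though it has no keyword hit. With a weight of 0, the keyword match `b` leads. Intermediate weights trade one signal against the other.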
TopK
Define the number of text chunks retrieved based on their similarity to the user's question. The default value is 3; higher values return more text segments.
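As an illustration of what TopK controls, the following sketch keeps the `k` highest-scoring chunks from a scored list. The function name `top_k` and the sample scores are hypothetical. Only the default of 3 comes from the description above.

```python
import heapq


def top_k(scored_chunks, k=3):
    """Return the k (chunk_id, score) pairs with the highest scores, best first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[1])


# Hypothetical similarity scores for five chunks.
scored = [
    ("chunk-a", 0.82),
    ("chunk-b", 0.35),
    ("chunk-c", 0.67),
    ("chunk-d", 0.91),
    ("chunk-e", 0.50),
]

print(top_k(scored))       # default k=3
print(top_k(scored, k=5))  # a larger k returns more segments
```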
Score Threshold
Establish a similarity score threshold, defaulting to 0.5, to filter out less relevant text chunks. Raising the score threshold results in fewer, but potentially more relevant, retrieved segments.
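The effect of the score threshold can be sketched as a simple filter over scored chunks. The function name `filter_by_score` and the sample scores are hypothetical, and treating a score exactly equal to the threshold as passing is an assumption rather than documented behavior.

```python
def filter_by_score(scored_chunks, threshold=0.5):
    """Keep (chunk_id, score) pairs whose score meets the threshold."""
    # Inclusive comparison is an assumption made for this sketch.
    return [(cid, score) for cid, score in scored_chunks if score >= threshold]


# Hypothetical similarity scores.
scored = [
    ("chunk-a", 0.82),
    ("chunk-b", 0.35),
    ("chunk-c", 0.67),
    ("chunk-e", 0.50),
]

print(filter_by_score(scored))                 # default threshold 0.5
print(filter_by_score(scored, threshold=0.7))  # a stricter threshold keeps fewer
```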