
Vector database

What are vectors

Vectors enhance the Data Hub by generating vector embeddings for entries in a collection. Once configured, collections can be searched using an AI-powered, natural language search, which supports both keyword-based and semantic, full-text queries.

When vectors are enabled, a vector embedding representing the content of each record is created. The same is done with search queries. When a search is performed, the vector embedding of each record is compared to that of the search query to create a score. This enables the records to be ranked, with the most relevant results shown first.
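
The comparison step can be pictured with a small sketch. It assumes the score is a cosine similarity, which is a common choice but is not confirmed by this article. The function names and the toy three-dimensional embeddings are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def rank_records(query_embedding, record_embeddings):
    """Return (record_id, score) pairs, highest score first."""
    scored = [
        (record_id, cosine_similarity(query_embedding, embedding))
        for record_id, embedding in record_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings; real embeddings have far more dimensions.
records = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [0.7, 0.7, 0.0],
}
ranking = rank_records([1.0, 0.1, 0.0], records)
```

Here record "a" points in almost the same direction as the query and is ranked first, followed by "c" and then "b".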

Navigation to the Vectors page

To get to the Vectors page from the dashboard, find the Data Hub Tile and select Vectors.
If the tile shown is deactivated, you do not yet have a license. In this case, please contact sawsconnector@saws.de.

If you are not already on the dashboard, click the leftmost button in the toolbar to navigate to the dashboard page.

If you are already in the Data Hub, you can select the “Vectors” option from the dropdown menu in the toolbar.

Alternatively, you may navigate to the settings of a Data Hub collection and select the Vector tab.

Creating a vectorization

To create a new vectorization of your collection, click on the plus button in the top-left corner of the vector table.

This will open an editor with the following settings:

  • Name
  • Data collection
  • List of vectors
  • Active
  • Description

Fields

Name

The name should be descriptive and unique.

Data Collection

The data collection that should be vectorized. This may already be filled in if you navigated to the page via the data collection settings.

List of vectors

The list of vectors to be created for each record. Use the plus button to add new vectors and select the settings as explained below.

Name

Name for one vector. It should be descriptive and unique.

Assignment

Select the AI assignment to be used for vectorization. The edit button allows you to edit the selected assignment or create a new one if none has been selected yet. Only assignment configurations with the purpose Vectorization can be selected. See the AI assignment article for details on assignments.

Content

The content is a template that defines the text that is vectorized for each record. Using Markdown for the template is recommended, but not required. To include fields from the record, use placeholder variables of the form ${record.<field-name>}. If some of the fields contain HTML content, you can convert them to Markdown with ${record.<field-name>|markdown}, which can help to save tokens.

On the right is a Markdown preview of the template for one record. If you are using prompts, you can also show a preview of the preprocessed content.
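
To illustrate how such a template expands, here is a rough sketch of placeholder substitution. It is not the Data Hub's implementation: the regular expression, the `render` function, and the sample record are assumptions, and the `markdown` filter is replaced by a crude stand-in that only drops HTML tags instead of producing real Markdown.

```python
import re

# Matches ${record.<field-name>} with an optional |filter suffix.
PLACEHOLDER = re.compile(r"\$\{record\.([A-Za-z0-9_-]+)(?:\|([a-z]+))?\}")

def strip_tags(html):
    """Crude stand-in for HTML-to-Markdown conversion: removes tags only."""
    return re.sub(r"<[^>]+>", "", html)

# Hypothetical filter table; only the filter named in the article is listed.
FILTERS = {"markdown": strip_tags}

def render(template, record):
    """Replace every placeholder with the matching record field."""
    def substitute(match):
        field, filter_name = match.group(1), match.group(2)
        value = str(record.get(field, ""))  # missing fields become empty text
        if filter_name in FILTERS:
            value = FILTERS[filter_name](value)
        return value
    return PLACEHOLDER.sub(substitute, template)

template = "# ${record.title}\n\n${record.body|markdown}"
record = {"title": "Reset a password", "body": "<p>Open <b>Settings</b>.</p>"}
rendered = render(template, record)
```

The rendered text contains the title as a Markdown heading followed by the body without its HTML tags, which is the kind of text that would then be sent for vectorization.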

Prompt

Before vectorizing the content, you have the option of pre-processing it with an AI prompt. You can either use one of the prompt presets or create your own. Each prompt configuration contains two prompts:

  • Pre Store Prompt: Used to pre-process the record content before vectorizing.
  • Pre Search Prompt: Used to pre-process user search queries.

See below for more details on prompts.

Weight

If multiple vectors are configured and used during a search, each one can be given a different weighting.
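
One plausible way to picture weighting is a weighted average of the per-vector scores. The article does not state the exact formula, so the sketch below is an assumption; the function name and the vector name `fulltext` are hypothetical.

```python
def combine_scores(scores, weights):
    """Weighted average of per-vector scores (assumed formula).

    Vectors without a configured weight count with weight 1.0.
    """
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights.get(name, 1.0) for name, score in scores.items())
    return weighted_sum / total_weight

# The "summarize" vector counts twice as much as the "fulltext" vector.
weights = {"summarize": 2.0, "fulltext": 1.0}
combined = combine_scores({"summarize": 0.9, "fulltext": 0.3}, weights)
```

With these numbers the combined score is about 0.7, closer to the score of the more heavily weighted vector.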

Active

If active, new records will be vectorized automatically.

Description

You may provide a description for your own use.

Once everything is configured, save the vector configuration and return to the previous page.

Overview

Once you return to the main vector page, you will see an overview of all your vector configurations. You can delete unused configurations or edit them, which brings you back to the edit page. Once a vector configuration has been saved and is active, all new records are vectorized automatically. Existing records, however, are not vectorized automatically and must be synced manually by pressing the sync button. Check the “Synced entries” column to see whether a collection has been fully synchronized.

Searching with vectorization

To use the newly configured vectorization in a search, a configured Export endpoint is required. When querying the endpoint, provide an additional vector field in the request body, as shown below (for more details, see Using an Export Endpoint). The text field contains the search query, while the optional vectorName field contains the name of the vector to search with. If no vectorName is specified, all vectors are used with their configured weights.

{
  "limit": 10,
  "start": 0,
  "filter": {},
  "vector": {
    "text": "query",
    "vectorName": "summarize"
  }
} 
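
The request body above can also be assembled programmatically. The following sketch only builds and serializes the body. It makes no HTTP call, because the endpoint URL and authentication depend on your Export endpoint configuration. The helper name `build_vector_query` is hypothetical.

```python
import json

def build_vector_query(text, vector_name=None, limit=10, start=0, filter_=None):
    """Build the request body for a vector search.

    vectorName is left out when vector_name is None, so that all
    configured vectors are used with their weights.
    """
    vector = {"text": text}
    if vector_name is not None:
        vector["vectorName"] = vector_name
    return {
        "limit": limit,
        "start": start,
        "filter": filter_ or {},
        "vector": vector,
    }

# Search with all configured vectors.
body_all = build_vector_query("how do I reset my password")

# Search with one named vector only.
body_one = build_vector_query("how do I reset my password", vector_name="summarize")
payload = json.dumps(body_one)
```

The resulting `payload` string has the same shape as the JSON example and can be sent as the body of the request to the Export endpoint.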

Prompt

Prompts can be created and edited directly from the vector config page, as shown above, but they also have their own Prompts page. To get there from within the Data Hub, open the dropdown menu in the toolbar and select Prompts.

To create a new prompt, click on the plus button in the top-left corner of the prompt table.

This will open an editor with the following settings:

  • Name
  • Pre Store Prompt
  • Pre Search Prompt
  • Description

Name

The name should be descriptive and unique.

Pre Store Prompt

The content will be passed to the default Transformation assignment with this prompt before vectorization.

Pre Search Prompt

The search query will be passed to the default Transformation assignment with this prompt before vectorization.
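
The order of operations for both prompts can be sketched as follows. This is a simplified picture, not the actual implementation: `transform` and `embed` stand for the default Transformation assignment and the Vectorization assignment, and the fake versions used here are hypothetical stand-ins so the example runs without any AI service.

```python
def preprocess_then_embed(text, prompt, transform, embed):
    """Run the optional prompt step first, then embed the result."""
    processed = transform(prompt, text) if prompt else text
    return embed(processed)

# Hypothetical stand-ins for the AI assignments.
def fake_transform(prompt, text):
    return f"{prompt}: {text}"

def fake_embed(text):
    return [float(len(text))]

# Record content goes through the Pre Store Prompt before vectorization.
record_vector = preprocess_then_embed("Open Settings.", "Summarize", fake_transform, fake_embed)

# Without a prompt, the search query is embedded as-is.
query_vector = preprocess_then_embed("reset password", None, fake_transform, fake_embed)
```

The same function covers both directions: record content is passed with the Pre Store Prompt, search queries with the Pre Search Prompt.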

Description

You may provide a description for your own use.

Once everything is configured, save the prompt configuration and return to the previous page.

Overview

Once you return to the main Prompts page, you will see an overview of all your prompt configurations. You can delete unused prompt configurations or edit them, which brings you back to the edit page. Some preset configurations are provided for recurring use cases.

Note

Ensure that a default transformation assignment exists in the AI assignment configurations.
