General Vector Search Score Boosting Strategies

Summary

Below are vector search score boosting examples across many of Qarbine’s supported databases. The example use case applies a 10% score boost to items with a preferred rating such as “PG13”, in contrast to requiring that rating through a filter. The result set is then sorted by the boosted score, so that personal preferences are slightly prioritized in the recommendations. The code samples use "PG" as the boosted rating value.

Scenario

You provide a movie recommendation service which allows users to find movies by describing what they want in natural language. By embedding movie plots as vectors, the application can match user queries to the most relevant films, even when the input is general or open-ended.
To enhance the recommendations you want to consider user preferences, such as a preference for movies rated PG13. The approaches below ensure users see suggestions that closely match their interests while still offering a broad and personalized selection of movies. To see the impact on sample results, see the detailed MongoDB and Couchbase score boosting writeups.

Boosting Strategy

Some of the examples below multiply the score by 1.10 and others by 0.90. The choice depends on whether the database’s vector search returns a similarity score or a distance score.

Similarity Score (Higher is Better)

Use this when your vector function returns a similarity, such as cosine similarity or dot product. To raise the score for "PG" movies, multiply by 1.10 (a 10% increase). For example:

CASE WHEN rating = 'PG' THEN similarity * 1.10 ELSE similarity END

Distance Score (Lower is Better)

Use this when your vector function returns a distance, such as Euclidean distance or cosine distance. To make "PG" movies appear closer (better), multiply by 0.90 (a 10% reduction). For example:

CASE WHEN rating = 'PG' THEN distance * 0.90 ELSE distance END
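
The two rules above can be sketched as one small helper. This is only an illustration under stated assumptions: the function name `boost_score` and its parameters are hypothetical, and scores are assumed to be non-negative (multiplying a negative similarity by 1.10 would lower it rather than raise it).

```python
def boost_score(score: float, rating: str, higher_is_better: bool,
                preferred_rating: str = "PG", boost: float = 0.10) -> float:
    """Return the score adjusted by `boost` when the rating matches.

    Assumes a non-negative score. `higher_is_better` is True for
    similarity scores and False for distance scores.
    """
    if rating != preferred_rating:
        return score  # no preference match, leave the score untouched
    # Similarity: multiply by 1.10. Distance: multiply by 0.90.
    factor = 1.0 + boost if higher_is_better else 1.0 - boost
    return score * factor


# A similarity of 0.80 becomes roughly 0.88; a distance of 0.20 becomes roughly 0.18.
print(round(boost_score(0.80, "PG", higher_is_better=True), 4))
print(round(boost_score(0.20, "PG", higher_is_better=False), 4))
```

Either way the "PG" item moves toward the front of the result set once the results are sorted in the direction that matches the score type.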

MongoDB's aggregation pipeline does not support boosting in the vector search stage, but you can project a boosted score in a subsequent $addFields stage:

[
{ "$search": { "index": "vector_index", "knnBeta": { "vector": "[user_prompt_embedding]", "path": "embedding", "k": 10 } } },
{ "$addFields": {
"boosted_score": {
"$cond": [
{ "$eq": [ "$rating", "PG" ] },
{ "$multiply": [ "$score", 1.10 ] },
"$score"
]
}
}
},
{ "$sort": { "boosted_score": -1 } },
{ "$limit": 10 }
]

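$score stands for the similarity score value from the vector search. In MongoDB Atlas the score is not a regular document field, so it is typically exposed first in an $addFields or $project stage with { "$meta": "searchScore" } (or "vectorSearchScore" when using the $vectorSearch stage). See the separate detailed writeup on boosting MongoDB scores.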

Vector Databases

Most native vector databases (like Milvus, Pinecone, Qdrant, Weaviate, LanceDB, Kinetica, ChromaDB) do not support conditional boosting in the query. You must apply boosting in logic after retrieving results.
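
Boosting after retrieval could look like the following sketch. It does not use any particular vector database client: the `hits` list, its `score` and `rating` keys, and the function name `rerank_with_boost` are assumptions made for illustration, and the score is assumed to be a similarity (higher is better).

```python
from typing import Any


def rerank_with_boost(hits: list[dict[str, Any]], preferred_rating: str = "PG",
                      factor: float = 1.10, top_k: int = 10) -> list[dict[str, Any]]:
    """Apply a conditional boost to already-retrieved hits and re-sort them."""
    boosted = []
    for hit in hits:
        score = hit["score"]
        if hit.get("rating") == preferred_rating:
            score *= factor  # 10% boost for the preferred rating
        # Copy the hit so the caller's data is not modified.
        boosted.append({**hit, "boosted_score": score})
    # Higher similarity is better, so sort descending.
    boosted.sort(key=lambda h: h["boosted_score"], reverse=True)
    return boosted[:top_k]


# Hypothetical hits as they might come back from a vector database query.
hits = [
    {"title": "Movie A", "rating": "R", "score": 0.90},
    {"title": "Movie B", "rating": "PG", "score": 0.85},
    {"title": "Movie C", "rating": "PG", "score": 0.70},
]
ranked = rerank_with_boost(hits)
print([h["title"] for h in ranked])
```

In this sample, "Movie B" overtakes "Movie A" because its boosted score (about 0.935) exceeds 0.90, while "Movie C" stays last. For a database that returns distances, the same idea applies with a 0.90 factor and an ascending sort.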

AWS Databases

Athena

Athena is a query engine for S3 data and supports SQL, including querying vector data stored in Parquet, CSV, or JSON. However, there are no native vector search functions as of July 2025. If you implement vector search using UDFs or external libraries, you can use SQL logic for boosting.

DocumentDB

DocumentDB (MongoDB-compatible) has vector search capabilities. Like MongoDB itself, boosting is not natively supported within the vector search query. You can use an aggregation pipeline to post-process results and apply boosting.

[
{ "$vectorSearch": { "queryVector": [user_prompt_embedding], "path": "embedding", "k": 10 } },
{ "$addFields": {
"boosted_score": {
"$cond": [
{ "$eq": [ "$rating", "PG" ] },
{ "$multiply": [ "$score", 1.10 ] },
"$score"
]
}
}
},
{ "$sort": { "boosted_score": -1 } },
{ "$limit": 10 }
]

DynamoDB

Native vector search is not part of DynamoDB itself. DynamoDB is mainly a key-value and document store; for advanced search (including boosting), you must use OpenSearch as the search backend, which is enabled through zero-ETL integration.

Keyspaces

Amazon Keyspaces is built for Cassandra workloads and supports CQL, but vector search (such as ANN or KNN queries) is not available in Keyspaces itself. As a result, there is no mechanism for boosting (e.g., giving "PG13" movies a 10% score bump) in queries.

Neptune

Neptune supports native vector search, allowing you to associate embeddings with graph nodes and perform similarity queries directly from the graph database. However, conditional boosting (e.g., multiplying scores for "PG" nodes) is not natively supported in the Cypher query language as of July 2025.

OpenSearch (for DynamoDB, Aurora, RDS, etc.)

OpenSearch supports vector search and boosting via script_score and function_score. You can implement conditional boosting directly in the query:

{
"query": {
"script_score": {
"query": { "match_all": {} },
"script": {
"source": """
double sim = cosineSimilarity(params.query_vector, 'embedding');
if (doc['rating'].value == 'PG') {
sim *= 1.10;
}
return sim;
""",
"params": {
"query_vector": [user_prompt_embedding]
}
}
}
}
}

Azure Databases

Overview

The primary Azure databases supported with vector search and analytics features are:

| Database | Vector Search Support | Boosting in Query? |
|---|---|---|
| Cosmos DB (NoSQL) | Yes (native, incl. MongoDB API and NoSQL API) | Partial (aggregation pipeline) |
| Cosmos DB for MongoDB | Yes (MongoDB vCore and API) | Partial (aggregation pipeline) |
| SQL Database | Yes (vector search in preview/GA) | Yes (CASE in SQL) |
| Database for PostgreSQL | Yes (via pgvector extension) | Yes (CASE in SQL) |
| Database for MySQL | Yes (recent versions) | Yes (CASE in SQL) |
| Managed Instance for Apache Cassandra | Yes (vector search in preview) | No |
| SQL Managed Instance | Yes (vector search in preview/GA) | Yes (CASE in SQL) |

Cosmos DB (MongoDB API)

Boosting (e.g., 10% for "PG" rating) must be done in the aggregation pipeline after the vector search stage. Like MongoDB, Cosmos DB does not support conditional boosting within the vector search query itself. For example:

[
{ "$vectorSearch": { "queryVector": [user_prompt_embedding], "path": "embedding", "k": 10 } },
{ "$addFields": {
"boosted_score": {
"$cond": [
{ "$eq": [ "$rating", "PG" ] },
{ "$multiply": [ "$score", 1.10 ] },
"$score"
]
}
}
},
{ "$sort": { "boosted_score": -1 } },
{ "$limit": 10 }
]

Azure SQL Database / Managed Instance

Boosting is supported natively in SQL using a CASE expression. Use the 1.10 multiplier for similarity scores (higher is better) and the 0.90 multiplier for distance scores (lower is better). The example below works on a cosine distance, so it multiplies by 0.90 and sorts ascending:

SELECT  id,  title,  rating,
VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]') AS distance,
CASE
WHEN rating = 'PG' THEN VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]') * 0.90
ELSE VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]')
END AS boosted_distance
FROM movies
ORDER BY boosted_distance ASC
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;

Microsoft SQL Server (2024+)

This database uses standard SQL CASE operations in the SELECT clause.

SELECT  id,  title,  rating,
VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]') AS distance,
CASE
WHEN rating = 'PG' THEN VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]') * 0.90
ELSE VECTOR_DISTANCE('cosine', embedding, '[user_prompt_embedding]')
END AS boosted_distance
FROM movies
ORDER BY boosted_distance ASC
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;

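The example above works on a distance (lower is better). Similar logic applies for boosting with similarity metrics: multiply by 1.10 and sort in descending order.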

Azure Database for PostgreSQL

As with standard PostgreSQL, use the pgvector extension and boost with a CASE statement.

Azure Database for MySQL

Recent MySQL versions support vector search; boosting is also done with a CASE statement in SQL.

Azure Managed Instance for Apache Cassandra

Vector search is in preview as of July 2025, but conditional boosting is not supported in CQL.

IBM Databases

Cloudant

IBM Cloudant supports search via Lucene-based indexes, which allow you to boost fields at index time using the boost parameter in your JavaScript index function. However, Cloudant does not support dynamic, conditional boosting based on document content (such as boosting "PG13" movies by 10%) at query time.

DataStax

DataStax does not support boosting at the query level.

DB2

This uses standard SQL CASE operations in the SELECT clause.

SELECT  id,  title,  rating,
VECTOR_DISTANCE(embedding, '[user_prompt_embedding]') AS distance,
CASE
WHEN rating = 'PG' THEN VECTOR_DISTANCE(embedding, '[user_prompt_embedding]') * 0.90
ELSE VECTOR_DISTANCE(embedding, '[user_prompt_embedding]')
END AS boosted_distance
FROM movies
ORDER BY boosted_distance ASC
FETCH FIRST 10 ROWS ONLY;

Graph Databases

FalkorDB

FalkorDB supports both graph queries (via Cypher-like syntax) and vector search, including hybrid queries that combine graph traversals with vector similarity. However, conditional boosting (e.g., a 10% score bump for "PG13" movies) is not natively supported within the vector search query itself. You can filter results by metadata (e.g., only return nodes with rating = 'PG13'), but you cannot multiply the similarity score based on a property in the query.
Boosting must be done after retrieval.

TigerGraph

TigerGraph supports vector search and allows you to store and query embeddings alongside graph data. However, TigerGraph’s GSQL does not support in-query conditional boosting of similarity scores based on node properties. You can filter nodes by metadata (e.g., WHERE rating = "PG"), but boosting must be done after retrieval.

Neo4j

Neo4j supports vector search natively (as of Neo4j 5.x and Aura), using the vector.similarity() function in Cypher queries. However, Cypher does not support conditional boosting within the query itself.

Other Databases

Couchbase (SQL++)

Couchbase SQL++ supports vector functions and CASE logic.

SELECT id, title, rating,
VECTOR_SIMILARITY(embedding, '[user_prompt_embedding]') AS similarity,
CASE
WHEN rating = "PG" THEN VECTOR_SIMILARITY(embedding, '[user_prompt_embedding]') * 1.10
ELSE VECTOR_SIMILARITY(embedding, '[user_prompt_embedding]')
END AS boosted_similarity
FROM movies
ORDER BY boosted_similarity DESC
LIMIT 10;

See the separate detailed writeup on boosting Couchbase scores.

ScyllaDB

As of July 2025 ScyllaDB has not released a GA version of vector search.

SingleStore

SingleStore supports vector search and allows you to use SQL expressions for boosting.

SELECT  id,  title,  rating,
DOT_PRODUCT(embedding, '[user_prompt_embedding]') AS similarity,
CASE
WHEN rating = 'PG' THEN DOT_PRODUCT(embedding, '[user_prompt_embedding]') * 1.10
ELSE DOT_PRODUCT(embedding, '[user_prompt_embedding]')
END AS boosted_similarity
FROM movies
ORDER BY boosted_similarity DESC
LIMIT 10;

This query uses DOT_PRODUCT for similarity and boosts "PG" ratings in the same way.

SQL Databases

CockroachDB

CockroachDB uses the <=> operator for vector distance, so boost by multiplying the distance by 0.90.

SELECT  id,  title,  rating,
embedding <=> '[user_prompt_embedding]' AS distance,
CASE
WHEN rating = 'PG' THEN (embedding <=> '[user_prompt_embedding]') * 0.90
ELSE (embedding <=> '[user_prompt_embedding]')
END AS boosted_distance
FROM movies
ORDER BY boosted_distance ASC
LIMIT 10;

If the baseline score is not needed in the results, consider omitting it from the SELECT list to avoid calculating the distance redundantly.

Oracle (23c+)

SELECT  id,  title,  rating,
VECTOR_DISTANCE(embedding, '[user_prompt_embedding]') AS distance,
CASE
WHEN rating = 'PG' THEN VECTOR_DISTANCE(embedding, '[user_prompt_embedding]') * 0.90
ELSE VECTOR_DISTANCE(embedding, '[user_prompt_embedding]')
END AS boosted_distance
FROM movies
ORDER BY boosted_distance ASC
FETCH FIRST 10 ROWS ONLY;

For distance metrics, multiply by 0.90 to boost "PG" items (lower distance is better).

PostgreSQL (pgvector)

pgvector has no dedicated boosting feature, but a single SQL statement can still adjust the score based on a metadata field (like "rating"). Compute the boosted value with a CASE expression in your SELECT list and then sort by it.

SELECT  id,  title,  rating,
1 - (embedding <=> '[user_prompt_embedding]') AS similarity,
CASE
WHEN rating = 'PG' THEN (1 - (embedding <=> '[user_prompt_embedding]')) * 1.10
ELSE (1 - (embedding <=> '[user_prompt_embedding]'))
END AS boosted_similarity
FROM movies
ORDER BY boosted_similarity DESC
LIMIT 10;

The CASE statement applies a 10% boost to the similarity score of "PG" movies. Here, the cosine similarity is calculated as 1 - (embedding <=> '[user_prompt_embedding]') because the <=> operator returns the cosine distance. Results are ordered by the boosted score in descending order.
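
The relationship between cosine distance and cosine similarity used in that query can be checked with a few lines of plain Python. This is a sketch only; the helper names and the tiny two-dimensional vectors are made up for illustration and stand in for real embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance, i.e. 1 minus the cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def similarity_from_distance(distance: float) -> float:
    """Convert a cosine distance back to a cosine similarity."""
    return 1.0 - distance


query = [1.0, 0.0]   # hypothetical query embedding
movie = [1.0, 1.0]   # hypothetical movie embedding
distance = cosine_distance(query, movie)
similarity = similarity_from_distance(distance)
print(round(distance, 4), round(similarity, 4))
```

Once the value is a similarity, the 1.10 multiplier and a descending sort apply. Working directly on the distance would instead call for the 0.90 multiplier and an ascending sort.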

Note: This approach works for small-to-medium datasets. For large datasets, you may need to filter first (e.g., top 100 by raw similarity) and then apply boosting in a subquery for performance reasons.
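
The filter-first idea from the note can be sketched in Python as follows. The rows, the function name `top_k_with_boost`, and the `candidates` parameter are hypothetical; in a real deployment the shortlist would come from the database (for example an inner query or an index scan) rather than from an in-memory sort.

```python
def top_k_with_boost(rows, k=10, candidates=100,
                     preferred_rating="PG", factor=1.10):
    """Shortlist by raw similarity, then boost and re-sort only the shortlist.

    `rows` is a list of (title, rating, similarity) tuples.
    """
    # Step 1: keep the best `candidates` rows by raw similarity.
    shortlist = sorted(rows, key=lambda r: r[2], reverse=True)[:candidates]
    # Step 2: apply the conditional boost to the shortlist only.
    boosted = [
        (title, rating, sim * factor if rating == preferred_rating else sim)
        for title, rating, sim in shortlist
    ]
    # Step 3: re-sort by boosted similarity and keep the final top k.
    boosted.sort(key=lambda r: r[2], reverse=True)
    return [title for title, _, _ in boosted[:k]]


rows = [("A", "R", 0.90), ("B", "R", 0.88), ("C", "PG", 0.85)]
wide = top_k_with_boost(rows, k=2, candidates=3)
narrow = top_k_with_boost(rows, k=2, candidates=2)
print(wide, narrow)
```

With three candidates the boosted "C" reaches the top of the list, but with only two candidates it never enters the shortlist. The shortlist therefore needs to be comfortably larger than the final limit so that boosted items are not cut off before the boost is applied.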

Presto / Starburst

Presto/Trino support vector search via plugins or with SQL extensions. Boosting can be done in a similar manner using a CASE statement in the SELECT and ORDER BY clauses.

SELECT  id,  title,  rating,
cosine_similarity(embedding, '[user_prompt_embedding]') AS similarity,
CASE
WHEN rating = 'PG' THEN cosine_similarity(embedding, '[user_prompt_embedding]') * 1.10
ELSE cosine_similarity(embedding, '[user_prompt_embedding]')
END AS boosted_similarity
FROM movies
ORDER BY boosted_similarity DESC
LIMIT 10;

Replace cosine_similarity with the actual function provided by your Presto/Trino vector extension.

StarTree

StarTree is built on Apache Pinot and supports vector search natively. As of July 2025, Pinot/StarTree allows vector search queries and filtering by metadata, but does not natively support conditional boosting within the query (such as multiplying the score for "PG13" movies by 1.10). Any boosting must be done after the retrieval.

Summary

For SQL-based databases (PostgreSQL, SingleStore, Presto, Oracle, SQL Server, DB2, CockroachDB, Couchbase), you can use a CASE statement in your SELECT and ORDER BY clauses to apply boosting based on metadata. For most native vector databases and graph databases, boosting must be handled after retrieval in your application code, as their query languages do not support conditional logic within the search query. MongoDB query language (MQL) databases provide aggregation pipeline techniques to boost the score.