Qdrant
Querying Overview
Qdrant’s native query interface is programmatic. A typical query first identifies the index that holds the data of interest. It then supplies a vector of values indicating a point in n-space from which to locate similar data. It is also recommended that the query set the maximum number of items to retrieve.
One way to specify a Qdrant query in Qarbine is to use a JSON-like structure such as the following.
{
limit: 2,
vector: [...],
filter: { …}
}
The number of values in the vector argument must match the number of dimensions of the vectors stored in the index. Qarbine only performs data retrievals. There is purposely no delete functionality provided.
The general form of the syntax used by Qarbine is shown below.
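As a minimal sketch of the vector length rule above, the Python below assembles a specification like the one shown and rejects a vector whose length differs from what the index expects. The helper name `build_query_spec` and its `expected_dim` argument are assumptions made for illustration; they are not part of Qarbine or Qdrant.

```python
import json

def build_query_spec(vector, expected_dim, limit=10, payload_filter=None):
    """Assemble a query specification dict after checking the vector length."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"vector has {len(vector)} values but the index expects {expected_dim}"
        )
    spec = {"limit": limit, "vector": list(vector)}
    if payload_filter:
        # Only include a filter when one was supplied.
        spec["filter"] = payload_filter
    return spec

# Hypothetical 4-dimensional index, used for illustration only.
spec = build_query_spec([0.2, 0.1, 0.9, 0.7], expected_dim=4, limit=2)
print(json.dumps(spec))
```

Checking the length before sending the query gives a clearer error than a rejected request would.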
{
index : "test_collection", ← This is required.
nearText: "dracula", ← Set this or the vector field below.
useVector: "vectorName", ← If multiple vectors are on the collection.
useAssistant: "myAssistant", ← The Qarbine AI Assistant alias.
returnNoValues: true, ← The default is true.
query: { ← This maps to a standard Qdrant query object.
limit: 20, ← At most how many matches to return.
No more than 100 will be returned.
vector: [...], ← The optional list of n-space numbers.
with_payload: true, ← The default is true.
This value can also be a list of fields such as
["city", "village", "town"]
filter: {...}, ← The optional Qdrant filter.
with_vector: false, ← Indicates if vector values are returned.
The default is false.
…
}
}
The query field generally maps to the Qdrant query specification. For more information on querying see https://qdrant.tech/documentation/concepts/search/. The optional Qdrant filtering is described at https://qdrant.tech/documentation/concepts/filtering/.
One of the outer nearText or nearVector fields, or the inner vector field, must be defined. If a nearText phrase is set, then the configured Qarbine AI Assistant service is used to obtain the vector argument for the effective Qdrant query. This service is set up by the Qarbine administrator.
If the Qdrant collection was created with multiple vectors, the name of the vector to use for searching should be provided via the ‘useVector’ argument. If the collection was created with sparse vectors, the name of the sparse vector to use for searching should be provided.
The returnNoValues field is a boolean that controls whether the data’s “values” field is returned. The values are usually not useful for Qarbine analytic and presentation purposes and just add bulk to the reply. The default is true, which means they are not returned.
By default the Qdrant answer set is ordered by score. If you want a different order, use a Qarbine pragma such as “#pragma runPostQuery select * from data order by XXXX”. For example, you may want the 5 closest movies and then sort them by name.
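The effect of such a post-query sort can be pictured with a short Python sketch: take the hits in score order, keep the closest few, then reorder those by name. The hit records below are made-up illustration data and the function name is hypothetical; in Qarbine the pragma performs the reordering.

```python
def closest_then_by_name(hits, top_n=5):
    """Keep the top_n highest-scoring hits, then sort those by name."""
    closest = sorted(hits, key=lambda h: h["score"], reverse=True)[:top_n]
    return sorted(closest, key=lambda h: h["name"])

# Made-up sample hits for illustration only.
hits = [
    {"name": "Nosferatu", "score": 0.91},
    {"name": "Alien", "score": 0.75},
    {"name": "Dracula", "score": 0.97},
    {"name": "Blade", "score": 0.60},
]
ordered = closest_then_by_name(hits, top_n=3)
print([h["name"] for h in ordered])
```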
Prerequisites
Prior to using Qarbine’s embeddings(...) macro function or the SQL-like query function nearText(...), the Qarbine Administrator must first configure “AI Assistant(s)”. The AI Assistants provide access to various popular Generative AI services and are referenced using an alias. Check with your Qarbine administrator for which ones are available and their proper use. For example, when using dynamic query vector embeddings, the model used by the AI Assistant must be compatible with the one used to generate the original embedding values in the database.
Qarbine SQL Interface
To broaden the use of the Qdrant database, Qarbine also provides a convenient SQL-oriented interface to retrieve Qdrant data. The WHERE criteria are translated into the equivalent Qdrant filter and sent to the Qdrant database. Qarbine allows Qdrant’s MUST, MUST NOT, and SHOULD concepts to be used within the SQL query as well. The Qdrant payload filtering limitations above still apply.
The simple format is
Qdrant allows specifying which fields to include and exclude. The ‘include’ approach maps to standard SQL SELECT column semantics. The exclude feature is a convenient way to return everything except certain columns. This is specified by prefixing the column with a ‘!’. Below are some SELECT variations.
select * … | select city, color … | select !city … |
The base WHERE clause can take several forms:
vector = (number1, number n …) ← A different way of expressing nearVector()
nearVector(number1, number n …)
nearNamedVector(useVector, number1, number n …)
nearText(aPhrase)
nearText(aPhrase, null, useVector)
nearText(aPhrase, aiAssistantAlias, useVector)
The parameters are described below.
Parameter | Description |
---|---|
aPhrase | A quoted value for which a vector is first obtained by Qarbine and then passed along as the raw vector value. |
useVector | The name of the vector to use for searching. Provide this when the Qdrant collection was created with multiple vectors. |
aiAssistantAlias | Refers to a Qarbine AI Assistant alias as configured by the Qarbine administrator. This is important to consider so that the model used to generate the raw vector value is compatible with that used to create the stored values. |
Function | Description |
---|---|
nearVector | This clause is removed from the WHERE criteria and its list of numbers is set into the “nearVector” field of the query specification. |
nearNamedVector | This clause is removed from the WHERE criteria. The first argument is the name of the vector. It is followed by a list of numbers, which is set into the “nearVector” field of the query specification. |
nearText | This clause is removed from the WHERE criteria and its argument is set into the “nearText” field of the query specification. The nearText argument can be used by query.nearText(), hybrid.nearText(), or generate.nearText(). Indicate which operation is wanted in the query specification. |
withOption | Pass in the specification field name and the value to set. This clause is removed from the WHERE clause. |
withOptions | Set several specification fields at once. The format is withOptions(key1, value1, keyN, valueN). The key argument may use dot notation when setting the inner value of a component object. |
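To illustrate what dot notation means for a withOptions key, here is a small Python sketch that sets alternating key and value arguments on a nested dictionary. The function below is a hypothetical stand-in written for this example; how Qarbine applies the options internally is not described here.

```python
def with_options(spec, *pairs):
    """Set key/value pairs on spec; dotted keys address nested objects."""
    if len(pairs) % 2 != 0:
        raise ValueError("expected alternating key, value arguments")
    for key, value in zip(pairs[0::2], pairs[1::2]):
        target = spec
        *parents, leaf = key.split(".")
        for name in parents:
            # Walk into (or create) each intermediate object.
            target = target.setdefault(name, {})
        target[leaf] = value
    return spec

spec = {"index": "test_collection", "query": {"limit": 20}}
with_options(spec, "query.with_vector", True, "query.limit", 5)
print(spec)
```

A key such as `query.with_vector` reaches the `with_vector` field inside the `query` object, while a key without dots sets a top-level field.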
The elements of the SQL statement are extracted and used as parameters to the standard native Qdrant interface as previously discussed. The returnNoValues argument is always true. The WHERE clause may have additional criteria starting with “AND”. The default ordering is by score with the highest one first.
The SQL query below
select * from test_collection
where vector = (0.2, 0.1, 0.9, 0.7)
and city != "Moscow"
limit 25
is equivalent to
{
"index": "test_collection",
"returnNoValues": true,
"query": {
"with_vector": false,
"with_payload": true,
"vector": [ 0.2, 0.1, 0.9, 0.7 ],
"filter": {
"must_not": [ { "key": "city", "match": { "value": "Moscow" } } ]
},
"limit": 25
}
}
Qarbine data retrieval tools have a default LIMIT which can be overridden.
The SQL query interface supports the equal, in, not equal, and “not in” concepts. The former pair uses the “must” list and the latter pair uses the “must_not” list. For information on Qdrant filtering see https://qdrant.tech/documentation/concepts/filtering/.
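The mapping just described can be sketched in Python as follows. This is only an illustration of the must and must_not split, not Qarbine's translator: the function name and the (field, operator, value) input shape are assumptions, and using Qdrant's “any” match for list criteria is also an assumption. Use EXPLAIN to see what Qarbine actually generates.

```python
def build_filter(conditions):
    """Map (field, op, value) triples to a Qdrant-style filter dict."""
    result = {}
    for field, op, value in conditions:
        if op in ("=", "!="):
            clause = {"key": field, "match": {"value": value}}
        elif op in ("in", "not in"):
            # Assumption: list criteria use Qdrant's "any" match.
            clause = {"key": field, "match": {"any": list(value)}}
        else:
            raise ValueError(f"unsupported operator: {op}")
        # Equal and in go to "must"; their negations go to "must_not".
        bucket = "must" if op in ("=", "in") else "must_not"
        result.setdefault(bucket, []).append(clause)
    return result

print(build_filter([("city", "!=", "Moscow")]))
```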
Here is an example with multiple AND clauses.
select * from test_collection2
where vector = (0.2, 0.1, 0.9, 0.7)
and city = 'London' and color = 'red'
Notice the SQL numeric list is enclosed in parentheses while the specification one is enclosed in brackets. That is a subtle nuance across the SQL and JSON syntax standards.
Here is an example combining equals and not equals.
select * from test_collection2
where vector = (0.2, 0.1, 0.9, 0.7)
and city = 'London' and color != 'red'
Here is an example using list and equal criteria.
select * from my_books
where nearText('invasion', 'myHuggingFace')
and year not in (2008)
and name = 'The Hunger Games'
You can use the score_threshold parameter of the search query. It excludes all results with a score worse than the given value. In SQL you must use “score >= number”. Below is an example.
select * from test_collection2
where vector = (0.2, 0.1, 0.9, 0.7)
and city = 'London' and color = 'red'
and score >= 0.7
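One way to picture the handling of the score criterion is to separate it from the payload criteria and carry it as the query's score_threshold. The Python sketch below does that for the example above. The function name and the tuple input shape are hypothetical, and placing the value in a `score_threshold` field of the query object is an assumption about the generated specification.

```python
def split_score_condition(conditions):
    """Separate a 'score >= n' condition from payload conditions."""
    threshold = None
    payload_conditions = []
    for field, op, value in conditions:
        if field == "score" and op == ">=":
            threshold = value
        else:
            payload_conditions.append((field, op, value))
    return threshold, payload_conditions

threshold, rest = split_score_condition([
    ("city", "=", "London"),
    ("color", "=", "red"),
    ("score", ">=", 0.7),
])
query = {"vector": [0.2, 0.1, 0.9, 0.7]}
if threshold is not None:
    # Assumed placement of the threshold in the query object.
    query["score_threshold"] = threshold
print(query, rest)
```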
Qarbine also supports Qdrant ‘should’ filtering. This is done by prefixing the value with a ‘~’. For example, this SQL
select * from test_collection2
where vector = (0.2, 0.1, 0.9, 0.7) and color = '~red'
generates filter criteria of
"filter": { "should": [ { "key": "color", "match": { "value": "red" } } ] },
Note that Qdrant does not support ‘should not’. So, “color != ~red” is invalid.
This SQL request
select * from test_collection2
where vector = (0.2, 0.1, 0.9, 0.7)
and color = '~red' and rating >= ~5 and rating <= 7
generates the native Qdrant filter
"filter": {
"must": [ { "key": "rating", "range": { "lte": 7 } } ],
"should": [
{ "key": "color", "match": {"value": "red" } },
{ "key": "rating", "range": {"gte": 5 } }
]
},
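The ‘~’ handling shown above can be sketched in Python. The code reads each raw SQL value token, notes whether it carries the ‘~’ prefix, and places the resulting clause in the “should” or “must” list. The function names and the token-based input are assumptions made for this example, not Qarbine's parser.

```python
RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}

def parse_value(token):
    """Return (value, is_should) for a raw SQL value token."""
    quoted = token.startswith("'") and token.endswith("'")
    text = token[1:-1] if quoted else token
    is_should = text.startswith("~")
    if is_should:
        text = text[1:]
    if quoted:
        return text, is_should
    return (float(text) if "." in text else int(text)), is_should

def build_filter_with_should(conditions):
    """Map (field, op, token) triples to must/should filter lists."""
    result = {}
    for field, op, token in conditions:
        value, is_should = parse_value(token)
        if op == "=":
            clause = {"key": field, "match": {"value": value}}
        elif op in RANGE_OPS:
            clause = {"key": field, "range": {RANGE_OPS[op]: value}}
        else:
            # Covers '!=' with '~' too, since there is no 'should not'.
            raise ValueError(f"unsupported operator: {op}")
        bucket = "should" if is_should else "must"
        result.setdefault(bucket, []).append(clause)
    return result

print(build_filter_with_should([
    ("color", "=", "'~red'"),
    ("rating", ">=", "~5"),
    ("rating", "<=", "7"),
]))
```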
The following SQL request demonstrates a ‘should’ filter for a list.
select * from my_books
where nearText( 'invasion', 'myHuggingFace' )
and year in ~[2008]
There is no ‘should not’ list support for ‘NOT IN’ operations. For SQL criteria of the form ‘field BETWEEN x AND y’, use ‘field >= x and field <= y’ instead. Remember that Qdrant only supports numeric range comparisons.
Reviewing the Generated Specification
You can enter criteria of the form “EXPLAIN SELECT ….” to have the SQL statement processed and have the returned answer set be the underlying query specification. When you want to see the effective Qdrant query specification, simply precede your SELECT statement with the “explain ” text. A convenient way of specifying this is to have “explain” on the first line and the rest of your SQL on the next lines.
explain
select * from my_books
where nearText( 'invasion', 'myHuggingFace' )
Then simply “comment out” the first line when not in use.
// explain
select * from my_books
where nearText( 'invasion', 'myHuggingFace' )
You can also use “explain: true” in the JSON query specification for similar information.
Another way to get the specification is to press ALT and click . Below is a sample result.
Any “explain SELECT” or “explain: true” takes precedence over the ALT-click interaction.
Qarbine Virtual Queries
There are a few convenience queries which are mainly DBA oriented. These queries are recognized by the Qarbine driver and provide common database information.
Query | Description |
---|---|
list databases | Return a list of databases. |
list indexes | Return a list of Qdrant collections. |
describe indexes | Provide details on all of the indexes. This may take a while depending on your database structure. |
describe index COLLECTION | Provide details on the given index. |
See the “DBA Productivity” section of the online documentation for more details.