vector_store
VectorStore(config)
Bases: ABC
Abstract base class for a vector store.
Source code in langroid/vector_store/base.py
clear_empty_collections()
abstractmethod
Clear all empty collections in the vector store. Returns the number of collections deleted.
clear_all_collections(really=False, prefix='')
abstractmethod
Clear all collections in the vector store.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
really | bool | Whether to really clear all collections. Defaults to False. | False |
prefix | str | Prefix of collections to clear. | '' |
Returns:
Type | Description |
---|---|
int | Number of collections deleted. |
Source code in langroid/vector_store/base.py
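The really and prefix parameters can be illustrated with a toy in-memory store. This is a minimal sketch, not Langroid code: the ToyStore class is hypothetical, and it assumes that a call without really=True deletes nothing and returns 0.

```python
from typing import Dict, List


class ToyStore:
    """Hypothetical in-memory stand-in for a vector store (not Langroid code)."""

    def __init__(self) -> None:
        # Maps collection name -> list of stored items.
        self.collections: Dict[str, List[str]] = {}

    def clear_all_collections(self, really: bool = False, prefix: str = "") -> int:
        # Assumption: without really=True nothing is deleted and 0 is returned.
        if not really:
            return 0
        # Only collections whose name starts with the prefix are removed.
        doomed = [name for name in self.collections if name.startswith(prefix)]
        for name in doomed:
            del self.collections[name]
        return len(doomed)


store = ToyStore()
store.collections = {"test-a": ["x"], "test-b": [], "prod": ["y"]}
dry_run = store.clear_all_collections(prefix="test-")
deleted = store.clear_all_collections(really=True, prefix="test-")
remaining = sorted(store.collections)
```

An empty prefix matches every collection name, so the default prefix='' together with really=True would clear everything in this sketch.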
list_collections(empty=False)
abstractmethod
List all collections in the vector store (only non-empty collections if empty=False).
set_collection(collection_name, replace=False)
Set the current collection to the given collection name.
Args:
collection_name (str): Name of the collection.
replace (bool, optional): Whether to replace the collection if it already exists. Defaults to False.
Source code in langroid/vector_store/base.py
create_collection(collection_name, replace=False)
abstractmethod
Create a collection with the given name.
Args:
collection_name (str): Name of the collection.
replace (bool, optional): Whether to replace the collection if it already exists. Defaults to False.
Source code in langroid/vector_store/base.py
compute_from_docs(docs, calc)
Compute a result on a set of documents, using a dataframe calc string like df.groupby('state')['income'].mean().
Source code in langroid/vector_store/base.py
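To make the calc string concrete, the sketch below evaluates the example expression against a small pandas dataframe. It is an illustration under assumptions, not the library's implementation: the records are made up, and it assumes the string is evaluated with the name df bound to a dataframe built from the documents.

```python
import pandas as pd

# Hypothetical records standing in for the fields of a set of documents.
records = [
    {"state": "CA", "income": 100.0},
    {"state": "CA", "income": 200.0},
    {"state": "NY", "income": 50.0},
]
df = pd.DataFrame(records)

calc = "df.groupby('state')['income'].mean()"

# Assumption: the calc string refers to the dataframe as `df`.
# eval() on untrusted input is unsafe; a real implementation would need
# to sanitize the expression first.
result = eval(calc, {"__builtins__": {}}, {"df": df})
```

Here result is a pandas Series indexed by state, holding the mean income per state.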
maybe_add_ids(documents)
Add ids to metadata if absent, since some vecdbs don't like having blank ids.
Source code in langroid/vector_store/base.py
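The idea behind maybe_add_ids can be sketched as follows. The Doc and Meta classes are hypothetical stand-ins for the library's document types, and generating ids with uuid4 is an assumption rather than the documented behavior.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Meta:
    id: str = ""  # blank means no id has been assigned yet


@dataclass
class Doc:
    content: str
    metadata: Meta = field(default_factory=Meta)


def maybe_add_ids(documents: List[Doc]) -> None:
    """Fill in a fresh id wherever the metadata id is blank."""
    for doc in documents:
        if not doc.metadata.id:
            # Assumption: any unique string is an acceptable id.
            doc.metadata.id = str(uuid.uuid4())


docs = [Doc("first"), Doc("second", Meta(id="keep-me"))]
maybe_add_ids(docs)
```

Existing ids are left untouched; only blank ones are filled in.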
similar_texts_with_scores(text, k=1, where=None)
abstractmethod
Find the k texts most similar to the given text, according to a vector similarity metric (e.g., cosine similarity).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The text to find similar texts for. | required |
k | int | Number of similar texts to retrieve. Defaults to 1. | 1 |
where | Optional[str] | Where clause to filter the search. | None |
Returns:
Type | Description |
---|---|
List[Tuple[Document, float]] | List of (Document, score) tuples. |
Source code in langroid/vector_store/base.py
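A top-k search by cosine similarity might look like the sketch below. It is a toy under stated assumptions: embed is a hypothetical word-count stand-in for a real embedding model, the corpus is passed in explicitly rather than read from a collection, and plain strings are returned instead of Document objects.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Hypothetical stand-in for a real embedding model: word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similar_texts_with_scores(
    text: str, corpus: List[str], k: int = 1
) -> List[Tuple[str, float]]:
    query = embed(text)
    scored = [(doc, cosine(query, embed(doc))) for doc in corpus]
    # Highest similarity first, then keep the k best matches.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


corpus = ["the cat sat", "dogs bark loudly", "a cat and a dog"]
top = similar_texts_with_scores("cat sat", corpus, k=2)
```

A real vector store would typically delegate both the embedding and the nearest-neighbor search to the backend rather than scoring every text in Python.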
add_context_window(docs_scores, neighbors=0)
In each doc's metadata, there may be a window_ids field indicating the ids of the chunks around the current chunk. These window_ids may overlap, so we
- coalesce each overlapping group into a single window (maintaining ordering),
- create a new document for each part, preserving metadata.
We may have stored a longer set of window_ids than we need during chunking. Now, we just want neighbors on each side of the center of the window_ids list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
docs_scores | List[Tuple[Document, float]] | List of (document, match score) pairs to add context windows to. | required |
neighbors | int | Number of neighbors on "each side" of match to retrieve. Defaults to 0. "Each side" here means before and after the match, in the original text. | 0 |
Returns:
Type | Description |
---|---|
List[Tuple[Document, float]] | List of (Document, score) tuples. |
Source code in langroid/vector_store/base.py
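The trimming step described above, keeping only neighbors ids on each side of the center of a stored window_ids list, can be sketched like this. The helper name trim_window is hypothetical, and the sketch assumes the matched chunk sits at the middle index of the stored list.

```python
from typing import List


def trim_window(window_ids: List[str], neighbors: int) -> List[str]:
    """Keep `neighbors` ids on each side of the center of window_ids."""
    if not window_ids:
        return []
    # Assumption: the matched chunk is at the middle index.
    center = len(window_ids) // 2
    start = max(0, center - neighbors)
    end = min(len(window_ids), center + neighbors + 1)
    return window_ids[start:end]


stored = ["c1", "c2", "c3", "c4", "c5"]  # c3 is the matched chunk
no_context = trim_window(stored, neighbors=0)
one_each_side = trim_window(stored, neighbors=1)
```

With neighbors=0 only the matched chunk remains, which corresponds to the default of adding no surrounding context.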
remove_overlaps(windows)
staticmethod
Given a collection of windows, where each window is a sequence of ids, identify groups of overlapping windows, and for each overlapping group, order the chunk-ids using topological sort so they appear in the original order in the text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
windows | List[int | str] | List of windows, where each window is a sequence of ids. | required |
Returns:
Type | Description |
---|---|
List[List[str]] | List of windows, where each window is a sequence of ids, and no two windows overlap. |
Source code in langroid/vector_store/base.py
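One way to realize the grouping and ordering described above is sketched below. It is an illustrative reimplementation, not the library's code: it assumes string ids, treats "a appears immediately before b in some window" as the precedence relation, and uses graphlib.TopologicalSorter from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def remove_overlaps(windows: List[List[str]]) -> List[List[str]]:
    # Step 1: group windows that share at least one id (transitively).
    groups: List[List[List[str]]] = []
    for window in windows:
        ids = set(window)
        touching = [g for g in groups if any(ids & set(w) for w in g)]
        merged = [window]
        for g in touching:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)

    # Step 2: within each group, order the ids by topological sort, where
    # each id must come after the id that precedes it in any window.
    result: List[List[str]] = []
    for group in groups:
        preds: Dict[str, Set[str]] = {}
        for window in group:
            for i, node in enumerate(window):
                preds.setdefault(node, set())
                if i > 0:
                    preds[node].add(window[i - 1])
        result.append(list(TopologicalSorter(preds).static_order()))
    return result


windows = [["a", "b", "c"], ["x", "y"], ["b", "c", "d"]]
result = remove_overlaps(windows)
```

In this sketch the first and third windows share b and c, so they are coalesced into a single window ordered a, b, c, d, while the x, y window is left alone. The order of the output windows themselves is not guaranteed to match the input order.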
get_all_documents(where='')
abstractmethod
Get all documents in the current collection, possibly filtered by where.
QdrantDB(config=QdrantDBConfig())
Bases: VectorStore
Source code in langroid/vector_store/qdrantdb.py
clear_all_collections(really=False, prefix='')
Clear all collections with the given prefix.
Source code in langroid/vector_store/qdrantdb.py
list_collections(empty=False)
Returns:
Type | Description |
---|---|
List[str] | List of collection names that have at least one vector. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
empty | bool | Whether to include empty collections. | False |
Source code in langroid/vector_store/qdrantdb.py
create_collection(collection_name, replace=False)
Create a collection with the given name, optionally replacing an existing collection if replace is True.
Args:
collection_name (str): Name of the collection to create.
replace (bool): Whether to replace an existing collection with the same name. Defaults to False.