Hybrid indexes combine sparse and dense encodings to produce more accurate results. The dense encoder lets us search based on semantic meaning, while the sparse encoder lets us search based on exact term matches. Merging the two into a single hybrid retrieval step lets us push performance beyond what dense-only or sparse-only retrieval can achieve.
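As a rough mental model (not necessarily semantic-router's exact internals), hybrid retrieval can be thought of as a weighted fusion of the two similarity scores; the alpha weight below is purely illustrative:
import numpy as np

def hybrid_score(dense_sim: np.ndarray, sparse_sim: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    # alpha weights the dense (semantic) similarity and (1 - alpha) the sparse
    # (term-matching) similarity; alpha=1.0 is dense-only, alpha=0.0 is sparse-only.
    return alpha * dense_sim + (1 - alpha) * sparse_sim

# A query that matches a route strongly on terms but weakly on semantics
# still scores well overall.
print(hybrid_score(np.array([0.3]), np.array([0.9]), alpha=0.5))  # [0.6]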
We start by installing semantic-router. Support for the new AurelioSparseEncoder was added in semantic-router==0.1.0.
!pip install -qU "semantic-router[pinecone]==0.1.0"
Next, we define our Route objects, each containing a set of example utterances that should trigger that route.
from semantic_router import Route
politics = Route(
    name="politics",
    utterances=[
        "isn't politics the best thing ever",
        "why don't you tell me about your political opinions",
        "don't you just love the president",
        "don't you just hate the president",
        "they're going to destroy this country!",
        "they will save the country!",
    ],
)
Let's define another for good measure:
chitchat = Route(
    name="chitchat",
    utterances=[
        "how's the weather today?",
        "how are things going?",
        "lovely weather today",
        "the weather is horrendous",
        "let's go to the chippy",
    ],
)
routes = [politics, chitchat]
Now we initialize our embedding models. We are going to use a hybrid index, which requires both a dense and a sparse encoder. For the sparse encoder we will use the pretrained bm25 model from the Aurelio Platform, and for the dense encoder we will use OpenAI's text-embedding-3-small.
To get an API key, head to the Aurelio Platform.
import os
from getpass import getpass
from semantic_router.encoders.aurelio import AurelioSparseEncoder
os.environ["AURELIO_API_KEY"] = os.getenv("AURELIO_API_KEY") or getpass(
"Enter Aurelio API Key: "
)
sparse_encoder = AurelioSparseEncoder(name="bm25")
Sparse encoders return dictionaries containing the indices and values of the non-zero elements in the sparse matrix.
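We can see this by calling the sparse encoder directly on a list of texts (semantic-router encoders are callable; note that the exact return type, a plain dict or a small sparse-embedding wrapper, may vary between versions):
# Encode a single example utterance with the BM25 sparse encoder.
sparse_embeddings = sparse_encoder(["how's the weather today?"])

# Inspect the first result: a sparse representation holding only the
# indices and values of the non-zero elements.
print(sparse_embeddings[0])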
from semantic_router.encoders import OpenAIEncoder
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass(
"Enter OpenAI API Key: "
)
encoder = OpenAIEncoder(name="text-embedding-3-small", score_threshold=0.3)
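As an optional sanity check (not required for the walkthrough), we can embed a text and confirm that the dense vector has 1536 dimensions, the value we will give our Pinecone index below:
# Dense-encode a single text and check the embedding dimensionality.
dense_embeddings = encoder(["how's the weather today?"])
print(len(dense_embeddings[0]))  # expected: 1536 for text-embedding-3-small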
We now have both our sparse and dense encoders. When using both sparse and dense encoders we need to initialize an index that supports hybrid retrieval, such as the HybridLocalIndex or PineconeIndex.
from semantic_router.index import PineconeIndex
os.environ["PINECONE_API_KEY"] = os.getenv("PINECONE_API_KEY") or getpass(
"Enter Pinecone API Key: "
)
index = PineconeIndex(
    index_name="hybrid-test",
    dimensions=1536,
    metric="dotproduct",
)
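Note that Pinecone only supports sparse-dense (hybrid) vectors with the dotproduct metric, which is why we set metric="dotproduct" above. Purely for illustration, and not what HybridRouter sends internally, a raw hybrid query against a Pinecone index carries both a dense and a sparse vector; the values below are dummies:
from pinecone import Pinecone

# Illustrative only: query the index directly with dummy vectors.
# Assumes the index created above exists under the name "hybrid-test".
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
pinecone_index = pc.Index("hybrid-test")

pinecone_index.query(
    vector=[0.01] * 1536,  # dummy dense vector (1536 dims)
    sparse_vector={"indices": [10, 45, 123], "values": [0.5, 0.3, 0.2]},  # dummy sparse vector
    top_k=3,
    include_metadata=True,
)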
Now we define the HybridRouter. When called, the router consumes text (a query) and outputs the category (Route) it belongs to. To initialize a HybridRouter we need an encoder, a sparse_encoder, our routes, and the hybrid index we just defined.
from semantic_router.routers import HybridRouter
router = HybridRouter(
    encoder=encoder,
    sparse_encoder=sparse_encoder,
    routes=routes,
    index=index,
)
Let's see if our local and remote instances are synchronized...
router.is_synced()
False
It seems like our router is not synchronized, meaning there are differences between the utterances in our local HybridRouter and the remote PineconeIndex. We can view the differences by calling get_utterance_diff():
router.get_utterance_diff()
['- chitchat: how are things going?', "- chitchat: how's the weather today?", "- chitchat: let's go to the chippy", '- chitchat: lovely weather today', '- chitchat: the weather is horrendous', "- politics: don't you just hate the president", "- politics: don't you just love the president", "- politics: isn't politics the best thing ever", '- politics: they will save the country!', "- politics: they're going to destroy this country!", "- politics: why don't you tell me about your political opinions"]
From this, we can see that every utterance is preceded by a -, meaning it is unique to the local HybridRouter. So it seems our PineconeIndex is missing all utterances. We can confirm this by calling router.index.get_utterances() to see all utterances in the remote PineconeIndex:
router.index.get_utterances()
[]
As expected, we have no utterances in the remote PineconeIndex. The reason for this is that when initializing our HybridRouter we did not specify an auto_sync parameter, so auto_sync defaulted to None. When auto_sync=None, no synchronization is performed during initialization. Let's try again with auto_sync="local", meaning we take what we have locally and overwrite the remote PineconeIndex with these local values.
router = HybridRouter(
    encoder=encoder,
    sparse_encoder=sparse_encoder,
    routes=routes,
    index=index,
    auto_sync="local",
)
Now let's check our sync state:
router.is_synced()
True
router.get_utterance_diff()
[' chitchat: how are things going?', " chitchat: how's the weather today?", " chitchat: let's go to the chippy", ' chitchat: lovely weather today', ' chitchat: the weather is horrendous', " politics: don't you just hate the president", " politics: don't you just love the president", " politics: isn't politics the best thing ever", ' politics: they will save the country!', " politics: they're going to destroy this country!", " politics: why don't you tell me about your political opinions"]
Every utterance in the diff is now prefixed by a blank space, meaning it exists in both the local HybridRouter and the remote PineconeIndex, so the two are fully synchronized. Let's test the router with a few queries:
router("it's raining cats and dogs today")
RouteChoice(name=None, function_call=None, similarity_score=None)
router("I'm interested in learning about llama 2")
RouteChoice(name=None, function_call=None, similarity_score=None)
In this case, we return None because no matches were identified. We always recommend optimizing your HybridRouter for optimal performance; you can see how in this notebook.
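As a hedged sketch of what that optimization typically looks like in semantic-router (the fit and evaluate method names and signatures may differ between versions, and the labelled test data below is hypothetical):
# Hypothetical labelled test data: (query, expected route name or None).
test_data = [
    ("don't you love politics?", "politics"),
    ("how's the weather today?", "chitchat"),
    ("I'm interested in learning about llama 2", None),
]
X, y = zip(*test_data)

# fit() searches for route score thresholds that maximise accuracy on the
# labelled data; evaluate() reports the resulting accuracy. Check the docs
# for your installed version before relying on these exact methods.
router.fit(X=list(X), y=list(y))
print(router.evaluate(X=list(X), y=list(y)))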