Semantic Router can be used with hundreds, thousands, or even more routes. At very large scales it can be useful to use a vector database to store and search through your route vector space. Although we do not demonstrate very large scale in this notebook, we will use more routes than usual and see how to use the PineconeIndex for potential scalability and route persistence beyond our local machines.
!pip install -qU \
"semantic-router[local, pinecone]==0.0.22" \
datasets==2.17.0
from datasets import load_dataset
data = load_dataset("aurelio-ai/generic-routes", split="train")
data
Dataset({ features: ['name', 'utterances', 'description', 'function_schema', 'llm', 'score_threshold'], num_rows: 50 })
Each row in this dataset is a single route:
data[0]
{'name': 'politics', 'utterances': ["isn't politics the best thing ever", "why don't you tell me about your political opinions", "don't you just love the presidentdon't you just hate the president", "they're going to destroy this country!", 'they will save the country!'], 'description': None, 'function_schema': None, 'llm': None, 'score_threshold': 0.82}
We transform these into Route objects like so:
from semantic_router import Route
routes = [Route(**data[i]) for i in range(len(data))]
routes[0]
Route(name='politics', utterances=["isn't politics the best thing ever", "why don't you tell me about your political opinions", "don't you just love the presidentdon't you just hate the president", "they're going to destroy this country!", 'they will save the country!'], description=None, function_schema=None, llm=None, score_threshold=0.82)
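To build some intuition for what the utterances and score_threshold fields are for, here is a toy sketch of threshold-gated similarity routing. It is not the library's implementation: the bag-of-words `embed` function, the `toy_route` helper, and the 0.5 threshold are all assumptions made for illustration, whereas a real encoder produces dense vectors and the route layer handles scoring internally.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words "embedding" (assumption); a real encoder returns dense vectors.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_route(query, routes, threshold=0.5):
    """Return the route name of the most similar utterance, or None below threshold."""
    query_vec = embed(query)
    best_name, best_score = None, 0.0
    for name, utterances in routes.items():
        for utterance in utterances:
            score = cosine(query_vec, embed(utterance))
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Two utterances borrowed from the dataset shown in this notebook.
toy_routes = {
    "politics": ["they will save the country!"],
    "chitchat": ["lovely weather today"],
}

match = toy_route("lovely weather today", toy_routes)
no_match = toy_route("what is the capital of France", toy_routes)
```

A query that shares too little with every utterance falls below the threshold and gets no route, which is the role a per-route score_threshold plays in the rows above.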
Next we initialize an encoder. We will use a simple HuggingFaceEncoder, but we could also use popular encoder APIs like CohereEncoder and OpenAIEncoder.
from semantic_router.encoders import HuggingFaceEncoder
encoder = HuggingFaceEncoder()
Now we initialize our PineconeIndex. All it requires is a Pinecone API key (you do need to be using Pinecone Serverless).
import os
from getpass import getpass
from semantic_router.index.pinecone import PineconeIndex
os.environ["PINECONE_API_KEY"] = os.environ.get("PINECONE_API_KEY") or getpass(
"Enter Pinecone API key: "
)
index = PineconeIndex(index_name="index", namespace="namespace")
2024-05-07 21:33:48 WARNING semantic_router.utils.logger Index could not be initialized.
from semantic_router import RouteLayer
rl = RouteLayer(encoder=encoder, routes=routes, index=index)
2024-05-07 21:33:48 INFO semantic_router.utils.logger local
We now run the route layer, which contains only static routes:
rl("how's the weather today?").name
2024-05-07 21:33:57 WARNING semantic_router.utils.logger No classification found for semantic classifier. 2024-05-07 21:33:57 ERROR semantic_router.utils.logger No route found with name . Check to see if any Routes have been defined.
If you see a warning about no classification being found, wait a moment and run the above cell again.
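Rather than rerunning the cell by hand, the wait-and-retry step could be wrapped in a small helper. This is a minimal sketch: `route_with_retry` is a hypothetical function, not part of semantic-router, and it assumes only that the layer is callable and returns an object whose `name` attribute is empty when no route matched.

```python
import time


def route_with_retry(layer, query, attempts=3, delay=2.0):
    """Call layer(query) until it returns a named route or attempts run out.

    Hypothetical helper: `layer` is any callable returning an object with a
    `name` attribute (assumed to be None or empty when nothing matched).
    """
    choice = None
    for attempt in range(attempts):
        choice = layer(query)
        if getattr(choice, "name", None):
            break  # a route was found, stop retrying
        if attempt < attempts - 1:
            time.sleep(delay)  # give the index a moment before trying again
    return choice
```

In this notebook it might be called as `route_with_retry(rl, "how's the weather today?").name`. A query that matches no route at all will still come back unnamed after the final attempt.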
Because we're using Pinecone, our route index can now persist and be accessed from different locations by simply connecting to the pre-existing index. By default this index uses the identifier "semantic-router--index". This is the index we'll be loading here, but we can change the name via the index_name parameter if preferred.
First, let's delete our old route layer, index, and routes.
del rl, index, routes
Let's load our index first. As mentioned, "index" is the default index name, and we pass "namespace" as the namespace name for the Pinecone object.
index = PineconeIndex(index_name="index", namespace="namespace")
We load the pre-existing routes from this index like so:
index.get_routes()
[('fitness_tips', 'suggest a workout routine'), ('career_advice', 'what are the emerging career fields?'), ('mental_health_support', 'how can I manage stress?'), ('gardening_and_horticulture', 'suggest some easy-care indoor plants'), ('fitness_tips', 'give me a fitness tip'), ('debugging_tips', 'tips for debugging asynchronous code'), ('historical_events', 'share an interesting piece of medieval history'), ('chitchat', 'the weather is horrendous'), ('art_and_culture', 'suggest some must-visit museums'), ('daily_inspiration', 'I need some inspiration for today'), ('language_learning', 'suggest ways to learn a new language'), ('creative_writing_and_literature', 'how can I improve my writing skills?'), ('cloud_computing', 'AWS vs Azure vs Google Cloud'), ('language_learning', 'how can I improve my Spanish?'), ('career_advice_in_tech', 'navigating career growth in tech'), ('creative_writing_and_literature', 'what are some tips for storytelling?'), ('data_structures_and_algorithms', 'algorithms every developer should know'), ('historical_events', 'tell me about a significant historical event'), ('gaming_and_esports', 'suggest a good game for beginners'), ('art_and_culture', "what's an interesting cultural tradition?"), ('environmental_awareness', 'tell me about sustainability practices'), ('food_and_recipes', "what's your favorite food?"), ('jokes', 'tell me a joke'), ('frameworks_and_libraries', 'best Python libraries for data analysis'), ('cloud_computing', 'best practices for cloud security'), ('development_tools', 'using Docker in development'), ('frameworks_and_libraries', "what's the difference between React and Angular?"), ('gardening_and_horticulture', 'how do I start a vegetable garden?'), ('language_syntax', 'what are the new features in Java 15?'), ('coding_standards_and_conventions', 'JavaScript coding conventions'), ('machine_learning_in_development', 'how to start with machine learning in Python'), ('frameworks_and_libraries', 'introduction to Django for 
web development'), ('jokes', 'make me laugh'), ('historical_events', 'who was a notable figure in ancient history?'), ('daily_inspiration', 'give me an inspirational quote'), ('compliments', 'I need some positive vibes'), ('jokes', 'know any good jokes?'), ('ethical_considerations_in_tech', 'the role of ethics in artificial intelligence'), ('gaming_and_esports', 'tell me about upcoming esports events'), ('food_and_recipes', 'suggest a recipe for dinner'), ('best_practices', 'how to write clean code in Python'), ('data_structures_and_algorithms', 'complexity analysis of algorithms'), ('interview_preparation', 'common programming interview questions'), ('book_recommendations', "what's your favorite book?"), ('gaming_and_esports', 'what are the popular games right now?'), ('debugging_tips', 'how do I debug segmentation faults in C++?'), ('creative_writing_and_literature', 'suggest some classic literature'), ('art_and_culture', 'tell me about your favorite artist'), ('career_advice', 'suggest some career development tips'), ('language_learning', 'what are some effective language learning techniques?'), ('astronomy_and_space_exploration', 'what are some interesting facts about the universe?'), ('educational_facts', 'tell me an interesting fact'), ('chitchat', "let's go to the chippy"), ('astronomy_and_space_exploration', 'tell me about the latest space mission'), ('book_recommendations', 'I need a book recommendation'), ('data_structures_and_algorithms', 'basic data structures for beginners'), ('interview_preparation', 'how to prepare for a coding interview'), ('career_advice_in_tech', 'tips for landing your first tech job'), ('compliments', 'say something nice about me'), ('cybersecurity_best_practices', 'securing your web applications'), ('chitchat', "how's the weather today?"), ('ethical_considerations_in_tech', 'ethical hacking and its importance'), ('mindfulness_and_wellness', 'tell me about mindfulness'), ('debugging_tips', 'best tools for JavaScript debugging'), 
('best_practices', 'what are the best practices for REST API design?'), ('hobbies_and_interests', 'suggest me a hobby'), ('cybersecurity_best_practices', 'common security vulnerabilities to avoid'), ('language_syntax', 'how do closures work in JavaScript?'), ('hobbies_and_interests', "I'm looking for a new pastime"), ('machine_learning_in_development', 'using TensorFlow for beginners'), ('mindfulness_and_wellness', 'how can I relax?'), ('gardening_and_horticulture', 'what are some tips for sustainable gardening?'), ('career_advice', 'how can I improve my resume?'), ('development_tools', 'best Git clients for macOS'), ('food_and_recipes', 'tell me about a dish from your country'), ('development_tools', 'recommendations for Python IDEs'), ('environmental_awareness', 'how can I be more eco-friendly?'), ('ethical_considerations_in_tech', 'privacy concerns in app development'), ('environmental_awareness', 'what are some ways to save the planet?'), ('machine_learning_in_development', 'machine learning model deployment best practices'), ('chitchat', 'lovely weather today'), ('language_syntax', 'explain the syntax of Python functions'), ('educational_facts', 'share a science fact'), ('best_practices', 'best practices for error handling in JavaScript'), ('cloud_computing', 'introduction to cloud storage options'), ('compliments', 'give me a compliment'), ('career_advice_in_tech', 'how to build a portfolio for software development'), ('book_recommendations', 'suggest a good book to read'), ('daily_inspiration', 'share something uplifting'), ('mental_health_support', 'share some self-care practices'), ('coding_standards_and_conventions', 'why coding standards matter'), ('chitchat', 'how are things going?'), ('cybersecurity_best_practices', 'introduction to ethical hacking for developers'), ('astronomy_and_space_exploration', 'how can I stargaze effectively?'), ('fitness_tips', 'how can I stay active at home?'), ('mental_health_support', 'what are ways to improve mental 
health?'), ('interview_preparation', 'tips for technical interviews'), ('hobbies_and_interests', 'what are your interests?'), ('coding_standards_and_conventions', 'maintaining consistency in codebase'), ('educational_facts', 'do you know any historical trivia?'), ('programming_challenges', 'programming tasks to improve problem-solving'), ('web_development_trends', 'the future of web development'), ('mobile_app_development', 'optimizing performance in mobile apps'), ('project_management_in_tech', 'agile vs waterfall project management'), ('motivation', 'I need some motivation'), ('version_control_systems', 'how to revert a commit in Git'), ('project_management_in_tech', 'tools for managing tech projects'), ('pet_care_advice', 'how can I train my dog?'), ('version_control_systems', 'best practices for branching in Git'), ('version_control_systems', 'introduction to SVN for beginners'), ('movie_suggestions', 'recommend a movie'), ('travel_stories', 'tell me about your favorite travel destination'), ('software_architecture', 'differences between MVC and MVVM'), ('movie_suggestions', "what's your favorite movie?"), ('pet_care_advice', 'suggest some tips for cat care'), ('web_development_trends', 'emerging back-end technologies'), ('movie_suggestions', 'suggest a good movie for tonight'), ('tech_trends', 'tell me about the latest gadgets'), ('philosophical_questions', 'do you believe in fate?'), ('philosophical_questions', 'what are your thoughts on free will?'), ('programming_challenges', 'where can I find algorithmic puzzles?'), ('software_architecture', 'explain microservices architecture'), ('politics', "don't you just love the presidentdon't you just hate the president"), ('motivation', 'give me a motivational quote'), ('personal_questions', 'what do you like to do for fun?'), ('open_source_contributions', 'finding projects that accept contributions'), ('science_and_innovation', 'how does AI impact our daily lives?'), ('tech_trends', "what's new in technology?"), 
('software_architecture', 'introduction to domain-driven design'), ('motivation', 'inspire me'), ('programming_challenges', 'suggest a coding challenge for beginners'), ('science_and_innovation', 'tell me about a recent innovation'), ('open_source_contributions', 'best practices for open-source contributors'), ('mindfulness_and_wellness', 'give me a wellness tip'), ('mobile_app_development', 'Kotlin vs Swift for mobile development'), ('personal_questions', "what's your favorite color?"), ('pet_care_advice', 'what should I know about keeping a pet rabbit?'), ('personal_questions', 'do you have any hobbies?'), ('music_discovery', 'suggest some new music'), ('politics', 'they will save the country!'), ('tech_trends', 'what are the emerging tech trends?'), ('travel_stories', "what's the most interesting place you've visited?"), ('project_management_in_tech', 'how to lead a development team'), ('music_discovery', 'who are the top artists right now?'), ('music_discovery', 'recommend songs for a workout playlist'), ('open_source_contributions', 'how to start contributing to open source'), ('philosophical_questions', 'what is the meaning of life?'), ('politics', "why don't you tell me about your political opinions"), ('science_and_innovation', 'what are the latest scientific discoveries?'), ('politics', "isn't politics the best thing ever"), ('politics', "they're going to destroy this country!"), ('travel_stories', 'share a travel story'), ('mobile_app_development', 'best tools for cross-platform mobile development'), ('web_development_trends', "what's new in front-end development?")]
We will transform these into a dictionary format that we can use to initialize our Route objects.
# group the (route, utterance) pairs by route name
routes_dict = {}
for route, utterance in index.get_routes():
    if route not in routes_dict:
        routes_dict[route] = []
    routes_dict[route].append(utterance)
routes_dict
{'book_recommendations': ['suggest a good book to read', 'I need a book recommendation', "what's your favorite book?"], 'chitchat': ['lovely weather today', "how's the weather today?", 'the weather is horrendous', "let's go to the chippy", 'how are things going?'], 'jokes': ['make me laugh', 'tell me a joke', 'know any good jokes?'], 'frameworks_and_libraries': ["what's the difference between React and Angular?", 'introduction to Django for web development', 'best Python libraries for data analysis'], 'language_learning': ['what are some effective language learning techniques?', 'suggest ways to learn a new language', 'how can I improve my Spanish?'], 'fitness_tips': ['how can I stay active at home?', 'suggest a workout routine', 'give me a fitness tip'], 'debugging_tips': ['tips for debugging asynchronous code', 'how do I debug segmentation faults in C++?', 'best tools for JavaScript debugging'], 'interview_preparation': ['how to prepare for a coding interview', 'tips for technical interviews', 'common programming interview questions'], 'cybersecurity_best_practices': ['securing your web applications', 'introduction to ethical hacking for developers', 'common security vulnerabilities to avoid'], 'astronomy_and_space_exploration': ['what are some interesting facts about the universe?', 'tell me about the latest space mission', 'how can I stargaze effectively?'], 'ethical_considerations_in_tech': ['the role of ethics in artificial intelligence', 'privacy concerns in app development', 'ethical hacking and its importance'], 'data_structures_and_algorithms': ['algorithms every developer should know', 'basic data structures for beginners', 'complexity analysis of algorithms'], 'educational_facts': ['tell me an interesting fact', 'share a science fact', 'do you know any historical trivia?'], 'career_advice_in_tech': ['how to build a portfolio for software development', 'navigating career growth in tech', 'tips for landing your first tech job'], 'food_and_recipes': ['tell 
me about a dish from your country', "what's your favorite food?", 'suggest a recipe for dinner'], 'daily_inspiration': ['give me an inspirational quote', 'I need some inspiration for today', 'share something uplifting'], 'development_tools': ['best Git clients for macOS', 'using Docker in development', 'recommendations for Python IDEs'], 'best_practices': ['best practices for error handling in JavaScript', 'what are the best practices for REST API design?', 'how to write clean code in Python'], 'cloud_computing': ['best practices for cloud security', 'introduction to cloud storage options', 'AWS vs Azure vs Google Cloud'], 'career_advice': ['suggest some career development tips', 'how can I improve my resume?', 'what are the emerging career fields?'], 'compliments': ['say something nice about me', 'I need some positive vibes', 'give me a compliment'], 'coding_standards_and_conventions': ['maintaining consistency in codebase', 'JavaScript coding conventions', 'why coding standards matter'], 'art_and_culture': ['suggest some must-visit museums', "what's an interesting cultural tradition?", 'tell me about your favorite artist'], 'hobbies_and_interests': ['what are your interests?', 'suggest me a hobby', "I'm looking for a new pastime"], 'machine_learning_in_development': ['using TensorFlow for beginners', 'machine learning model deployment best practices', 'how to start with machine learning in Python'], 'mindfulness_and_wellness': ['tell me about mindfulness', 'how can I relax?', 'give me a wellness tip'], 'gaming_and_esports': ['tell me about upcoming esports events', 'what are the popular games right now?', 'suggest a good game for beginners'], 'historical_events': ['tell me about a significant historical event', 'who was a notable figure in ancient history?', 'share an interesting piece of medieval history'], 'mental_health_support': ['share some self-care practices', 'how can I manage stress?', 'what are ways to improve mental health?'], 'language_syntax': ['what 
are the new features in Java 15?', 'how do closures work in JavaScript?', 'explain the syntax of Python functions'], 'environmental_awareness': ['tell me about sustainability practices', 'how can I be more eco-friendly?', 'what are some ways to save the planet?'], 'gardening_and_horticulture': ['suggest some easy-care indoor plants', 'how do I start a vegetable garden?', 'what are some tips for sustainable gardening?'], 'creative_writing_and_literature': ['how can I improve my writing skills?', 'what are some tips for storytelling?', 'suggest some classic literature'], 'web_development_trends': ['the future of web development', 'emerging back-end technologies', "what's new in front-end development?"], 'travel_stories': ['share a travel story', 'tell me about your favorite travel destination', "what's the most interesting place you've visited?"], 'version_control_systems': ['best practices for branching in Git', 'how to revert a commit in Git', 'introduction to SVN for beginners'], 'project_management_in_tech': ['how to lead a development team', 'agile vs waterfall project management', 'tools for managing tech projects'], 'philosophical_questions': ['what is the meaning of life?', 'do you believe in fate?', 'what are your thoughts on free will?'], 'software_architecture': ['explain microservices architecture', 'differences between MVC and MVVM', 'introduction to domain-driven design'], 'politics': ["why don't you tell me about your political opinions", "isn't politics the best thing ever", 'they will save the country!', "don't you just love the presidentdon't you just hate the president", "they're going to destroy this country!"], 'music_discovery': ['suggest some new music', 'who are the top artists right now?', 'recommend songs for a workout playlist'], 'personal_questions': ['do you have any hobbies?', "what's your favorite color?", 'what do you like to do for fun?'], 'motivation': ['inspire me', 'give me a motivational quote', 'I need some motivation'], 
'mobile_app_development': ['optimizing performance in mobile apps', 'Kotlin vs Swift for mobile development', 'best tools for cross-platform mobile development'], 'movie_suggestions': ['suggest a good movie for tonight', 'recommend a movie', "what's your favorite movie?"], 'open_source_contributions': ['best practices for open-source contributors', 'how to start contributing to open source', 'finding projects that accept contributions'], 'science_and_innovation': ['how does AI impact our daily lives?', 'tell me about a recent innovation', 'what are the latest scientific discoveries?'], 'tech_trends': ["what's new in technology?", 'tell me about the latest gadgets', 'what are the emerging tech trends?'], 'programming_challenges': ['suggest a coding challenge for beginners', 'where can I find algorithmic puzzles?', 'programming tasks to improve problem-solving'], 'pet_care_advice': ['what should I know about keeping a pet rabbit?', 'suggest some tips for cat care', 'how can I train my dog?']}
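The same grouping can also be written with `collections.defaultdict`, which removes the explicit membership check. This is a sketch only: `group_utterances` is a hypothetical helper name, and the sample pairs are a handful taken from the output above instead of a live `index.get_routes()` call.

```python
from collections import defaultdict


def group_utterances(pairs):
    """Group (route_name, utterance) pairs into {route_name: [utterances]}."""
    grouped = defaultdict(list)
    for route_name, utterance in pairs:
        grouped[route_name].append(utterance)
    return dict(grouped)  # plain dict, so missing keys raise KeyError again


# A few pairs copied from the index output shown in this notebook.
sample_pairs = [
    ("jokes", "tell me a joke"),
    ("chitchat", "lovely weather today"),
    ("jokes", "make me laugh"),
]
grouped = group_utterances(sample_pairs)
```

Utterances keep the order in which the pairs arrive, so the result depends on the order returned by the index.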
Now we transform these into a list of Route objects.
routes = [
    Route(name=route, utterances=utterances)
    for route, utterances in routes_dict.items()
]
routes[0]
Route(name='book_recommendations', utterances=['suggest a good book to read', 'I need a book recommendation', "what's your favorite book?"], description=None, function_schema=None, llm=None, score_threshold=None)
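Note that the rebuilt route shows score_threshold=None, whereas the politics row in the original dataset carried 0.82. If the per-route thresholds matter, one option is to keep a separate name-to-threshold mapping and pass it back in when rebuilding. The sketch below builds plain keyword dictionaries so that it runs without the library. `build_route_kwargs` and the sample data are hypothetical, and whether the index itself can store thresholds is not something this notebook shows.

```python
def build_route_kwargs(routes_dict, thresholds):
    """Build keyword arguments for each route, restoring known thresholds."""
    return [
        {
            "name": name,
            "utterances": utterances,
            # None when no threshold was recorded for this route
            "score_threshold": thresholds.get(name),
        }
        for name, utterances in routes_dict.items()
    ]


# Threshold taken from the dataset row shown earlier in this notebook.
thresholds = {"politics": 0.82}
sample_routes = {
    "politics": ["they will save the country!"],
    "jokes": ["tell me a joke"],
}
route_kwargs = build_route_kwargs(sample_routes, thresholds)
# With semantic_router installed: routes = [Route(**kw) for kw in route_kwargs]
```

In this notebook the mapping could be derived from the dataset, for example `{row["name"]: row["score_threshold"] for row in data}`.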
Now we reinitialize our RouteLayer:
from semantic_router import RouteLayer
rl = RouteLayer(encoder=encoder, routes=routes, index=index)
2024-05-07 21:34:08 INFO semantic_router.utils.logger local
And test it again:
rl("say something to make me laugh").name
'jokes'
rl("tell me something amusing").name
'jokes'
rl("it's raining cats and dogs today").name
'chitchat'
# delete index
index.delete_index()
Perfect, our routes loaded from our PineconeIndex are working as expected! As mentioned, we can use the PineconeIndex for persistence and high-scale use cases, for example where we might have hundreds of thousands of utterances, or even millions.