
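Using NVIDIA's LLM API Catalog Connector¶

This notebook walks you through the basic usage of the NVIDIA connector.

With this connector, you can connect to compatible models available in NVIDIA's API Catalog and generate from them, for example:

Google's gemma-7b
Mistral AI's mistral-7b-instruct-v0.2
And more!

We'll begin by making sure llama-index and the related packages are installed.

Note: Currently, only models with a base URL of https://integrate.api.nvidia.com/v1 are compatible with this connector.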

In [ ]:

!pip install llama-index-llms-nvidia llama-index-embeddings-openai llama-index-readers-file


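API Keys and Boilerplate¶

In the next cell, we'll run some boilerplate so that the examples execute smoothly in a notebook environment.

We'll also provide our API keys.

Note: You can create your own NVIDIA API key using the "Get API Key" button in the code example window.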

In [ ]:

# llama-parse is async-first; running async code in a notebook requires nest_asyncio
import nest_asyncio

nest_asyncio.apply()

import os

# Use the OpenAI API for embeddings
os.environ["OPENAI_API_KEY"] = "sk-"

# Use an NVIDIA API Playground API key for the LLM
os.environ["NVIDIA_API_KEY"] = "nvapi-"
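Keys typed directly into a notebook are easy to leak. As a minimal sketch, assuming the keys are instead exported in the shell before the notebook starts, a small helper can fail early when a key is missing or still a bare placeholder. The name `require_key` and the prefix check are illustrative assumptions, based only on the `sk-` and `nvapi-` placeholders above.

```python
import os


def require_key(name: str, prefix: str) -> str:
    """Return the environment variable `name`, or raise if it is missing or malformed.

    Hypothetical helper: the prefix check mirrors the placeholders used in this notebook.
    """
    value = os.environ.get(name, "")
    # Reject unset variables, wrong prefixes, and the bare placeholder itself.
    if not value.startswith(prefix) or len(value) <= len(prefix):
        raise ValueError(f"{name} must be set to a key starting with {prefix!r}")
    return value
```

For example, `require_key("NVIDIA_API_KEY", "nvapi-")` raises while the variable still holds the bare placeholder `"nvapi-"` and returns the key once a full value is set.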

Loading the NVIDIA LLM¶

Now we can load our NVIDIA LLM by passing in the model name, which can be found in the documentation.

Note: The default model is mistralai/mistral-7b-instruct-v0.2.

In [ ]:

from llama_index.llms.nvidia import NVIDIA
from llama_index.core import VectorStoreIndex
from llama_index.core import Settings

llm = NVIDIA(model="mistralai/mistral-7b-instruct-v0.2")

Settings.llm = llm

We can see which model our llm object is currently associated with by checking the .model attribute.

In [ ]:

llm.model

Out[ ]:

'mistralai/mistral-7b-instruct-v0.2'

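Loading an API Catalog LLM¶

We can also load a model using its API Catalog address.

Let's use gemma-7b as an example!

Go to the model page
Find the address in the model parameter (e.g. "google/gemma-7b")
Verify that it has a base_url of "https://integrate.api.nvidia.com/v1"
Point the connector at the model with NVIDIA(model="model_name_here") (e.g. NVIDIA(model="google/gemma-7b"))

Let's see this in code.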

In [ ]:

llm = NVIDIA(model="google/gemma-7b")

Let's confirm that our NVIDIA LLM is associated with the correct model!

In [ ]:

llm.model

Out[ ]:

'google/gemma-7b'
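The base URL check from the steps above can also be done in code. The following is a minimal sketch: `is_catalog_compatible` is a hypothetical helper, not part of the connector, and it only compares a URL copied from a model page against the base URL quoted in this notebook.

```python
COMPATIBLE_BASE_URL = "https://integrate.api.nvidia.com/v1"


def is_catalog_compatible(base_url: str) -> bool:
    """Return True if `base_url` matches the compatible base URL.

    Surrounding whitespace and trailing slashes are ignored.
    """
    return base_url.strip().rstrip("/") == COMPATIBLE_BASE_URL
```

A check like this could run before constructing `NVIDIA(model=...)`, so an incompatible model is caught before any request is sent.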

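Basic Functionality¶

Now we can explore the different ways the connector can be used within the LlamaIndex ecosystem!

Before we begin, let's set up a list of ChatMessage objects, which is the expected input for some of the methods.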

In [ ]:

from llama_index.core.llms import ChatMessage, MessageRole

chat_messages = [
    ChatMessage(
        role=MessageRole.SYSTEM, content=("You are a helpful assistant.")
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=("What are the most popular house pets in North America?"),
    ),
]
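The raw responses shown later in this notebook have the shape of OpenAI-style chat completions, so it can help to picture the message list as a list of role/content dictionaries. This is a sketch under that assumption, using a stand-in dataclass rather than LlamaIndex's `ChatMessage`. How the connector actually serializes messages is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Stand-in for a chat message (not LlamaIndex's ChatMessage)."""

    role: str
    content: str


def to_payload(messages):
    """Convert messages into a list of {"role", "content"} dictionaries."""
    return [{"role": m.role, "content": m.content} for m in messages]


payload = to_payload(
    [
        Message("system", "You are a helpful assistant."),
        Message("user", "What are the most popular house pets in North America?"),
    ]
)
```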

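We'll follow the same basic pattern in each example:

We'll point our NVIDIA LLM at the model we want
We'll examine how to use the endpoint to accomplish the desired task!

Complete: `.complete()`¶

We can use .complete()/.acomplete() (which take a string) to get a response from the selected model.

Let's use our default model for this task.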

In [ ]:

completion_llm = NVIDIA()

We can verify that this is the expected default by checking the .model attribute.

In [ ]:

completion_llm.model

Out[ ]:

'mistralai/mistral-7b-instruct-v0.2'

Let's call .complete() on our model with the string "Hello!" as input and observe the response.

In [ ]:

completion_llm.complete("Hello!")

Out[ ]:

CompletionResponse(text=" Hello there! How can I help you today? I'm here to answer any questions you might have or provide information on a wide range of topics. So, feel free to ask me anything!\n\nIf you're looking for some general information, I can help you with that too. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? 
Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some", additional_kwargs={}, raw={'id': 'chatcmpl-f6906079-51e7-44bf-aaea-a9478397dfbf', 'choices': [Choice(finish_reason=None, index=0, logprobs=ChoiceLogprobs(content=None, text_offset=[], token_logprobs=[0.0, 0.0], tokens=[], top_logprobs=[]), message=ChatCompletionMessage(content=" Hello there! How can I help you today? I'm here to answer any questions you might have or provide information on a wide range of topics. So, feel free to ask me anything!\n\nIf you're looking for some general information, I can help you with that too. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? 
Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some general information, I can provide you with that as well. For example, I can tell you about the weather, current events, or provide definitions for various words and concepts. I can also help you with math problems, translate words and phrases, and even tell you a joke or two!\n\nSo, what would you like to know? Let me know and I'll do my best to help you out!\n\nIf you have any specific question or topic in mind, please let me know and I'll be glad to help you out. If you want some", role='assistant', function_call=None, tool_calls=None))], 'created': 1713474670, 'model': 'mistralai/mistral-7b-instruct-v0.2', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=512, prompt_tokens=11, total_tokens=523)}, logprobs=None, delta=None)

As LlamaIndex expects, we receive a CompletionResponse in response.

Async Complete: `.acomplete()`¶

There is also an async implementation that can be used in the same way!

In [ ]:

await completion_llm.acomplete("Hello!")

Out[ ]:

CompletionResponse(text=" Hello there! How can I help you today? I'm here to answer any questions you might have or provide information on a wide range of topics. So feel free to ask me anything!\n\nIf you're looking for a specific topic, just let me know and I'll do my best to provide you with accurate and up-to-date information. And if you have any requests for fun facts or trivia, I'm happy to oblige!\n\nSo, what would you like to know today? Let me help make your day a little brighter! 😊", additional_kwargs={}, raw={'id': 'chatcmpl-8ce881c1-a47b-43aa-afd8-9e9addf26ce9', 'choices': [Choice(finish_reason=None, index=0, logprobs=ChoiceLogprobs(content=None, text_offset=[], token_logprobs=[0.0, 0.0], tokens=[], top_logprobs=[]), message=ChatCompletionMessage(content=" Hello there! How can I help you today? I'm here to answer any questions you might have or provide information on a wide range of topics. So feel free to ask me anything!\n\nIf you're looking for a specific topic, just let me know and I'll do my best to provide you with accurate and up-to-date information. And if you have any requests for fun facts or trivia, I'm happy to oblige!\n\nSo, what would you like to know today? Let me help make your day a little brighter! 😊", role='assistant', function_call=None, tool_calls=None))], 'created': 1712175910, 'model': 'mistralai/mistral-7b-instruct-v0.2', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=123, prompt_tokens=11, total_tokens=134)}, logprobs=None, delta=None)
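The async methods pay off when several requests are in flight at once. The sketch below shows the pattern with `asyncio.gather`. `fake_acomplete` is a stand-in coroutine that only simulates latency, so the example runs without an API key. In the notebook you would await `completion_llm.acomplete(prompt)` in its place.

```python
import asyncio


async def fake_acomplete(prompt: str) -> str:
    """Stand-in for an async completion call; sleeps briefly instead of calling an API."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


async def complete_many(prompts):
    """Run all completions concurrently and return the results in prompt order."""
    return await asyncio.gather(*(fake_acomplete(p) for p in prompts))


# In a notebook cell, `await complete_many([...])` works directly;
# asyncio.run is used here so the sketch also runs as a plain script.
results = asyncio.run(complete_many(["Hello!", "Goodbye!"]))
```

Because `asyncio.gather` preserves argument order, each result lines up with the prompt that produced it, even if the calls finish in a different order.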

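Chat: `.chat()`¶

Now we can try doing the same thing with the .chat() method. This method expects a list of chat messages, so we'll use the one we created above.

We'll use the mistralai/mixtral-8x7b-instruct-v0.1 model for this example.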

In [ ]:

chat_llm = NVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

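Now all we need to do is call .chat() on our list of ChatMessages and observe the response.

You'll also notice that we can pass in a few additional keyword arguments to influence generation. In this case, we use the seed parameter to influence our generation and the stop parameter to indicate that we want the model to stop generating once it reaches certain tokens!

Note: You can find information about the additional kwargs a model endpoint supports in the API documentation for the selected model, for example the API documentation for Mixtral.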

In [ ]:

chat_llm.chat(chat_messages, seed=4, stop=["cat", "cats", "Cat", "Cats"])

Out[ ]:

ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content=" In North America, the most popular types of house pets are:\n\n1. Dogs: Man's best friend is the most popular pet in North America. They are known for their loyalty, companionship, and the variety of breeds that cater to different lifestyles and preferences.\n\n2. Cats", additional_kwargs={}), raw={'id': 'chatcmpl-b6ef95ca-e023-4dc8-8ee9-843f214169e9', 'choices': [Choice(finish_reason=None, index=0, logprobs=ChoiceLogprobs(content=None, text_offset=[], token_logprobs=[0.0, 0.0], tokens=[], top_logprobs=[]), message=ChatCompletionMessage(content=" In North America, the most popular types of house pets are:\n\n1. Dogs: Man's best friend is the most popular pet in North America. They are known for their loyalty, companionship, and the variety of breeds that cater to different lifestyles and preferences.\n\n2. Cats", role='assistant', function_call=None, tool_calls=None))], 'created': 1713474655, 'model': 'mistralai/mixtral-8x7b-instruct-v0.1', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=66, prompt_tokens=26, total_tokens=92)}, delta=None, logprobs=None, additional_kwargs={})

As expected, we receive a ChatResponse in response.

Async Chat: `.achat()`¶

We also have an async implementation of the .chat() method, which can be called as follows.

In [ ]:

await chat_llm.achat(chat_messages)

Out[ ]:

ChatResponse(message=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content=' The most popular house pets in North America are dogs and cats. According to the American Pet Products Association (APPA), as of 2021, approximately 69 million homes in the United States own a pet, and 63.4 million of those households have a dog, while 42.7 million have a cat. Birds, small mammals, reptiles, and fish are also popular pets, but to a lesser extent.', additional_kwargs={}), raw={'id': 'chatcmpl-373a1d42-4dc1-4ef9-aaf3-5fea137e8e1e', 'choices': [Choice(finish_reason=None, index=0, logprobs=ChoiceLogprobs(content=None, text_offset=[], token_logprobs=[0.0, 0.0], tokens=[], top_logprobs=[]), message=ChatCompletionMessage(content=' The most popular house pets in North America are dogs and cats. According to the American Pet Products Association (APPA), as of 2021, approximately 69 million homes in the United States own a pet, and 63.4 million of those households have a dog, while 42.7 million have a cat. Birds, small mammals, reptiles, and fish are also popular pets, but to a lesser extent.', role='assistant', function_call=None, tool_calls=None))], 'created': 1712177472, 'model': 'mistralai/mixtral-8x7b-instruct-v0.1', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=95, prompt_tokens=59, total_tokens=154)}, delta=None, logprobs=None, additional_kwargs={})

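Stream: `.stream_chat()`¶

We can also use the models found on build.nvidia.com for streaming use cases!

Let's select another model and observe its behavior. We'll use Google's gemma-7b model for this task.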

In [ ]:

stream_llm = NVIDIA(model="google/gemma-7b")

Let's call our model with .stream_chat(), which again expects a list of ChatMessage objects, and capture the response.

In [ ]:

streamed_response = stream_llm.stream_chat(chat_messages)

In [ ]:

streamed_response

Out[ ]:

<generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen at 0x7dd89853e320>

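As we can see, the response is a generator that yields the streamed response.

Let's take a look at the final response once generation is complete.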

In [ ]:

last_element = None
for last_element in streamed_response:
    pass

print(last_element)

assistant: **Top Popular House Pets in North America:**

**1. Dogs:**
* Estimated 63.4 million pet dogs in households (2023)
* Known for their loyalty, companionship, and trainability

**2. Cats:**
* Estimated 38.4 million pet cats in households (2023)
* Known for their independence, affection, and low-maintenance nature

**3. Fish:**
* Estimated 14.5 million pet fish in households (2023)
* Popular for their tranquility, beauty, and variety of species

**4. Small mammals (guinea pigs, hamsters, rabbits):**
* Estimated 14.4 million pet small mammals in households (2023)
* Known for their playful and affectionate nature

**5. Birds:**
* Estimated 13.3 million pet birds in households (2023)
* Known for their beauty, song, and intelligence

**Other popular pets:**

* Tortoises and reptiles
* Hamsters and rodents
* Invertebrates (such as spiders and hermit crabs)

**Factors influencing pet popularity:**

* **Lifestyle and living situation:** Urban dwellers are more likely to have cats, while suburban and rural residents are more likely to have dogs.
* **Cost:** Dogs tend to be more expensive to own than cats.
* **Personality and preferences:** Some people prefer the companionship of dogs, while others prefer the independence of cats.
* **Availability:** Certain pets are easier to find or adopt than others.
* **Trend and cultural influences:** Some pets become more popular than others due to trends or cultural preferences.
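Draining the generator works because each streamed item carries the message accumulated so far, so the last item holds the full text. The sketch below imitates that behavior with a stand-in generator. `fake_stream_chat` and `Chunk` are hypothetical and only model an accumulated `message` plus a per-step `delta`. It also shows the alternative of consuming each new piece as it arrives.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    """Stand-in for one streamed item: the text so far plus the newest piece."""

    message: str
    delta: str


def fake_stream_chat(pieces) -> Iterator[Chunk]:
    """Yield one Chunk per piece, accumulating the text as it goes."""
    so_far = ""
    for piece in pieces:
        so_far += piece
        yield Chunk(message=so_far, delta=piece)


pieces = ["Dogs", ", cats", ", and fish."]

# Pattern 1: drain the generator and keep only the final, complete item.
last = None
for last in fake_stream_chat(pieces):
    pass

# Pattern 2: consume the deltas incrementally as they arrive.
streamed = "".join(chunk.delta for chunk in fake_stream_chat(pieces))
```

Both patterns end with the same text. The second is the one to reach for when partial output should be shown to a user while generation is still running.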

Async Stream: `.astream_chat()`¶

We also have an async equivalent for streaming, which can be used in a similar way to the sync implementation.

In [ ]:

streamed_response = await stream_llm.astream_chat(chat_messages)

In [ ]:

streamed_response

Out[ ]:

<async_generator object llm_chat_callback.<locals>.wrap.<locals>.wrapped_async_llm_chat.<locals>.wrapped_gen at 0x787709eea460>

In [ ]:

last_element = None
async for last_element in streamed_response:
    pass

print(last_element)

assistant: Sure, here are the most popular house pets in North America:

1. Dogs
2. Cats
3. Fish
4. Small Mammals
5. Birds
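With the async variant, the items come from an async generator and are consumed with `async for`. This sketch uses a hypothetical `fake_astream_chat` that yields text pieces after a short sleep, standing in for the real network stream.

```python
import asyncio


async def fake_astream_chat(pieces):
    """Stand-in async generator: yields each text piece after a short pause."""
    for piece in pieces:
        await asyncio.sleep(0.01)
        yield piece


async def collect(pieces) -> str:
    """Consume the async stream piece by piece and return the joined text."""
    parts = []
    async for delta in fake_astream_chat(pieces):
        parts.append(delta)  # a UI could render each piece here
    return "".join(parts)


# asyncio.run is used so the sketch runs as a script; in a notebook cell,
# `await collect([...])` works directly.
text = asyncio.run(collect(["1. Dogs\n", "2. Cats\n", "3. Fish"]))
```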

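Streaming Query Engine Responses¶

Let's look at a slightly more involved example that uses a query engine!

We'll start by loading some data (we'll be using The Hitchhiker's Guide to the Galaxy).

Loading Data¶

Let's first create a directory to hold our data.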

In [ ]:

!mkdir -p 'data/hhgttg'

Next, we'll download the text into that directory.

In [ ]:

!wget 'https://web.eecs.utk.edu/~hqi/deeplearning/project/hhgttg.txt' -O 'data/hhgttg/hhgttg.txt'

--2024-04-01 14:39:38--  https://web.eecs.utk.edu/~hqi/deeplearning/project/hhgttg.txt
Resolving web.eecs.utk.edu (web.eecs.utk.edu)... 160.36.127.165
Connecting to web.eecs.utk.edu (web.eecs.utk.edu)|160.36.127.165|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1534289 (1.5M) [text/plain]
Saving to: ‘data/hhgttg/hhgttg.txt’

data/hhgttg/hhgttg. 100%[===================>]   1.46M  6.75MB/s    in 0.2s    

2024-04-01 14:39:39 (6.75 MB/s) - ‘data/hhgttg/hhgttg.txt’ saved [1534289/1534289]

We need an embedding model for this step! We'll use OpenAI's text-embedding-3-small model and save it in our Settings.

In [ ]:

from llama_index.embeddings.openai import OpenAIEmbedding

openai_embedding = OpenAIEmbedding(model="text-embedding-3-small")

Settings.embed_model = openai_embedding

Now we can load our documents and create an index using the OpenAIEmbedding() we created above.

In [ ]:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data/hhgttg").load_data()
index = VectorStoreIndex.from_documents(documents)
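Conceptually, a vector index embeds each chunk of text and, at query time, returns the chunks whose embeddings are most similar to the query's embedding. The following toy sketch illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. It is not how `VectorStoreIndex` is implemented, and the vectors and chunk texts are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vec, chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up embeddings; a real embedding model returns much longer vectors.
chunks = [
    ("Deep Thought computes the answer", [0.9, 0.1, 0.0]),
    ("Vogon poetry is terrible", [0.0, 0.2, 0.9]),
    ("The answer is 42", [0.8, 0.3, 0.1]),
]
best = top_k([1.0, 0.2, 0.0], chunks, k=2)
```

The retrieved chunks are what a query engine would hand to the LLM as context when it answers a question.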

Now we can create a simple query engine and set our streaming parameter to True.

In [ ]:

streaming_qe = index.as_query_engine(streaming=True)

Let's send a query to the query engine and then stream the response.

In [ ]:

streaming_response = streaming_qe.query(
    "What is the significance of the number 42?",
)

In [ ]:

streaming_response.print_response_stream()

The significance of the number 42 is a central theme in "The Hitchhiker's Guide to the Galaxy" by Douglas Adams. The book is a comedic science fiction satire that follows the adventures of two intergalactic travelers, Arthur Dent and Ford Prefect, as they try to escape the destruction of Earth and uncover the true meaning of the number 42.

Throughout the book, the number 42 is presented as the ultimate answer to the ultimate question of life, the universe, and everything. The question itself is never explicitly stated, but it is implied to be a deeply profound and existential one that has been sought after by philosophers, scientists, and thinkers throughout history.

The idea of the number 42 as the ultimate answer is a playful jab at the idea of seeking ultimate knowledge and understanding, which is often seen as an impossible task. The number 42 is also a reference to the famous "42" answer in the "The Hitchhiker's Guide to the Galaxy" by Douglas Adams, which is a comedic science fiction satire that follows the adventures of two intergalactic travelers, Arthur Dent and Ford Prefect, as they try to escape the destruction of Earth and uncover the true meaning of the number 42.

In the book, the supercomputer Deep Thought is asked to find the answer to the ultimate question, and after billions of years of computation, it determines that the answer is 42. The answer is so profound that it causes Deep Thought to become obsolete, as it is no longer needed to answer questions.

The significance of the number 42 in "The Hitchhiker's Guide to the Galaxy" is a commentary on the nature of knowledge and the quest for ultimate understanding. It is a reminder that there are limits to what can be known and that the pursuit of knowledge should be done with a sense of humor and a willingness to accept the unknown.

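Connecting to Local NIMs¶

In addition to connecting to hosted NVIDIA NIMs, this connector can be used to connect to local microservice instances. This lets you deploy your application locally when necessary.

For instructions on setting up a local microservice instance, see https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/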

In [ ]:

from llama_index.llms.nvidia import NVIDIA

llm = NVIDIA(model="...").mode("nim", base_url="https://localhost.../v1")
llm.available_models
