First, let's install LlamaIndex 🦙 and the Unify integration.
%pip install llama-index-llms-unify llama-index
import os
os.environ["UNIFY_API_KEY"] = "<YOUR API KEY>"
LlamaIndex is a framework for building LLM applications over your data, and Unify is a platform that routes each query to the best LLM endpoint across providers. In this example, we will show how to use Unify's routing through LlamaIndex.
from llama_index.llms.unify import Unify
llm = Unify(model="llama-2-70b-chat@input-cost")
llm.complete("How are you today, llama?")
CompletionResponse(text=" I'm doing well, thanks for asking! It's always a pleasure to chat with you. I hope you're having a great day too! Is there anything specific you'd like to talk about or ask me? I'm here to help with any questions you might have.", additional_kwargs={}, raw={'id': 'meta-llama/Llama-2-70b-chat-hf-b90de288-1927-4f32-9ecb-368983c45321', 'choices': [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=" I'm doing well, thanks for asking! It's always a pleasure to chat with you. I hope you're having a great day too! Is there anything specific you'd like to talk about or ask me? I'm here to help with any questions you might have.", role='assistant', function_call=None, tool_calls=None, tool_call_id=None))], 'created': 1711047739, 'model': 'llama-2-70b-chat@anyscale', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=62, prompt_tokens=16, total_tokens=78, cost=7.8e-05)}, logprobs=None, delta=None)
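The `raw` payload above records which provider the router resolved to and what the request cost. As a minimal sketch (not part of the Unify or LlamaIndex API; the field names are taken from the printed response above, with `usage` simplified to a plain dict for illustration), you could pull those fields out like this:

```python
def describe_completion(raw: dict) -> str:
    """Summarize which endpoint served a request and what it cost.

    `raw` mirrors the payload printed above; `usage` is simplified to
    a plain dict here purely for illustration.
    """
    model, provider = raw["model"].split("@")
    cost = raw["usage"]["cost"]
    return f"{model} served by {provider} for ${cost:.6f}"


# Sample values copied from the response printed above.
sample = {
    "model": "llama-2-70b-chat@anyscale",
    "usage": {
        "completion_tokens": 62,
        "prompt_tokens": 16,
        "total_tokens": 78,
        "cost": 7.8e-05,
    },
}
print(describe_completion(sample))
```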
If you don't want the router to select a provider for you, you can also use our SSO to query the endpoints of different providers without setting up accounts with all of them. For example, all of the following are valid endpoints:
llm = Unify(model="llama-2-70b-chat@together-ai")
llm = Unify(model="gpt-3.5-turbo@openai")
llm = Unify(model="mixtral-8x7b-instruct-v0.1@mistral-ai")
This lets you quickly switch between and test different models and providers. For example, if you are building an application with gpt-4 at its core, you could use this to query a much cheaper LLM during development and/or testing to reduce costs.
Take a look at what's available here!
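One simple way to exploit this interchangeability is to pick the endpoint from an environment variable, so development runs hit a cheaper model while production keeps the expensive one. A minimal sketch (the `APP_ENV` variable name and the endpoint choices are illustrative assumptions, not part of Unify):

```python
import os


def pick_endpoint(env=None):
    """Choose an endpoint string based on the deployment environment.

    Illustrative only: "prod" gets gpt-4, everything else gets a
    cheaper endpoint, using the endpoint strings shown above.
    """
    env = env or os.getenv("APP_ENV", "dev")
    if env == "prod":
        return "gpt-4@openai"
    return "mixtral-8x7b-instruct-v0.1@mistral-ai"


endpoint = pick_endpoint()
# Then construct the client as before: llm = Unify(model=endpoint)
```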
If you are building an application where responsiveness is key, you most likely want to work with a streaming response. On top of that, you would ideally use the provider with the lowest Time To First Token, to reduce the time your users spend waiting for a response. With Unify this would look like:
llm = Unify(model="mixtral-8x7b-instruct-v0.1@ttft")
response = llm.stream_complete(
    "Translate the following to German: "
    "Hey, there's an emergency in translation street, "
    "please send help asap!"
)
show_provider = True
for r in response:
    if show_provider:
        print(f"Model and provider are : {r.raw['model']}\n")
        show_provider = False
    print(r.delta, end="", flush=True)
Model and provider are : mixtral-8x7b-instruct-v0.1@mistral-ai Hallo, es gibt einen Notfall in der Übersetzungsstraße, bitte senden Sie Hilfe so schnell wie möglich! (Note: This is a literal translation and the term "Übersetzungsstraße" is not a standard or commonly used term in German. A more natural way to express the idea of a "emergency in translation" could be "Notfall bei Übersetzungen" or "akute Übersetzungsnotwendigkeit".)
Last but not least, you can also run requests asynchronously. For tasks like long-document summarization, optimizing for input costs is crucial. Unify's dynamic router can do that too!
llm = Unify(model="mixtral-8x7b-instruct-v0.1@input-cost")
response = await llm.acomplete(
    "Summarize this in 10 words or less. OpenAI is a U.S. based artificial intelligence "
    "(AI) research organization founded in December 2015, researching artificial intelligence "
    "with the goal of developing 'safe and beneficial' artificial general intelligence, "
    "which it defines as 'highly autonomous systems that outperform humans at most economically "
    "valuable work'. As one of the leading organizations of the AI spring, it has developed "
    "several large language models, advanced image generation models, and previously, released "
    "open-source models. Its release of ChatGPT has been credited with starting the AI spring"
)
print(f"Model and provider are : {response.raw['model']}\n")
print(response)
Model and provider are : mixtral-8x7b-instruct-v0.1@deepinfra OpenAI: Pioneering 'safe' artificial general intelligence.
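Because `acomplete` returns an awaitable, several summarization requests can run concurrently with `asyncio.gather`. A minimal sketch with a stub coroutine standing in for the Unify call (a real run would await `llm.acomplete` instead of sleeping):

```python
import asyncio


async def fake_acomplete(doc):
    """Stub standing in for llm.acomplete(doc)."""
    await asyncio.sleep(0.01)
    return f"summary of {doc}"


async def summarize_all(docs):
    # Fire off all requests at once and wait for every response.
    return await asyncio.gather(*(fake_acomplete(d) for d in docs))


results = asyncio.run(summarize_all(["doc1", "doc2", "doc3"]))
print(results)
```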