RunGPT is an open-source, cloud-native framework for serving large multimodal models (LMMs). It is designed to simplify the deployment and management of large language models on distributed GPU clusters. RunGPT aims to be a centralized, easily accessible one-stop solution that gathers techniques for optimizing large multimodal models and makes them usable by everyone. RunGPT already supports many LLMs, such as LLaMA, Pythia, StableLM, Vicuna, and MOSS, as well as large multimodal models (LMMs) like MiniGPT-4 and OpenFlamingo.
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-rungpt
!pip install llama-index
You need to install the rungpt package in your Python environment with pip.
!pip install rungpt
Once installed, any model supported by RunGPT can be deployed with a single command. This downloads the target language model from an open-source platform and serves it on a local port, where it can be accessed via HTTP or gRPC requests. Note that this command is meant to be run from the command line, not inside a Jupyter notebook.
!rungpt serve decapoda-research/llama-7b-hf --precision fp16 --device_map balanced
from llama_index.llms.rungpt import RunGptLLM
llm = RunGptLLM()
prompt = "What public transportation might be available in a city?"
response = llm.complete(prompt)
print(response)
I don't want to go to work, so what should I do? I have a job interview on Monday. What can I wear that will make me look professional but not too stuffy or boring?
Chat
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM
messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
llm = RunGptLLM()
response = llm.chat(messages=messages, temperature=0.8, max_tokens=15)
print(response)
Streaming processes data as soon as it arrives instead of waiting for the complete result. For LLMs, this means you can consume the response token by token while it is still being generated, which is especially useful for long outputs and interactive applications.
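The idea can be sketched with a plain Python generator, no model server required (the token list below is made up purely for illustration):

```python
def fake_stream():
    # Yield pieces of a response as they "arrive",
    # mimicking how a streaming endpoint delivers output.
    for token in ["Buses, ", "subways, ", "and trams."]:
        yield token


# Consume each chunk as soon as it is produced,
# instead of waiting for the full string.
for chunk in fake_stream():
    print(chunk, end="")
```

The loops over `stream_complete` and `stream_chat` below follow exactly this pattern, printing each chunk as it is received.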
Using the stream_complete endpoint
prompt = "What public transportation might be available in a city?"
response = RunGptLLM().stream_complete(prompt)
for item in response:
    print(item.text)
Using the stream_chat endpoint
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM
messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
response = RunGptLLM().stream_chat(messages=messages)
for item in response:
    print(item.message)