!pip install unstructured
Collecting unstructured Using cached unstructured-0.11.8-py3-none-any.whl.metadata (26 kB) Collecting chardet (from unstructured) Using cached chardet-5.2.0-py3-none-any.whl.metadata (3.4 kB) Collecting filetype (from unstructured) Using cached filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB) Collecting python-magic (from unstructured) Using cached python_magic-0.4.27-py2.py3-none-any.whl.metadata (5.8 kB) Collecting lxml (from unstructured) Using cached lxml-5.1.1-cp38-cp38-win_amd64.whl.metadata (3.6 kB) Collecting nltk (from unstructured) Using cached nltk-3.8.1-py3-none-any.whl.metadata (2.8 kB) Collecting tabulate (from unstructured) Using cached tabulate-0.9.0-py3-none-any.whl.metadata (34 kB) Requirement already satisfied: requests in e:\anaconda3\envs\openai\lib\site-packages (from unstructured) (2.31.0) Requirement already satisfied: beautifulsoup4 in e:\anaconda3\envs\openai\lib\site-packages (from unstructured) (4.12.3) Collecting emoji (from unstructured) Using cached emoji-2.11.0-py2.py3-none-any.whl.metadata (5.3 kB) Requirement already satisfied: dataclasses-json in e:\anaconda3\envs\openai\lib\site-packages (from unstructured) (0.6.4) Collecting python-iso639 (from unstructured) Using cached python_iso639-2024.2.7-py3-none-any.whl.metadata (13 kB) Collecting langdetect (from unstructured) Using cached langdetect-1.0.9-py3-none-any.whl Requirement already satisfied: numpy in e:\anaconda3\envs\openai\lib\site-packages (from unstructured) (1.24.4) Collecting rapidfuzz (from unstructured) Using cached rapidfuzz-3.7.0-cp38-cp38-win_amd64.whl.metadata (11 kB) Collecting backoff (from unstructured) Using cached backoff-2.2.1-py3-none-any.whl.metadata (14 kB) Requirement already satisfied: typing-extensions in e:\anaconda3\envs\openai\lib\site-packages (from unstructured) (4.9.0) Collecting unstructured-client (from unstructured) Using cached unstructured_client-0.22.0-py3-none-any.whl.metadata (7.3 kB) Collecting wrapt (from unstructured) Using cached wrapt-1.16.0-cp38-cp38-win_amd64.whl.metadata (6.8 kB) Requirement already satisfied: soupsieve>1.2 in e:\anaconda3\envs\openai\lib\site-packages (from beautifulsoup4->unstructured) (2.5) Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in e:\anaconda3\envs\openai\lib\site-packages (from dataclasses-json->unstructured) (3.21.1) Requirement already satisfied: typing-inspect<1,>=0.4.0 in e:\anaconda3\envs\openai\lib\site-packages (from dataclasses-json->unstructured) (0.9.0) Requirement already satisfied: six in e:\anaconda3\envs\openai\lib\site-packages (from langdetect->unstructured) (1.16.0) Requirement already satisfied: click in e:\anaconda3\envs\openai\lib\site-packages (from nltk->unstructured) (8.1.7) Requirement already satisfied: joblib in e:\anaconda3\envs\openai\lib\site-packages (from nltk->unstructured) (1.3.2) Requirement already satisfied: regex>=2021.8.3 in e:\anaconda3\envs\openai\lib\site-packages (from nltk->unstructured) (2023.12.25) Requirement already satisfied: tqdm in e:\anaconda3\envs\openai\lib\site-packages (from nltk->unstructured) (4.66.2) Requirement already satisfied: charset-normalizer<4,>=2 in e:\anaconda3\envs\openai\lib\site-packages (from requests->unstructured) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in e:\anaconda3\envs\openai\lib\site-packages (from requests->unstructured) (3.6) Requirement already satisfied: urllib3<3,>=1.21.1 in e:\anaconda3\envs\openai\lib\site-packages (from requests->unstructured) (2.2.1) Requirement already satisfied: certifi>=2017.4.17 in e:\anaconda3\envs\openai\lib\site-packages (from requests->unstructured) (2024.2.2) Collecting deepdiff>=6.0 (from unstructured-client->unstructured) Using cached deepdiff-6.7.1-py3-none-any.whl.metadata (6.1 kB) Collecting jsonpath-python>=1.0.6 (from unstructured-client->unstructured) Using cached jsonpath_python-1.0.6-py3-none-any.whl.metadata (12 kB) Requirement already satisfied: mypy-extensions>=1.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from unstructured-client->unstructured) (1.0.0) Requirement already satisfied: packaging>=23.1 in e:\anaconda3\envs\openai\lib\site-packages (from unstructured-client->unstructured) (23.2) Requirement already satisfied: pypdf>=4.0 in e:\anaconda3\envs\openai\lib\site-packages (from unstructured-client->unstructured) (4.1.0) Requirement already satisfied: python-dateutil>=2.8.2 in e:\anaconda3\envs\openai\lib\site-packages (from unstructured-client->unstructured) (2.8.2) Collecting ordered-set<4.2.0,>=4.0.2 (from deepdiff>=6.0->unstructured-client->unstructured) Using cached ordered_set-4.1.0-py3-none-any.whl.metadata (5.3 kB) Requirement already satisfied: colorama in e:\anaconda3\envs\openai\lib\site-packages (from click->nltk->unstructured) (0.4.6) Using cached unstructured-0.11.8-py3-none-any.whl (1.8 MB) Using cached backoff-2.2.1-py3-none-any.whl (15 kB) Using cached chardet-5.2.0-py3-none-any.whl (199 kB) Using cached emoji-2.11.0-py2.py3-none-any.whl (433 kB) Using cached filetype-1.2.0-py2.py3-none-any.whl (19 kB) Using cached lxml-5.1.1-cp38-cp38-win_amd64.whl (3.9 MB) Using cached nltk-3.8.1-py3-none-any.whl (1.5 MB) Using cached python_iso639-2024.2.7-py3-none-any.whl (274 kB) Using cached python_magic-0.4.27-py2.py3-none-any.whl (13 kB) Using cached rapidfuzz-3.7.0-cp38-cp38-win_amd64.whl (1.6 MB) Using cached tabulate-0.9.0-py3-none-any.whl (35 kB) Using cached unstructured_client-0.22.0-py3-none-any.whl (28 kB) Using cached wrapt-1.16.0-cp38-cp38-win_amd64.whl (37 kB) Using cached deepdiff-6.7.1-py3-none-any.whl (76 kB) Using cached jsonpath_python-1.0.6-py3-none-any.whl (7.6 kB) Using cached ordered_set-4.1.0-py3-none-any.whl (7.6 kB) Installing collected packages: filetype, wrapt, tabulate, rapidfuzz, python-magic, python-iso639, ordered-set, lxml, langdetect, jsonpath-python, emoji, chardet, backoff, nltk, deepdiff, unstructured-client, unstructured Successfully installed backoff-2.2.1 chardet-5.2.0 deepdiff-6.7.1 emoji-2.11.0 filetype-1.2.0 jsonpath-python-1.0.6 langdetect-1.0.9 lxml-5.1.1 nltk-3.8.1 ordered-set-4.1.0 python-iso639-2024.2.7 python-magic-0.4.27 rapidfuzz-3.7.0 tabulate-0.9.0 unstructured-0.11.8 unstructured-client-0.22.0 wrapt-1.16.0
WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages)
!pip install sentence-transformers
Requirement already satisfied: sentence-transformers in e:\anaconda3\envs\openai\lib\site-packages (2.6.1) Requirement already satisfied: transformers<5.0.0,>=4.32.0 in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (4.39.2) Requirement already satisfied: tqdm in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (4.66.2) Requirement already satisfied: torch>=1.11.0 in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (2.2.2) Requirement already satisfied: numpy in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (1.24.4) Requirement already satisfied: scikit-learn in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (1.3.2) Requirement already satisfied: scipy in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (1.10.1) Requirement already satisfied: huggingface-hub>=0.15.1 in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (0.22.2) Requirement already satisfied: Pillow in e:\anaconda3\envs\openai\lib\site-packages (from sentence-transformers) (10.2.0) Requirement already satisfied: filelock in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (3.13.3) Requirement already satisfied: fsspec>=2023.5.0 in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2024.3.1) Requirement already satisfied: packaging>=20.9 in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (23.2) Requirement already satisfied: pyyaml>=5.1 in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (6.0.1) Requirement already satisfied: requests in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2.31.0) Requirement already satisfied: typing-extensions>=3.7.4.3 in e:\anaconda3\envs\openai\lib\site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (4.9.0) Requirement already satisfied: sympy in e:\anaconda3\envs\openai\lib\site-packages (from torch>=1.11.0->sentence-transformers) (1.12) Requirement already satisfied: networkx in e:\anaconda3\envs\openai\lib\site-packages (from torch>=1.11.0->sentence-transformers) (3.1) Requirement already satisfied: jinja2 in e:\anaconda3\envs\openai\lib\site-packages (from torch>=1.11.0->sentence-transformers) (3.1.3) Requirement already satisfied: colorama in e:\anaconda3\envs\openai\lib\site-packages (from tqdm->sentence-transformers) (0.4.6) Requirement already satisfied: regex!=2019.12.17 in e:\anaconda3\envs\openai\lib\site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (2023.12.25) Requirement already satisfied: tokenizers<0.19,>=0.14 in e:\anaconda3\envs\openai\lib\site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (0.15.2) Requirement already satisfied: safetensors>=0.4.1 in e:\anaconda3\envs\openai\lib\site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (0.4.2) Requirement already satisfied: joblib>=1.1.1 in e:\anaconda3\envs\openai\lib\site-packages (from scikit-learn->sentence-transformers) (1.3.2) Requirement already satisfied: threadpoolctl>=2.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from scikit-learn->sentence-transformers) (3.4.0) Requirement already satisfied: MarkupSafe>=2.0 in e:\anaconda3\envs\openai\lib\site-packages (from jinja2->torch>=1.11.0->sentence-transformers) (2.1.5) Requirement already satisfied: charset-normalizer<4,>=2 in e:\anaconda3\envs\openai\lib\site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in e:\anaconda3\envs\openai\lib\site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.6) Requirement already satisfied: urllib3<3,>=1.21.1 in e:\anaconda3\envs\openai\lib\site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2.2.1) Requirement already satisfied: certifi>=2017.4.17 in e:\anaconda3\envs\openai\lib\site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2024.2.2) Requirement already satisfied: mpmath>=0.19 in e:\anaconda3\envs\openai\lib\site-packages (from sympy->torch>=1.11.0->sentence-transformers) (1.3.0)
WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages)
!pip install chromadb
Collecting chromadb Using cached chromadb-0.4.24-py3-none-any.whl.metadata (7.3 kB) Collecting build>=1.0.3 (from chromadb) Using cached build-1.2.1-py3-none-any.whl.metadata (4.3 kB) Requirement already satisfied: requests>=2.28 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (2.31.0) Requirement already satisfied: pydantic>=1.9 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (2.6.4) Collecting chroma-hnswlib==0.7.3 (from chromadb) Using cached chroma_hnswlib-0.7.3-cp38-cp38-win_amd64.whl.metadata (262 bytes) Collecting fastapi>=0.95.2 (from chromadb) Using cached fastapi-0.110.0-py3-none-any.whl.metadata (25 kB) Collecting uvicorn>=0.18.3 (from uvicorn[standard]>=0.18.3->chromadb) Using cached uvicorn-0.29.0-py3-none-any.whl.metadata (6.3 kB) Requirement already satisfied: numpy>=1.22.5 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (1.24.4) Collecting posthog>=2.4.0 (from chromadb) Using cached posthog-3.5.0-py2.py3-none-any.whl.metadata (2.0 kB) Requirement already satisfied: typing-extensions>=4.5.0 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (4.9.0) Collecting pulsar-client>=3.1.0 (from chromadb) Using cached pulsar_client-3.4.0-cp38-cp38-win_amd64.whl.metadata (1.0 kB) Collecting onnxruntime>=1.14.1 (from chromadb) Using cached onnxruntime-1.17.1-cp38-cp38-win_amd64.whl.metadata (4.4 kB) Collecting opentelemetry-api>=1.2.0 (from chromadb) Using cached opentelemetry_api-1.24.0-py3-none-any.whl.metadata (1.3 kB) Collecting opentelemetry-exporter-otlp-proto-grpc>=1.2.0 (from chromadb) Using cached opentelemetry_exporter_otlp_proto_grpc-1.24.0-py3-none-any.whl.metadata (2.2 kB) Collecting opentelemetry-instrumentation-fastapi>=0.41b0 (from chromadb) Using cached opentelemetry_instrumentation_fastapi-0.45b0-py3-none-any.whl.metadata (2.0 kB) Collecting opentelemetry-sdk>=1.2.0 (from chromadb) Using cached opentelemetry_sdk-1.24.0-py3-none-any.whl.metadata (1.4 kB) Requirement already satisfied: tokenizers>=0.13.2 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (0.15.2) Collecting pypika>=0.48.9 (from chromadb) Using cached PyPika-0.48.9-py2.py3-none-any.whl Requirement already satisfied: tqdm>=4.65.0 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (4.66.2) Collecting overrides>=7.3.1 (from chromadb) Using cached overrides-7.7.0-py3-none-any.whl.metadata (5.8 kB) Requirement already satisfied: importlib-resources in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (6.4.0) Collecting grpcio>=1.58.0 (from chromadb) Using cached grpcio-1.62.1-cp38-cp38-win_amd64.whl.metadata (4.2 kB) Collecting bcrypt>=4.0.1 (from chromadb) Using cached bcrypt-4.1.2-cp37-abi3-win_amd64.whl.metadata (9.8 kB) Collecting typer>=0.9.0 (from chromadb) Using cached typer-0.12.0-py3-none-any.whl.metadata (15 kB) Collecting kubernetes>=28.1.0 (from chromadb) Using cached kubernetes-29.0.0-py2.py3-none-any.whl.metadata (1.5 kB) Requirement already satisfied: tenacity>=8.2.3 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (8.2.3) Requirement already satisfied: PyYAML>=6.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (6.0.1) Collecting mmh3>=4.0.1 (from chromadb) Using cached mmh3-4.1.0-cp38-cp38-win_amd64.whl.metadata (13 kB) Requirement already satisfied: orjson>=3.9.12 in e:\anaconda3\envs\openai\lib\site-packages (from chromadb) (3.10.0) Collecting graphlib-backport>=1.0.3 (from chromadb) Using cached graphlib_backport-1.1.0-py3-none-any.whl.metadata (4.4 kB) Requirement already satisfied: packaging>=19.1 in e:\anaconda3\envs\openai\lib\site-packages (from build>=1.0.3->chromadb) (23.2) Collecting pyproject_hooks (from build>=1.0.3->chromadb) Using cached pyproject_hooks-1.0.0-py3-none-any.whl.metadata (1.3 kB) Requirement already satisfied: colorama in e:\anaconda3\envs\openai\lib\site-packages (from build>=1.0.3->chromadb) (0.4.6) Requirement already satisfied: importlib-metadata>=4.6 in e:\anaconda3\envs\openai\lib\site-packages (from build>=1.0.3->chromadb) (7.0.1) Collecting tomli>=1.1.0 (from build>=1.0.3->chromadb) Using cached tomli-2.0.1-py3-none-any.whl.metadata (8.9 kB) Collecting starlette<0.37.0,>=0.36.3 (from fastapi>=0.95.2->chromadb) Using cached starlette-0.36.3-py3-none-any.whl.metadata (5.9 kB) Requirement already satisfied: certifi>=14.05.14 in e:\anaconda3\envs\openai\lib\site-packages (from kubernetes>=28.1.0->chromadb) (2024.2.2) Requirement already satisfied: six>=1.9.0 in e:\anaconda3\envs\openai\lib\site-packages (from kubernetes>=28.1.0->chromadb) (1.16.0) Requirement already satisfied: python-dateutil>=2.5.3 in e:\anaconda3\envs\openai\lib\site-packages (from kubernetes>=28.1.0->chromadb) (2.8.2) Collecting google-auth>=1.0.1 (from kubernetes>=28.1.0->chromadb) Using cached google_auth-2.29.0-py2.py3-none-any.whl.metadata (4.7 kB) Collecting websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 (from kubernetes>=28.1.0->chromadb) Using cached websocket_client-1.7.0-py3-none-any.whl.metadata (7.9 kB) Collecting requests-oauthlib (from kubernetes>=28.1.0->chromadb) Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl.metadata (11 kB) Collecting oauthlib>=3.2.2 (from kubernetes>=28.1.0->chromadb) Using cached oauthlib-3.2.2-py3-none-any.whl.metadata (7.5 kB) Requirement already satisfied: urllib3>=1.24.2 in e:\anaconda3\envs\openai\lib\site-packages (from kubernetes>=28.1.0->chromadb) (2.2.1) Collecting coloredlogs (from onnxruntime>=1.14.1->chromadb) Using cached coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB) Collecting flatbuffers (from onnxruntime>=1.14.1->chromadb) Using cached flatbuffers-24.3.25-py2.py3-none-any.whl.metadata (850 bytes) Requirement already satisfied: protobuf in e:\anaconda3\envs\openai\lib\site-packages (from onnxruntime>=1.14.1->chromadb) (4.25.3) Requirement already satisfied: sympy in e:\anaconda3\envs\openai\lib\site-packages (from onnxruntime>=1.14.1->chromadb) (1.12) Collecting deprecated>=1.2.6 (from opentelemetry-api>=1.2.0->chromadb) Using cached Deprecated-1.2.14-py2.py3-none-any.whl.metadata (5.4 kB) Collecting importlib-metadata>=4.6 (from build>=1.0.3->chromadb) Using cached importlib_metadata-7.0.0-py3-none-any.whl.metadata (4.9 kB) Collecting googleapis-common-protos~=1.52 (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) Using cached googleapis_common_protos-1.63.0-py2.py3-none-any.whl.metadata (1.5 kB) Collecting opentelemetry-exporter-otlp-proto-common==1.24.0 (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) Using cached opentelemetry_exporter_otlp_proto_common-1.24.0-py3-none-any.whl.metadata (1.7 kB) Collecting opentelemetry-proto==1.24.0 (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) Using cached opentelemetry_proto-1.24.0-py3-none-any.whl.metadata (2.2 kB) Collecting opentelemetry-instrumentation-asgi==0.45b0 (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) Using cached opentelemetry_instrumentation_asgi-0.45b0-py3-none-any.whl.metadata (1.9 kB) Collecting opentelemetry-instrumentation==0.45b0 (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) Using cached opentelemetry_instrumentation-0.45b0-py3-none-any.whl.metadata (6.1 kB) Collecting opentelemetry-semantic-conventions==0.45b0 (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) Using cached opentelemetry_semantic_conventions-0.45b0-py3-none-any.whl.metadata (2.2 kB) Collecting opentelemetry-util-http==0.45b0 (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) Using cached opentelemetry_util_http-0.45b0-py3-none-any.whl.metadata (2.4 kB) Requirement already satisfied: setuptools>=16.0 in e:\anaconda3\envs\openai\lib\site-packages (from opentelemetry-instrumentation==0.45b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (68.2.2) Requirement already satisfied: wrapt<2.0.0,>=1.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from opentelemetry-instrumentation==0.45b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (1.16.0) Collecting asgiref~=3.0 (from opentelemetry-instrumentation-asgi==0.45b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) Using cached asgiref-3.8.1-py3-none-any.whl.metadata (9.3 kB) Collecting monotonic>=1.5 (from posthog>=2.4.0->chromadb) Using cached monotonic-1.6-py2.py3-none-any.whl.metadata (1.5 kB) Requirement already satisfied: backoff>=1.10.0 in e:\anaconda3\envs\openai\lib\site-packages (from posthog>=2.4.0->chromadb) (2.2.1) Requirement already satisfied: annotated-types>=0.4.0 in e:\anaconda3\envs\openai\lib\site-packages (from pydantic>=1.9->chromadb) (0.6.0) Requirement already satisfied: pydantic-core==2.16.3 in e:\anaconda3\envs\openai\lib\site-packages (from pydantic>=1.9->chromadb) (2.16.3) Requirement already satisfied: charset-normalizer<4,>=2 in e:\anaconda3\envs\openai\lib\site-packages (from requests>=2.28->chromadb) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in e:\anaconda3\envs\openai\lib\site-packages (from requests>=2.28->chromadb) (3.6) Requirement already satisfied: huggingface_hub<1.0,>=0.16.4 in e:\anaconda3\envs\openai\lib\site-packages (from tokenizers>=0.13.2->chromadb) (0.22.2) Collecting typer-slim==0.12.0 (from typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) Using cached typer_slim-0.12.0-py3-none-any.whl.metadata (15 kB) Collecting typer-cli==0.12.0 (from typer>=0.9.0->chromadb) Using cached typer_cli-0.12.0-py3-none-any.whl.metadata (3.5 kB) Requirement already satisfied: click>=8.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from typer-slim==0.12.0->typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) (8.1.7) Collecting shellingham>=1.3.0 (from typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) Using cached shellingham-1.5.4-py2.py3-none-any.whl.metadata (3.5 kB) Requirement already satisfied: rich>=10.11.0 in e:\anaconda3\envs\openai\lib\site-packages (from typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) (13.7.1) Requirement already satisfied: h11>=0.8 in e:\anaconda3\envs\openai\lib\site-packages (from uvicorn>=0.18.3->uvicorn[standard]>=0.18.3->chromadb) (0.14.0) Collecting httptools>=0.5.0 (from uvicorn[standard]>=0.18.3->chromadb) Using cached httptools-0.6.1-cp38-cp38-win_amd64.whl.metadata (3.7 kB) Collecting python-dotenv>=0.13 (from uvicorn[standard]>=0.18.3->chromadb) Using cached python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB) Collecting watchfiles>=0.13 (from uvicorn[standard]>=0.18.3->chromadb) Using cached watchfiles-0.21.0-cp38-none-win_amd64.whl.metadata (5.0 kB) Collecting websockets>=10.4 (from uvicorn[standard]>=0.18.3->chromadb) Using cached websockets-12.0-cp38-cp38-win_amd64.whl.metadata (6.8 kB) Requirement already satisfied: zipp>=3.1.0 in e:\anaconda3\envs\openai\lib\site-packages (from importlib-resources->chromadb) (3.17.0) Requirement already satisfied: cachetools<6.0,>=2.0.0 in e:\anaconda3\envs\openai\lib\site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (5.3.3) Collecting pyasn1-modules>=0.2.1 (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) Using cached pyasn1_modules-0.4.0-py3-none-any.whl.metadata (3.4 kB) Collecting rsa<5,>=3.1.4 (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) Using cached rsa-4.9-py3-none-any.whl.metadata (4.2 kB) Requirement already satisfied: filelock in e:\anaconda3\envs\openai\lib\site-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (3.13.3) Requirement already satisfied: fsspec>=2023.5.0 in e:\anaconda3\envs\openai\lib\site-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (2024.3.1) Requirement already satisfied: anyio<5,>=3.4.0 in e:\anaconda3\envs\openai\lib\site-packages (from starlette<0.37.0,>=0.36.3->fastapi>=0.95.2->chromadb) (4.3.0) Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime>=1.14.1->chromadb) Using cached humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB) Requirement already satisfied: mpmath>=0.19 in e:\anaconda3\envs\openai\lib\site-packages (from sympy->onnxruntime>=1.14.1->chromadb) (1.3.0) Requirement already satisfied: sniffio>=1.1 in e:\anaconda3\envs\openai\lib\site-packages (from anyio<5,>=3.4.0->starlette<0.37.0,>=0.36.3->fastapi>=0.95.2->chromadb) (1.3.1) Requirement already satisfied: exceptiongroup>=1.0.2 in e:\anaconda3\envs\openai\lib\site-packages (from anyio<5,>=3.4.0->starlette<0.37.0,>=0.36.3->fastapi>=0.95.2->chromadb) (1.2.0) Collecting pyreadline3 (from humanfriendly>=9.1->coloredlogs->onnxruntime>=1.14.1->chromadb) Using cached pyreadline3-3.4.1-py3-none-any.whl.metadata (2.0 kB) Collecting pyasn1<0.7.0,>=0.4.6 (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) Using cached pyasn1-0.6.0-py2.py3-none-any.whl.metadata (8.3 kB) Requirement already satisfied: markdown-it-py>=2.2.0 in e:\anaconda3\envs\openai\lib\site-packages (from rich>=10.11.0->typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) (3.0.0) Requirement already satisfied: pygments<3.0.0,>=2.13.0 in e:\anaconda3\envs\openai\lib\site-packages (from rich>=10.11.0->typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) (2.15.1) Requirement already satisfied: mdurl~=0.1 in e:\anaconda3\envs\openai\lib\site-packages (from markdown-it-py>=2.2.0->rich>=10.11.0->typer-slim[standard]==0.12.0->typer>=0.9.0->chromadb) (0.1.2) Using cached chromadb-0.4.24-py3-none-any.whl (525 kB) Using cached chroma_hnswlib-0.7.3-cp38-cp38-win_amd64.whl (150 kB) Using cached bcrypt-4.1.2-cp37-abi3-win_amd64.whl (158 kB) Using cached build-1.2.1-py3-none-any.whl (21 kB) Using cached fastapi-0.110.0-py3-none-any.whl (92 kB) Using cached graphlib_backport-1.1.0-py3-none-any.whl (7.1 kB) Using cached grpcio-1.62.1-cp38-cp38-win_amd64.whl (3.8 MB) Using cached kubernetes-29.0.0-py2.py3-none-any.whl (1.6 MB) Using cached mmh3-4.1.0-cp38-cp38-win_amd64.whl (31 kB) Using cached onnxruntime-1.17.1-cp38-cp38-win_amd64.whl (5.6 MB) Using cached opentelemetry_api-1.24.0-py3-none-any.whl (60 kB) Using cached opentelemetry_exporter_otlp_proto_grpc-1.24.0-py3-none-any.whl (18 kB) Using cached opentelemetry_exporter_otlp_proto_common-1.24.0-py3-none-any.whl (17 kB) Using cached opentelemetry_proto-1.24.0-py3-none-any.whl (50 kB) Using cached opentelemetry_instrumentation_fastapi-0.45b0-py3-none-any.whl (11 kB) Using cached opentelemetry_instrumentation-0.45b0-py3-none-any.whl (28 kB) Using cached opentelemetry_instrumentation_asgi-0.45b0-py3-none-any.whl (14 kB) Using cached opentelemetry_semantic_conventions-0.45b0-py3-none-any.whl (36 kB) Using cached opentelemetry_util_http-0.45b0-py3-none-any.whl (6.9 kB) Using cached opentelemetry_sdk-1.24.0-py3-none-any.whl (106 kB) Using cached overrides-7.7.0-py3-none-any.whl (17 kB) Using cached posthog-3.5.0-py2.py3-none-any.whl (41 kB) Using cached pulsar_client-3.4.0-cp38-cp38-win_amd64.whl (3.4 MB) Using cached typer-0.12.0-py3-none-any.whl (5.6 kB) Using cached typer_cli-0.12.0-py3-none-any.whl (3.0 kB) Using cached typer_slim-0.12.0-py3-none-any.whl (46 kB) Using cached uvicorn-0.29.0-py3-none-any.whl (60 kB) Using cached Deprecated-1.2.14-py2.py3-none-any.whl (9.6 kB) Using cached google_auth-2.29.0-py2.py3-none-any.whl (189 kB) Using cached googleapis_common_protos-1.63.0-py2.py3-none-any.whl (229 kB) Using cached httptools-0.6.1-cp38-cp38-win_amd64.whl (60 kB) Using cached importlib_metadata-7.0.0-py3-none-any.whl (23 kB) Using cached monotonic-1.6-py2.py3-none-any.whl (8.2 kB) Using cached oauthlib-3.2.2-py3-none-any.whl (151 kB) Using cached python_dotenv-1.0.1-py3-none-any.whl (19 kB) Using cached starlette-0.36.3-py3-none-any.whl (71 kB) Using cached tomli-2.0.1-py3-none-any.whl (12 kB) Using cached watchfiles-0.21.0-cp38-none-win_amd64.whl (279 kB) Using cached websocket_client-1.7.0-py3-none-any.whl (58 kB) Using cached websockets-12.0-cp38-cp38-win_amd64.whl (124 kB) Using cached coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB) Using cached flatbuffers-24.3.25-py2.py3-none-any.whl (26 kB) Using cached pyproject_hooks-1.0.0-py3-none-any.whl (9.3 kB) Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl (24 kB) Using cached asgiref-3.8.1-py3-none-any.whl (23 kB) Using cached humanfriendly-10.0-py2.py3-none-any.whl (86 kB) Using cached pyasn1_modules-0.4.0-py3-none-any.whl (181 kB) Using cached rsa-4.9-py3-none-any.whl (34 kB) Using cached shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB) Using cached pyasn1-0.6.0-py2.py3-none-any.whl (85 kB) Using cached pyreadline3-3.4.1-py3-none-any.whl (95 kB) Installing collected packages: pyreadline3, pypika, monotonic, mmh3, flatbuffers, websockets, websocket-client, tomli, shellingham, python-dotenv, pyasn1, pulsar-client, overrides, opentelemetry-util-http, opentelemetry-semantic-conventions, opentelemetry-proto, oauthlib, importlib-metadata, humanfriendly, httptools, grpcio, graphlib-backport, googleapis-common-protos, deprecated, chroma-hnswlib, bcrypt, asgiref, watchfiles, uvicorn, typer-slim, starlette, rsa, requests-oauthlib, pyproject_hooks, pyasn1-modules, posthog, opentelemetry-exporter-otlp-proto-common, opentelemetry-api, coloredlogs, opentelemetry-sdk, opentelemetry-instrumentation, onnxruntime, google-auth, fastapi, build, typer-cli, opentelemetry-instrumentation-asgi, opentelemetry-exporter-otlp-proto-grpc, kubernetes, typer, opentelemetry-instrumentation-fastapi, chromadb Attempting uninstall: importlib-metadata Found existing installation: importlib-metadata 7.0.1 Uninstalling importlib-metadata-7.0.1: Successfully uninstalled importlib-metadata-7.0.1 Successfully installed asgiref-3.8.1 bcrypt-4.1.2 build-1.2.1 chroma-hnswlib-0.7.3 chromadb-0.4.24 coloredlogs-15.0.1 deprecated-1.2.14 fastapi-0.110.0 flatbuffers-24.3.25 google-auth-2.29.0 googleapis-common-protos-1.63.0 graphlib-backport-1.1.0 grpcio-1.62.1 httptools-0.6.1 humanfriendly-10.0 importlib-metadata-7.0.0 kubernetes-29.0.0 mmh3-4.1.0 monotonic-1.6 oauthlib-3.2.2 onnxruntime-1.17.1 opentelemetry-api-1.24.0 opentelemetry-exporter-otlp-proto-common-1.24.0 opentelemetry-exporter-otlp-proto-grpc-1.24.0 opentelemetry-instrumentation-0.45b0 opentelemetry-instrumentation-asgi-0.45b0 opentelemetry-instrumentation-fastapi-0.45b0 opentelemetry-proto-1.24.0 opentelemetry-sdk-1.24.0 opentelemetry-semantic-conventions-0.45b0 opentelemetry-util-http-0.45b0 overrides-7.7.0 posthog-3.5.0 pulsar-client-3.4.0 pyasn1-0.6.0 pyasn1-modules-0.4.0 pypika-0.48.9 pyproject_hooks-1.0.0 pyreadline3-3.4.1 python-dotenv-1.0.1 requests-oauthlib-2.0.0 rsa-4.9 shellingham-1.5.4 starlette-0.36.3 tomli-2.0.1 typer-0.12.0 typer-cli-0.12.0 typer-slim-0.12.0 uvicorn-0.29.0 watchfiles-0.21.0 websocket-client-1.7.0 websockets-12.0
WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages)
!pip install openai
Requirement already satisfied: openai in e:\anaconda3\envs\openai\lib\site-packages (1.14.3) Requirement already satisfied: anyio<5,>=3.5.0 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (4.3.0) Requirement already satisfied: distro<2,>=1.7.0 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (1.9.0) Requirement already satisfied: httpx<1,>=0.23.0 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (0.27.0) Requirement already satisfied: pydantic<3,>=1.9.0 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (2.6.4) Requirement already satisfied: sniffio in e:\anaconda3\envs\openai\lib\site-packages (from openai) (1.3.1) Requirement already satisfied: tqdm>4 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (4.66.2) Requirement already satisfied: typing-extensions<5,>=4.7 in e:\anaconda3\envs\openai\lib\site-packages (from openai) (4.9.0) Requirement already satisfied: idna>=2.8 in e:\anaconda3\envs\openai\lib\site-packages (from anyio<5,>=3.5.0->openai) (3.6) Requirement already satisfied: exceptiongroup>=1.0.2 in e:\anaconda3\envs\openai\lib\site-packages (from anyio<5,>=3.5.0->openai) (1.2.0) Requirement already satisfied: certifi in e:\anaconda3\envs\openai\lib\site-packages (from httpx<1,>=0.23.0->openai) (2024.2.2) Requirement already satisfied: httpcore==1.* in e:\anaconda3\envs\openai\lib\site-packages (from httpx<1,>=0.23.0->openai) (1.0.5) Requirement already satisfied: h11<0.15,>=0.13 in e:\anaconda3\envs\openai\lib\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai) (0.14.0) Requirement already satisfied: annotated-types>=0.4.0 in e:\anaconda3\envs\openai\lib\site-packages (from pydantic<3,>=1.9.0->openai) (0.6.0) Requirement already satisfied: pydantic-core==2.16.3 in e:\anaconda3\envs\openai\lib\site-packages (from pydantic<3,>=1.9.0->openai) (2.16.3) Requirement already satisfied: colorama in e:\anaconda3\envs\openai\lib\site-packages (from tqdm>4->openai) (0.4.6)
WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages) WARNING: Ignoring invalid distribution -pype1 (c:\users\jyseo\appdata\roaming\python\python38\site-packages)
from langchain.document_loaders import TextLoader
documents = TextLoader("e:/data/AI.txt").load()
from langchain.text_splitter import RecursiveCharacterTextSplitter
# 문서를 청크로 분할
def split_docs(documents,chunk_size=1000,chunk_overlap=20):
text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
docs = text_splitter.split_documents(documents)
return docs
# docs 변수에 분할 문서를 저장
docs = split_docs(documents)
from langchain.embeddings import SentenceTransformerEmbeddings
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
# Chromdb에 벡터 저장
from langchain.vectorstores import Chroma
db = Chroma.from_documents(docs, embeddings)
e:\anaconda3\envs\openai\lib\site-packages\tqdm\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm
import os #운영체제(os) 모듈을 가져올 때 사용되는 라이브러리
os.environ["OPENAI_API_KEY"] = "sk-" #openai 키 입력
from langchain.chat_models import ChatOpenAI
model_name = "gpt-3.5-turbo" #GPT-3.5 turbo 모델 사용
llm = ChatOpenAI(model_name=model_name)
# Q&A 체인을 사용하여 쿼리에 대한 답변 얻기
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff",verbose=True)
# 쿼리를 작성하고 유사성 검색을 수행하여 답변을 생성,따라서 txt에 있는 내용을 질의해야 합니다
query = "AI란?"
matching_docs = db.similarity_search(query)
answer = chain.run(input_documents=matching_docs, question=query)
answer
e:\anaconda3\envs\openai\lib\site-packages\langchain_core\_api\deprecation.py:117: LangChainDeprecationWarning: The class `langchain_community.chat_models.openai.ChatOpenAI` was deprecated in langchain-community 0.0.10 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import ChatOpenAI`. warn_deprecated( Number of requested results 4 is greater than number of elements in index 3, updating n_results = 3 e:\anaconda3\envs\openai\lib\site-packages\langchain_core\_api\deprecation.py:117: LangChainDeprecationWarning: The function `run` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead. warn_deprecated(
> Entering new StuffDocumentsChain chain... > Entering new LLMChain chain... Prompt after formatting: System: Use the following pieces of context to answer the user's question. If you don't know the answer, just say that you don't know, don't try to make up an answer. ---------------- Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is a field of study in computer science that develops and studies intelligent machines. Such machines may be called AIs. AI technology is widely used throughout industry, government, and science. Some high-profile applications are: advanced web search engines (e.g., Google Search), recommendation systems (used by YouTube, Amazon, and Netflix), understanding human speech (such as Google Assistant, Siri, and Alexa), self-driving cars (e.g., Waymo), generative and creative tools (ChatGPT and AI art), and superhuman play and analysis in strategy games (such as chess and Go).[1] The various sub-fields of AI research are centered around particular goals and the use of particular tools. The traditional goals of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and support for robotics.[a] General intelligence (the ability to complete any task performable by a human) is among the field's long-term goals.[11] To solve these problems, AI researchers have adapted and integrated a wide range of problem-solving techniques, including search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics.[b] AI also draws upon psychology, linguistics, philosophy, neuroscience and other fields.[12] Alan Turing was the first person to carry out substantial research in the field that he called Machine Intelligence.[2] Artificial intelligence was founded as an academic discipline in 1956.[3] The field went through multiple cycles of optimism[4][5] followed by disappointment and loss of funding.[6][7] Funding and interest vastly increased after 2012 when deep learning surpassed all previous AI techniques,[8] and after 2017 with the transformer architecture.[9] This led to the AI spring of the 2020s, with companies, universities, and laboratories overwhelmingly based in the United States pioneering significant advances in artificial intelligence.[10] Human: AI란? > Finished chain. > Finished chain.
'AI는 인간이나 동물의 지능과는 달리 기계나 소프트웨어의 지능을 의미합니다. 컴퓨터 과학 분야에서 개발되고 연구되는 AI는 인공 지능이라고 불리기도 합니다. AI 기술은 산업, 정부, 과학 분야에서 널리 사용되며 구글 검색, 유튜브, 아마존, 넷플릭스의 추천 시스템, 구글 어시스턴트, 시리, 알렉사와 같은 음성 인식, 웨이모와 같은 자율 주행 자동차, ChatGPT와 AI 아트와 같은 창조적인 도구, 체스와 바둑과 같은 전략 게임에서의 초인간 수준의 플레이와 분석 등이 대표적인 응용 분야입니다. AI 연구의 여러 하위 분야는 특정한 목표와 도구의 사용을 중심으로 한다고 합니다. Alan Turing은 기계 지능이라고 부르는 분야에서 중요한 연구를 수행한 첫 번째 사람으로 인공 지능은 1956년 학문 분야로 성립되었습니다. 이후 2012년 이후 딥 러닝이 이전의 모든 AI 기술을 능가하면서 AI에 대한 흥미와 투자가 크게 증가했으며 2017년 이후 transformer 아키텍처가 등장하면서 AI 분야에서 큰 발전이 이루어졌습니다. 이로 인해 2020년대 AI 분야에서 미국을 중심으로 한 기업, 대학, 연구소가 중요한 발전을 이뤄내고 있습니다.'