pip install -U whylogs pandas
Requirement already satisfied: whylogs in /Users/andy/miniconda3/envs/whylogs/lib/python3.8/site-packages (0.6.5)
Requirement already satisfied: pandas in /Users/andy/miniconda3/envs/whylogs/lib/python3.8/site-packages (1.3.3)
Note: you may need to restart the kernel to use updated packages.
import whylogs
import pandas as pd
The example data comes from our public S3 bucket. You can use your own data instead, as long as you have multiple batches of it.
pdfs = []
for i in range(1, 8):
    path = f"https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_{i}.csv"
    print(f"Loading data from {path}")
    df = pd.read_csv(path)
    pdfs.append(df)
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_1.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_2.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_3.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_4.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_5.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_6.csv
Loading data from https://whylabs-public.s3.us-west-2.amazonaws.com/demo_batches/input_batch_7.csv
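If you bring your own data as a single table rather than as separate files, one way to get multiple batches is to cut it into contiguous row ranges. The helper below is a minimal sketch under that assumption; `batch_bounds` and `my_df` are hypothetical names that do not appear in the original notebook, and the helper uses only the standard library so it can be checked without downloading anything.

```python
from typing import List, Tuple


def batch_bounds(n_rows: int, n_batches: int) -> List[Tuple[int, int]]:
    """Return (start, stop) row offsets splitting n_rows into n_batches
    contiguous, near-equal slices (earlier slices take the extra rows)."""
    if n_batches <= 0:
        raise ValueError("n_batches must be positive")
    base, extra = divmod(n_rows, n_batches)
    bounds = []
    start = 0
    for i in range(n_batches):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


# Hypothetical usage with your own DataFrame `my_df` (an assumption, not demo data):
#   pdfs = [my_df.iloc[a:b] for a, b in batch_bounds(len(my_df), 7)]
print(batch_bounds(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

If your data already carries a date column, grouping by that column is probably a better fit than equal-sized slices, since each batch is mapped to a date later on.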
pdfs[0].describe()
| | Unnamed: 0 | id | member_id | loan_amnt | funded_amnt | funded_amnt_inv | int_rate | installment | annual_inc | desc | ... | hardship_loan_status | orig_projected_additional_accrued_interest | hardship_payoff_balance_amount | hardship_last_payment_amount | debt_settlement_flag_date | settlement_status | settlement_date | settlement_amount | settlement_percentage | settlement_term |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 407.000000 | 4.070000e+02 | 0.0 | 407.000000 | 407.000000 | 407.000000 | 407.000000 | 407.000000 | 407.000000 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
mean | 12548.717445 | 1.158631e+08 | NaN | 14203.746929 | 14203.746929 | 14202.948403 | 13.514054 | 418.020344 | 78818.956069 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
std | 125.354772 | 1.207642e+06 | NaN | 9351.142374 | 9351.142374 | 9350.997874 | 5.446881 | 271.096531 | 55864.939403 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
min | 12325.000000 | 1.121538e+08 | NaN | 1000.000000 | 1000.000000 | 1000.000000 | 5.320000 | 34.220000 | 0.000000 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25% | 12442.500000 | 1.150769e+08 | NaN | 7000.000000 | 7000.000000 | 7000.000000 | 9.930000 | 235.580000 | 43325.000000 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
50% | 12550.000000 | 1.157004e+08 | NaN | 12000.000000 | 12000.000000 | 12000.000000 | 12.620000 | 357.250000 | 63300.000000 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
75% | 12653.500000 | 1.168245e+08 | NaN | 20000.000000 | 20000.000000 | 20000.000000 | 16.020000 | 553.515000 | 95000.000000 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
max | 12862.000000 | 1.181592e+08 | NaN | 40000.000000 | 40000.000000 | 40000.000000 | 30.990000 | 1417.710000 | 495000.000000 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
8 rows × 126 columns
whylogs, by default, does not send statistics to WhyLabs.
There are a few small setup steps. If you don't have an access key yet, please onboard with WhyLabs first.
WhyLabs only requires the whylogs API; your raw data never leaves your premises.
from whylogs.app import Session
from whylogs.app.writers import WhyLabsWriter
import os
import datetime
import getpass
# set your org-id here
print("Enter your WhyLabs Org ID")
os.environ["WHYLABS_DEFAULT_ORG_ID"] = input()
# set your API key here
print("Enter your WhyLabs API key")
os.environ["WHYLABS_API_KEY"] = getpass.getpass()
print("Using API Key ID: ", os.environ["WHYLABS_API_KEY"][0:10])
Enter your WhyLabs Org ID
Enter your WhyLabs API key
Using API Key ID: naGzCisIJt
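When the notebook runs unattended (for example in a scheduled job), the interactive prompts above will block. A minimal sketch of one alternative is to reuse a value that is already in the environment and prompt only when it is missing. The helpers `ensure_env` and `key_id` and the variable `DEMO_API_KEY` are hypothetical names introduced for illustration; only `WHYLABS_DEFAULT_ORG_ID` and `WHYLABS_API_KEY` come from the notebook.

```python
import os
from typing import Callable


def ensure_env(name: str, prompt: Callable[[], str]) -> str:
    """Return os.environ[name]; call prompt() only if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        value = prompt().strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = value
    return value


def key_id(api_key: str, length: int = 10) -> str:
    """Return only the leading characters of the key, as the print above does."""
    return api_key[:length]


# Demo with a stand-in variable and a canned answer, so nothing blocks on input.
os.environ.pop("DEMO_API_KEY", None)
demo_key = ensure_env("DEMO_API_KEY", lambda: "abcdefghijKLMNOP")
print("Using API Key ID:", key_id(demo_key))  # Using API Key ID: abcdefghij

# In the notebook, the same helper could wrap the real prompts:
#   import getpass
#   ensure_env("WHYLABS_DEFAULT_ORG_ID", lambda: input("WhyLabs Org ID: "))
#   ensure_env("WHYLABS_API_KEY", lambda: getpass.getpass("WhyLabs API key: "))
```

Printing only a short prefix keeps the full key out of saved notebook output.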
Once the environment variables are set, let's create a whylogs session with a WhyLabs writer.
Note that you can also add a local writer or an S3 writer here. Check out the API docs for more information.
# create WhyLabs session
writer = WhyLabsWriter("", formats=[])
session = Session(project="demo-project", pipeline="demo-pipeline", writers=[writer])
Ensure you have a model ID (also called dataset ID) before you start!
If you omit the dataset_timestamp parameter, it'll default to UTC now.
The logger needs to be closed to flush out the data.
print("Enter your model ID from WhyLabs:")
model_id = input()
for i, df in enumerate(pdfs):
    # walking backwards. Each dataset has to map to a date to show up as a different batch
    # in WhyLabs
    dt = datetime.datetime.now(tz=datetime.timezone.utc) - datetime.timedelta(days=i)
    # Create new logger for date
    with session.logger(tags={"datasetId": model_id}, dataset_timestamp=dt) as ylog:
        print("Log data frame for ", dt)
        ylog.log_dataframe(df)
Enter your model ID from WhyLabs:
Log data frame for 2021-09-30 04:30:22.845881+00:00
Log data frame for 2021-09-29 04:30:25.273786+00:00
Using API key ID: naGzCisIJt
Log data frame for 2021-09-28 04:30:27.638109+00:00
Log data frame for 2021-09-27 04:30:29.872950+00:00
Log data frame for 2021-09-26 04:30:32.003965+00:00
Log data frame for 2021-09-25 04:30:33.789872+00:00
Log data frame for 2021-09-24 04:30:36.016256+00:00
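The date arithmetic in the loop above can be tried on its own, without a WhyLabs account. The sketch below mirrors the "walk backwards one day per batch" idea, with one deliberate difference: it reads the clock once, so consecutive timestamps are exactly one day apart instead of drifting by a few seconds as in the output above. `batch_timestamps` is a hypothetical helper, not part of whylogs.

```python
import datetime
from typing import List, Optional


def batch_timestamps(
    n_batches: int, now: Optional[datetime.datetime] = None
) -> List[datetime.datetime]:
    """Return one timezone-aware timestamp per batch, newest first, one day apart."""
    if now is None:
        now = datetime.datetime.now(tz=datetime.timezone.utc)
    return [now - datetime.timedelta(days=i) for i in range(n_batches)]


# A fixed reference time makes the result reproducible.
fixed = datetime.datetime(2021, 9, 30, 4, 30, tzinfo=datetime.timezone.utc)
for ts in batch_timestamps(3, now=fixed):
    print(ts.isoformat())
# 2021-09-30T04:30:00+00:00
# 2021-09-29T04:30:00+00:00
# 2021-09-28T04:30:00+00:00
```

Each value could then be passed as the dataset_timestamp argument in the loop, for instance by iterating over zip(pdfs, batch_timestamps(len(pdfs))).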
# Ensure everything is flushed
session.close()
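If an exception is raised while logging, the final close() call above is never reached. A minimal sketch of one way to guard against that uses contextlib.closing from the standard library, which calls close() on exit whether or not the body failed. `FakeSession` is a hypothetical stand-in used only so the example runs without whylogs; it is not the real session class.

```python
from contextlib import closing


class FakeSession:
    """Hypothetical stand-in for a session object; only close() matters here."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


demo_session = FakeSession()
try:
    # closing() guarantees demo_session.close() runs even though the body raises.
    with closing(demo_session):
        raise RuntimeError("simulated failure while logging")
except RuntimeError:
    pass

print(demo_session.closed)  # True
```

The same pattern should apply to any object that exposes a close() method, which the session in this notebook does.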