Transactions allow several profiles to be committed to WhyLabs as a group. Let's start with some setup.
!pip install whylogs
import whylogs as why
from whylabs_client.api.transactions_api import TransactionsApi
from whylogs.core.schema import DatasetSchema
from whylogs.core.segmentation_partition import segment_on_column
from whylogs.api.writer.whylabs import WhyLabsWriter
from whylogs.api.writer.whylabs_transaction_writer import WhyLabsTransactionWriter
import os
from uuid import uuid4
from whylogs.datasets import Ecommerce
import numpy as np
import pandas as pd
from datetime import datetime, timedelta, timezone
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "org-XXX"
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "model-XXX"
os.environ["WHYLABS_API_KEY"] = "XXXX:org-XXX"
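Since the writer reads its settings from these three environment variables, a small guard can catch a missing value before any profile is logged. The sketch below is only an illustration: `missing_whylabs_vars` is a hypothetical helper, not part of whylogs, and it assumes nothing beyond the variable names set above.

```python
import os

# The three variable names set in the cell above.
REQUIRED_VARS = (
    "WHYLABS_DEFAULT_ORG_ID",
    "WHYLABS_DEFAULT_DATASET_ID",
    "WHYLABS_API_KEY",
)


def missing_whylabs_vars(environ=os.environ):
    """Hypothetical helper: return the required names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_whylabs_vars()
    if missing:
        print("Missing WhyLabs settings:", ", ".join(missing))
    else:
        print("All WhyLabs settings are present")
```

Passing a plain dict instead of `os.environ` makes the check easy to try out without touching the real environment.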
dataset = Ecommerce()
daily_batches = dataset.get_inference_data(number_batches=20)
list_daily_batches = list(daily_batches)
columns = ['product','sales_last_week','market_price','rating','category','output_discount','output_prediction','output_score']
df = list_daily_batches[0].data[columns]
df.head()
date | product | sales_last_week | market_price | rating | category | output_discount | output_prediction | output_score |
---|---|---|---|---|---|---|---|---|
2024-02-23 00:00:00+00:00 | 1-2-3 Noodles - Veg Masala Flavour | 2 | 12.0 | 4.200000 | Snacks and Branded Foods | 0 | 0 | 1.000000 |
2024-02-23 00:00:00+00:00 | Jaggery Powder - Organic, Sulphur Free | 1 | 280.0 | 3.996552 | Gourmet and World Food | 0 | 0 | 0.571833 |
2024-02-23 00:00:00+00:00 | Pudding - Assorted | 3 | 50.0 | 4.400000 | Gourmet and World Food | 0 | 1 | 0.600000 |
2024-02-23 00:00:00+00:00 | Perfectly Moist Dark Chocolate Fudge Cake Mix ... | 1 | 495.0 | 4.000000 | Gourmet and World Food | 0 | 1 | 0.517833 |
2024-02-23 00:00:00+00:00 | Pasta/Spaghetti Spoon - Nylon, Silicon Handle,... | 1 | 299.0 | 3.732046 | Kitchen, Garden and Pets | 1 | 1 | 0.950000 |
why.init(force_local=True)
Initializing session with config /root/.config/whylogs/config.ini ✅ Using session type: LOCAL. Profiles won't be uploaded or written anywhere automatically.
<whylogs.api.whylabs.session.session.LocalSession at 0x7f57fad06ec0>
writer = WhyLabsWriter()
WhyLabsWriter::start_transaction() signals the start of a transaction. Profiles sent to WhyLabs with WhyLabsWriter::write() during the transaction are uploaded to WhyLabs immediately, but won't be processed until WhyLabsWriter::commit_transaction() is called.
transaction_id = writer.start_transaction()
print(f"Started transaction {transaction_id}")
for i in range(5):
    batch_df = list_daily_batches[i].data[columns]
    profile = why.log(batch_df)
    # Backdate each batch by one more day so the profiles land on distinct days.
    timestamp = datetime.now(tz=timezone.utc) - timedelta(days=i+1)
    profile.set_dataset_timestamp(timestamp)
    status, id = writer.write(profile)
    print(status, id)
writer.commit_transaction()
print("Committing transaction")
Started transaction df4ff687-f881-4633-8a44-3d1c24f631d3
True log-v12vewf7Cu9j3aVV
True log-PWW3D23edKlU0aFt
True log-hDYs8dGamli2LHdq
True log-4JIe3jWBahpMou07
True log-2bmc0Rl3u4oGBIu8
Committing transaction
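The upload-now, process-on-commit behaviour described above can be pictured with a toy in-memory model. This is purely illustrative: `ToyTransactionStore` is a hypothetical stand-in written for this notebook, not a whylogs class, and it says nothing about how WhyLabs implements transactions. It only mirrors the three method names used above and the (status, id) pair returned by write().

```python
from uuid import uuid4


class ToyTransactionStore:
    """Hypothetical in-memory stand-in; it is not part of whylogs."""

    def __init__(self):
        self.staged = []        # uploaded, but not yet processed
        self.processed = []     # visible only after a commit
        self.transaction_id = None

    def start_transaction(self):
        self.transaction_id = str(uuid4())
        return self.transaction_id

    def write(self, profile):
        if self.transaction_id is None:
            raise RuntimeError("no transaction in progress")
        # The profile is stored right away, but only in the staging area.
        self.staged.append(profile)
        return True, f"log-{len(self.staged)}"

    def commit_transaction(self):
        if self.transaction_id is None:
            raise RuntimeError("no transaction in progress")
        # Everything staged becomes processed in a single step.
        self.processed.extend(self.staged)
        self.staged.clear()
        self.transaction_id = None


store = ToyTransactionStore()
store.start_transaction()
for name in ["day-1", "day-2", "day-3"]:
    store.write(name)

# Snapshots taken before the commit: three staged items, nothing processed.
staged_before_commit = list(store.staged)
processed_before_commit = list(store.processed)

store.commit_transaction()
```

Before the commit, all three items sit in `staged` and `processed` is empty; after it, the situation is reversed, which is the grouping effect a transaction is meant to provide.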
The WhyLabsTransactionWriter can be used as a context manager to simplify transaction error handling and ensure commit_transaction() is called.
timestamp = datetime.now(tz=timezone.utc) - timedelta(days=2)
timestamp
datetime.datetime(2024, 2, 20, 0, 14, 58, 753029, tzinfo=datetime.timezone.utc)
try:
    with WhyLabsTransactionWriter() as writer:
        print("Started transaction")
        for i in range(5):
            batch_df = list_daily_batches[i].data[columns]
            # Profile this batch; all five profiles share the timestamp set above.
            profile = why.log(batch_df)
            profile.set_dataset_timestamp(timestamp)
            status, id = writer.write(profile)
            print(status, id)
except Exception:
    print("Transaction failed")
else:
    # Runs only when the with-block exited without raising.
    print("Committed transaction")
Started transaction
True log-yaHvpXyNRO53ilWo
True log-Zsa0lbCCqjzzjLGJ
True log-pg57yHO6RuvO4Q8J
True log-FSYoOwtmE8x51xSr
True log-v3G6VyLUn1x1crVy
Committed transaction
If a write() call during the transaction fails (returns a False status), the transaction's commit fails and raises an exception.
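To make that failure behaviour concrete, here is a minimal sketch of a context manager that starts a transaction, tracks each write() status, and raises on exit instead of committing when any status was False. It is not WhyLabsTransactionWriter's implementation: `transaction`, `TransactionFailed`, and `StubWriter` are hypothetical names, and the stub merely mimics the start_transaction(), write(), and commit_transaction() calls shown earlier so the example runs offline.

```python
from contextlib import contextmanager


class TransactionFailed(Exception):
    """Hypothetical error type for this sketch."""


class StubWriter:
    """Hypothetical offline stand-in for a writer; not a whylogs class."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on
        self.committed = False

    def start_transaction(self):
        return "txn-1"

    def write(self, profile):
        # Report failure for the one profile we were told to reject.
        return profile != self.fail_on, f"log-{profile}"

    def commit_transaction(self):
        self.committed = True


@contextmanager
def transaction(writer):
    writer.start_transaction()
    statuses = []

    def write(profile):
        status, profile_id = writer.write(profile)
        statuses.append(status)
        return status, profile_id

    yield write
    # Reached only if the with-body finished without raising.
    if not all(statuses):
        raise TransactionFailed("at least one write() returned False")
    writer.commit_transaction()


ok_writer = StubWriter()
with transaction(ok_writer) as write:
    write("a")
    write("b")

bad_writer = StubWriter(fail_on="b")
failure = None
try:
    with transaction(bad_writer) as write:
        write("a")
        write("b")
except TransactionFailed as exc:
    failure = str(exc)
```

The caller only needs one try/except around the with-block, which is the simplification the context-manager form is meant to offer.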
Each segment in a segmented profile gets uploaded to WhyLabs in a separate S3 interaction. Segmented profiles can be sent as a transaction so that all the segments are committed to WhyLabs at once. In this case, the status returned from WhyLabsWriter::write() is the logical AND of the statuses of the individual segments, and the returned id lists the ids of all the segments.
schema = DatasetSchema(segments=segment_on_column("output_discount"))
profile = why.log(df, schema=schema)
with WhyLabsTransactionWriter() as writer:
    # One write() call uploads every segment of the profile.
    status, id = writer.write(profile)
    print(f"{status} {id}")
True log-Rhlr7KzY6pp7vla5; log-8ihsF7KAbhAfNFM6
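The way per-segment results collapse into a single (status, id) pair can be sketched in a few lines. This is an illustration rather than whylogs code: `combine_segment_results` is a hypothetical helper, the ids in the usage are made up, and the "; " separator is inferred from the output shown above.

```python
def combine_segment_results(results):
    """Hypothetical helper: merge per-segment (status, id) pairs into one pair."""
    statuses = [status for status, _ in results]
    ids = [segment_id for _, segment_id in results]
    # Logical AND of the statuses, ids joined in upload order.
    return all(statuses), "; ".join(ids)


all_ok = combine_segment_results([(True, "log-aaa"), (True, "log-bbb")])
one_failed = combine_segment_results([(True, "log-aaa"), (False, "log-bbb")])
print(all_ok)
print(one_failed)
```

A single failed segment is enough to turn the combined status False, which in turn makes the enclosing transaction's commit fail.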