SDK Documentation

This is the documentation for Whitebox's SDK. For an interactive experience, you can experiment with the SDK's Jupyter notebooks.

Models

create_model(name, type, prediction, labels=None, description="")

Creates a model in the database. This model record acts as a placeholder for the actual model's metadata.

Parameter	Type	Description
name	`str`	The name of the model.
type	`str`	The model's type. Possible values: `binary`, `multi_class`, `regression`.
prediction	`str`	The prediction of the model.
labels	`Dict[str, int]`	The model's labels. Defaults to `None`.
description	`str`	The model's description. Defaults to an empty string `""`.

Info

Labels apply to every model type except `regression`.
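
The constraints in the table and note above can be checked before calling `create_model`. The following is a minimal sketch, not part of the SDK: `build_create_model_args` is a hypothetical helper, the example values are illustrative, and treating labels on a regression model as an error is an assumption (the SDK's actual behavior in that case is not stated here).

```python
from typing import Any, Dict, Optional

# Model types listed in the parameter table above.
MODEL_TYPES = {"binary", "multi_class", "regression"}


def build_create_model_args(
    name: str,
    type: str,
    prediction: str,
    labels: Optional[Dict[str, int]] = None,
    description: str = "",
) -> Dict[str, Any]:
    """Hypothetical helper: validate and bundle arguments for create_model."""
    if type not in MODEL_TYPES:
        raise ValueError(f"type must be one of {sorted(MODEL_TYPES)}, got {type!r}")
    # Assumption: labels passed for a regression model are rejected outright.
    if type == "regression" and labels is not None:
        raise ValueError("labels are not applicable to regression models")
    return {
        "name": name,
        "type": type,
        "prediction": prediction,
        "labels": labels,
        "description": description,
    }


# Illustrative values only; none of them come from the SDK documentation.
args = build_create_model_args(
    name="churn-model",
    type="binary",
    prediction="target",
    labels={"stay": 0, "churn": 1},
)
```

With an SDK client object at hand (assumed here to be called `wb`), the resulting dictionary could be passed on as `wb.create_model(**args)`.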

get_model(model_id)

Fetches the model with the specified ID from the database.

Parameter	Type	Description
model_id	`str`	The ID of the model.

delete_model(model_id)

Deletes the model with the specified ID from the database.

Parameter	Type	Description
model_id	`str`	The ID of the model.

Training Datasets

log_training_dataset(model_id, non_processed, processed)

Inserts a set of dataset rows into the database. Once the rows are saved successfully, the model training pipeline is triggered, and the trained model is saved in the `/models/<model_id>` folder under Whitebox's root directory, where `<model_id>` is your model's ID.

Parameter	Type	Description
model_id	`str`	The ID of the model.
non_processed	`pd.DataFrame`	The non-processed training dataset.
processed	`pd.DataFrame`	The processed training dataset.

Info

The non-processed and processed dataframes must have the same length.
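
As a rough illustration of the length requirement and the save location described above, the sketch below defines two hypothetical helpers that are not part of the SDK. `check_training_inputs` relies only on `len()`, which `pd.DataFrame` supports (it returns the row count), so plain lists can stand in for dataframes. `trained_model_dir` simply joins the root directory, `models`, and the model ID; the contents of that folder are not specified here.

```python
from pathlib import Path
from typing import Sized


def check_training_inputs(non_processed: Sized, processed: Sized) -> int:
    """Hypothetical pre-check: both datasets must have the same number of rows."""
    if len(non_processed) != len(processed):
        raise ValueError(
            f"length mismatch: non_processed has {len(non_processed)} rows, "
            f"processed has {len(processed)}"
        )
    return len(processed)


def trained_model_dir(whitebox_root: str, model_id: str) -> Path:
    """Folder where the trained model is expected, per the description above."""
    return Path(whitebox_root) / "models" / model_id
```

Running the check before `log_training_dataset` surfaces a mismatch locally instead of waiting for the call to fail.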

Inferences

log_inferences(model_id, non_processed, processed, timestamps, actuals=None)

Inserts a set of inference rows into the database.

Parameter	Type	Description
model_id	`str`	The ID of the model.
non_processed	`pd.DataFrame`	The non-processed inferences.
processed	`pd.DataFrame`	The processed inferences.
timestamps	`pd.Series`	The timestamps for each inference row in the inference dataframes.
actuals	`pd.Series`	The actuals for each inference row in the inference dataframes. Defaults to `None`.

Info

The non-processed and processed dataframes, along with the timestamps and actuals series, must ALL have the same length.
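
The rule above covers up to four inputs, one of which is optional. A minimal sketch of a local pre-check follows; `check_inference_inputs` is a hypothetical helper rather than an SDK function, and it uses only `len()`, which both `pd.DataFrame` and `pd.Series` support.

```python
from typing import Dict, Optional, Sized


def check_inference_inputs(
    non_processed: Sized,
    processed: Sized,
    timestamps: Sized,
    actuals: Optional[Sized] = None,
) -> int:
    """Hypothetical pre-check: every supplied input must have the same length."""
    lengths: Dict[str, int] = {
        "non_processed": len(non_processed),
        "processed": len(processed),
        "timestamps": len(timestamps),
    }
    # actuals is optional, so it is only compared when it is provided.
    if actuals is not None:
        lengths["actuals"] = len(actuals)
    if len(set(lengths.values())) > 1:
        raise ValueError(f"inputs differ in length: {lengths}")
    return lengths["processed"]
```

The error message lists the length of each input by name, which makes it easier to see which one is out of step.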

get_xai_row(inference_row_id)

Produces an explainability report for a specific inference row.

Parameter	Type	Description
inference_row_id	`str`	The ID of the inference row.

Monitors

create_model_monitor(model_id, name, status, metric, severity, email, feature=None, lower_threshold=None)

Creates a monitor for a specific metric.

Parameter	Type	Description
model_id	`str`	The ID of the model.
name	`str`	The name of the monitor.
status	`MonitorStatus`	The status of the monitor. Possible values for `MonitorStatus`: `active`, `inactive`.
metric	`MonitorMetrics`	The metric that will be monitored. Possible values for `MonitorMetrics`: `accuracy`, `precision`, `recall`, `f1`, `r_square`, `mean_squared_error`, `mean_absolute_error`, `data_drift`, `concept_drift`.
severity	`AlertSeverity`	The severity of the alert the monitor produces. Possible values for `AlertSeverity`: `low`, `mid`, `high`.
email	`str`	The email to which the alert will be sent.
feature	`str`	The feature to be monitored. Defaults to `None`.
lower_threshold	`float`	The threshold below which an alert will be produced. Defaults to `None`.

Note

Some metrics, such as data drift, do not use a threshold; for these, specify the feature to be monitored instead. In any case, `feature` and `lower_threshold` cannot both be `None`.
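
The allowed values and the rule in the note above can be sketched as a small validation step. `build_monitor_args` is a hypothetical helper, not an SDK function. It assumes the enumerated values are handled as plain strings; the SDK types them as `MonitorStatus`, `MonitorMetrics`, and `AlertSeverity`, and how those types are imported or converted is not covered here.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
STATUSES = {"active", "inactive"}
METRICS = {
    "accuracy", "precision", "recall", "f1", "r_square",
    "mean_squared_error", "mean_absolute_error",
    "data_drift", "concept_drift",
}
SEVERITIES = {"low", "mid", "high"}


def build_monitor_args(
    model_id: str,
    name: str,
    status: str,
    metric: str,
    severity: str,
    email: str,
    feature: Optional[str] = None,
    lower_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments for create_model_monitor."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    # feature and lower_threshold cannot both be None.
    if feature is None and lower_threshold is None:
        raise ValueError("provide feature, lower_threshold, or both")
    return {
        "model_id": model_id,
        "name": name,
        "status": status,
        "metric": metric,
        "severity": severity,
        "email": email,
        "feature": feature,
        "lower_threshold": lower_threshold,
    }
```

For example, a data drift monitor would pass `feature` and leave `lower_threshold` as `None`, while an accuracy monitor would typically pass `lower_threshold`.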

Metrics

get_drifting_metrics(model_id)

Fetches a model's drifting metric reports.

Parameter	Type	Description
model_id	`str`	The ID of the model.

get_descriptive_statistics(model_id)

Fetches a model's descriptive statistics reports.

Parameter	Type	Description
model_id	`str`	The ID of the model.

get_performance_metrics(model_id)

Fetches a model's performance metric reports.

Parameter	Type	Description
model_id	`str`	The ID of the model.
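
Since the three metric functions above take the same single argument, they can be called together. The sketch below shows one possible way to do that; `collect_reports` and the `MetricsClient` protocol are hypothetical names, and the return values are treated as opaque because their structure is not described in this section.

```python
from typing import Any, Dict, Protocol


class MetricsClient(Protocol):
    """Any object exposing the three metric functions documented above."""

    def get_drifting_metrics(self, model_id: str) -> Any: ...
    def get_descriptive_statistics(self, model_id: str) -> Any: ...
    def get_performance_metrics(self, model_id: str) -> Any: ...


def collect_reports(client: MetricsClient, model_id: str) -> Dict[str, Any]:
    """Hypothetical helper: fetch all three report types for one model."""
    return {
        "drifting_metrics": client.get_drifting_metrics(model_id),
        "descriptive_statistics": client.get_descriptive_statistics(model_id),
        "performance_metrics": client.get_performance_metrics(model_id),
    }
```

Passing the client in as an argument keeps the helper independent of how the SDK object is created.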