SDK Documentation

This is the documentation for Whitebox's SDK. For an interactive experience, you can experiment with the SDK's Jupyter notebooks.

Models

create_model(name, type, prediction, labels=None, description="")

Creates a model in the database. This model serves as a placeholder for all of the actual model's metadata.

Parameter Type Description
name str The name of the model.
type str The model's type. Possible values: binary, multi_class, regression.
prediction str The prediction of the model.
labels Dict[str, int] The model's labels. Defaults to None.
description str The model's description. Defaults to an empty string "".

Info

Labels apply to binary and multi_class models. Regression is the only model type for which they are not applicable.
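The parameter rules above can be checked on the client side before the call is made. The following is a minimal sketch, not the SDK's own validation: `create_model_checked` and `RecordingClient` are hypothetical names introduced for illustration, and `client` stands for any object that exposes `create_model` with the signature documented above. Rejecting labels for regression models is an assumption drawn from the note above, not confirmed SDK behavior.

```python
from typing import Any, Dict, Optional

# Model types listed in the parameter table above.
ALLOWED_TYPES = {"binary", "multi_class", "regression"}


def create_model_checked(
    client: Any,
    name: str,
    type: str,
    prediction: str,
    labels: Optional[Dict[str, int]] = None,
    description: str = "",
) -> Any:
    """Validate arguments locally, then forward them to client.create_model."""
    if type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}, got {type!r}")
    # Assumption: labels do not apply to regression, so reject them early.
    if type == "regression" and labels is not None:
        raise ValueError("labels are not applicable to regression models")
    return client.create_model(
        name=name,
        type=type,
        prediction=prediction,
        labels=labels,
        description=description,
    )


class RecordingClient:
    """Hypothetical stand-in for the SDK client; it only records calls."""

    def __init__(self) -> None:
        self.calls = []

    def create_model(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls.append(kwargs)
        return kwargs


client = RecordingClient()
created = create_model_checked(
    client,
    name="churn",
    type="binary",
    prediction="target",  # illustrative value only
    labels={"stay": 0, "leave": 1},
)
```

With a real SDK client in place of `RecordingClient`, the same wrapper would forward the validated arguments unchanged.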

get_model(model_id)

Fetches the model with the specified ID from the database.

Parameter Type Description
model_id str The ID of the model.

delete_model(model_id)

Deletes the model with the specified ID from the database.

Parameter Type Description
model_id str The ID of the model.

Training Datasets

log_training_dataset(model_id, non_processed, processed)

Inserts a set of dataset rows into the database. Once the rows are saved successfully, the model training pipeline is triggered. The trained model is then saved in the /models/your_model's_id folder of Whitebox's root directory.

Parameter Type Description
model_id str The ID of the model.
non_processed pd.DataFrame The non-processed training dataset.
processed pd.DataFrame The processed training dataset.

Info

The non-processed and processed dataframes must have the same length, that is, the same number of rows.
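The length requirement can be verified before any rows are sent. The sketch below is a hedged example: `log_training_dataset_checked` and `RecordingClient` are hypothetical helpers, not part of the SDK, and plain lists of dicts stand in for the two dataframes so that the example runs without pandas. Because `len()` of a `pd.DataFrame` is its row count, the same check applies to real dataframes.

```python
from typing import Any


def log_training_dataset_checked(
    client: Any, model_id: str, non_processed: Any, processed: Any
) -> Any:
    """Check the length rule locally, then forward to client.log_training_dataset."""
    # len() of a pd.DataFrame is its number of rows, so DataFrames work here too.
    if len(non_processed) != len(processed):
        raise ValueError(
            f"non_processed has {len(non_processed)} rows, "
            f"processed has {len(processed)} rows"
        )
    return client.log_training_dataset(
        model_id=model_id, non_processed=non_processed, processed=processed
    )


class RecordingClient:
    """Hypothetical stand-in for the SDK client; it only records calls."""

    def __init__(self) -> None:
        self.logged = []

    def log_training_dataset(self, **kwargs: Any) -> int:
        self.logged.append(kwargs)
        return len(kwargs["processed"])


# Stand-ins for the non-processed and processed dataframes.
raw_rows = [{"city": "Athens"}, {"city": "Patras"}]
encoded_rows = [{"city": 0}, {"city": 1}]

client = RecordingClient()
rows_sent = log_training_dataset_checked(
    client, "model-id-placeholder", raw_rows, encoded_rows
)
```

Failing fast on the client keeps a mismatched dataset from being sent at all.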

Inferences

log_inferences(model_id, non_processed, processed, timestamps, actuals=None)

Inserts a set of inference rows into the database.

Parameter Type Description
model_id str The ID of the model.
non_processed pd.DataFrame The non-processed inferences.
processed pd.DataFrame The processed inferences.
timestamps pd.Series The timestamps for each inference row in the inference dataframes.
actuals pd.Series The actuals for each inference row in the inference dataframes. Defaults to None.

Info

The non-processed and processed dataframes, along with the timestamps and actuals series, must all have the same length.
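Since `actuals` is optional, the length rule covers three or four inputs depending on the call. A minimal sketch of such a check follows; `common_inference_length` is a hypothetical helper, not an SDK function, and lists stand in for the dataframes and series (both of which support `len()`).

```python
from typing import Any, Optional


def common_inference_length(
    non_processed: Any,
    processed: Any,
    timestamps: Any,
    actuals: Optional[Any] = None,
) -> int:
    """Return the shared length of all inputs, or raise if they differ."""
    lengths = {
        "non_processed": len(non_processed),
        "processed": len(processed),
        "timestamps": len(timestamps),
    }
    # actuals defaults to None and is only checked when it is provided.
    if actuals is not None:
        lengths["actuals"] = len(actuals)
    if len(set(lengths.values())) != 1:
        raise ValueError(f"length mismatch: {lengths}")
    return lengths["non_processed"]


# Stand-ins for the inference dataframes and series.
rows = [{"age": 31}, {"age": 47}]
stamps = ["2023-01-01T10:00:00", "2023-01-01T10:05:00"]

n_with_actuals = common_inference_length(rows, rows, stamps, actuals=[0, 1])
n_without_actuals = common_inference_length(rows, rows, stamps)
```

Running this check before `log_inferences` surfaces the name and length of every input in a single error message.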

get_xai_row(inference_row_id)

Produces an explainability report for a specific inference row.

Parameter Type Description
inference_row_id str The ID of the inference row.

Monitors

create_model_monitor(model_id, name, status, metric, severity, email, feature=None, lower_threshold=None)

Creates a monitor for a specific metric.

Parameter Type Description
model_id str The ID of the model.
name str The name of the monitor.
status MonitorStatus The status of the monitor. Possible values for MonitorStatus: active, inactive.
metric MonitorMetrics The metric that will be monitored. Possible values for MonitorMetrics: accuracy, precision, recall, f1, r_square, mean_squared_error, mean_absolute_error, data_drift, concept_drift.
severity AlertSeverity The severity of the alert the monitor produces. Possible values for AlertSeverity: low, mid, high.
email str The email to which the alert will be sent.
feature str The feature to be monitored. Defaults to None.
lower_threshold float The threshold below which an alert will be produced. Defaults to None.

Note

Some metrics, such as data drift, don't use a threshold, so the feature to be monitored must be provided instead. In any case, feature and lower_threshold can't both be None.
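The constraint on `feature` and `lower_threshold` can be expressed as a small argument builder. The sketch below rests on stated assumptions: the three enums are local stand-ins that mirror the values listed in the table above (the SDK's own enum classes and their import path are not shown here), and `monitor_kwargs` is a hypothetical helper rather than an SDK function.

```python
from enum import Enum
from typing import Any, Dict, Optional


# Local stand-ins mirroring the values listed in the parameter table.
class MonitorStatus(str, Enum):
    active = "active"
    inactive = "inactive"


class MonitorMetrics(str, Enum):
    accuracy = "accuracy"
    precision = "precision"
    recall = "recall"
    f1 = "f1"
    r_square = "r_square"
    mean_squared_error = "mean_squared_error"
    mean_absolute_error = "mean_absolute_error"
    data_drift = "data_drift"
    concept_drift = "concept_drift"


class AlertSeverity(str, Enum):
    low = "low"
    mid = "mid"
    high = "high"


def monitor_kwargs(
    model_id: str,
    name: str,
    status: MonitorStatus,
    metric: MonitorMetrics,
    severity: AlertSeverity,
    email: str,
    feature: Optional[str] = None,
    lower_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for create_model_monitor after a local check."""
    # Documented rule: feature and lower_threshold can't both be None.
    if feature is None and lower_threshold is None:
        raise ValueError("provide feature, lower_threshold, or both")
    return dict(
        model_id=model_id, name=name, status=status, metric=metric,
        severity=severity, email=email, feature=feature,
        lower_threshold=lower_threshold,
    )


# A drift monitor names a feature; a performance monitor sets a threshold.
drift_monitor = monitor_kwargs(
    "model-id-placeholder", "age drift", MonitorStatus.active,
    MonitorMetrics.data_drift, AlertSeverity.low, "ops@example.com",
    feature="age",
)
accuracy_monitor = monitor_kwargs(
    "model-id-placeholder", "accuracy floor", MonitorStatus.active,
    MonitorMetrics.accuracy, AlertSeverity.high, "ops@example.com",
    lower_threshold=0.8,
)
```

The resulting dictionaries could then be unpacked into `create_model_monitor(**drift_monitor)` on an SDK client.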

Metrics

get_drifting_metrics(model_id)

Fetches a model's drifting metric reports.

Parameter Type Description
model_id str The ID of the model.

get_descriptive_statistics(model_id)

Fetches a model's descriptive statistics reports.

Parameter Type Description
model_id str The ID of the model.

get_performance_metrics(model_id)

Fetches a model's performance metric reports.

Parameter Type Description
model_id str The ID of the model.
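The three metric getters take the same single argument, so they can be gathered in one call. This is a minimal sketch under stated assumptions: `collect_reports` and `StubClient` are hypothetical, the shape of each returned report is not specified above and is passed through untouched, and the stub's return values are placeholders rather than real report data.

```python
from typing import Any, Dict


def collect_reports(client: Any, model_id: str) -> Dict[str, Any]:
    """Fetch all three report types for one model and key them by kind."""
    return {
        "drifting_metrics": client.get_drifting_metrics(model_id),
        "descriptive_statistics": client.get_descriptive_statistics(model_id),
        "performance_metrics": client.get_performance_metrics(model_id),
    }


class StubClient:
    """Hypothetical stand-in for the SDK client with placeholder responses."""

    def get_drifting_metrics(self, model_id: str) -> list:
        return [("drift", model_id)]

    def get_descriptive_statistics(self, model_id: str) -> list:
        return [("stats", model_id)]

    def get_performance_metrics(self, model_id: str) -> list:
        return [("performance", model_id)]


reports = collect_reports(StubClient(), "model-id-placeholder")
```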