SDK Documentation¶
This is the documentation for Whitebox's SDK. For an interactive experience, you can expirement with the SDK's Jupyter notebooks.
Models¶
create_model(name, type, prediction, labels=None, description="")
Creates a model in the database. This model works as placeholder for all the actual model's metadata.
Parameter | Type | Description |
---|---|---|
name | str |
The name of the model. |
type | str |
The model's type. Possible values: binary , multi_class , regression . |
prediction | str |
The prediction of the model. |
labels | Dict[str, int] |
The model's labels. Defaults to None . |
description | str |
The model's description. Defaults to an empty string "" . |
Info
Labels are not applicable ONLY in regression models.
get_model(model_id)
Fetches the model with the specified ID from the database.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
delete_model(model_id)
Deletes the model with the specified ID from the database.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
Training Datasets¶
log_training_dataset(model_id, non_processed, processed)
Inserts a set of dataset rows into the database. When the dataset rows are successfully saved, the pipeline for training the model is triggered. Then, the trained model is saved in the /models/your_model's_id
folder of whitebox's root directory.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
non_processed | pd.DataFrame |
The non processed training dataset. |
processed | pd.DataFrame |
The processed training dataset. |
Info
The non processed and processed dataframes must have the same length.
Inferences¶
log_inferences(model_id, non_processed, processed, timestamps, actuals=None)
Inserts a set of inference rows into the database.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
non_processed | pd.DataFrame |
The non processed inferences. |
processed | pd.DataFrame |
The processed inferences. |
timestamps | pd.Series |
The timestamps for each inference row in the inference dataframes. |
actuals | pd.Series |
The actuals for each inference row in the inference dataframes. Defaults to None . |
Info
The non processed and processed dataframes along with the timestamps and actuals series must ALL have the same length.
get_xai_row(inference_row_id)
Produces an explainability report for a specific inference row.
Parameter | Type | Description |
---|---|---|
inference_row_id | str |
The ID of the inference row. |
Monitors¶
create_model_monitor(model_id, name, status, metric, severity, email, feature=None, lower_threshold=None)
Creates a monitor for a specific metric.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
name | str |
The name of the monitor. |
status | MonitorStatus |
The status of the monitor. Possible values for MonitorStatus : active , inactive . |
metric | MonitorMetrics |
The metric that will be monitored. Possible values for MonitorMetrics : accuracy , precision , recall , f1 , r_square , mean_squared_error , mean_absolute_error , data_drift , concept_drift . |
severity | AlertSeverity |
The severity of the alert the monitor produces. Possible values for AlertSeverity : low , mid , high . |
str |
The email to which the alert will be sent. | |
feature | str |
The feature to be monitored. Defaults to None . |
lower_threshold | float |
The threshold below which an alert will be produced. Defaults to None . |
Note
Some metrics like the data drift don't use a threshold so the feature that will be monitored should be inserted. In any case, both feature
and lower_threshold
can't be None
at the same time.
Metrics¶
get_drifting_metrics(model_id)
Fetches a model's drifting metric reports.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
get_descriptive_statistics(model_id)
Fetches a model's descriptive statistics reports.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |
get_performance_metrics(model_id)
Fetches a model's performance metric reports.
Parameter | Type | Description |
---|---|---|
model_id | str |
The ID of the model. |