SDK Documentation
This is the documentation for Whitebox's SDK. For an interactive experience, you can experiment with the SDK's Jupyter notebooks.
Models
create_model(name, type, prediction, labels=None, description="")
Creates a model in the database. This model acts as a placeholder for all of the actual model's metadata.
| Parameter | Type | Description |
|---|---|---|
| name | str | The name of the model. |
| type | str | The model's type. Possible values: binary, multi_class, regression. |
| prediction | str | The prediction of the model. |
| labels | Dict[str, int] | The model's labels. Defaults to None. |
| description | str | The model's description. Defaults to an empty string "". |
Info
Labels apply to every model type except regression.
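As a minimal sketch of how the parameters above fit together, the helper below assembles the keyword arguments for create_model and checks the type value before the call. The helper name build_model_kwargs, the example values, and the client object in the final comment are all hypothetical; client stands for whatever configured SDK object exposes create_model in your setup. Rejecting labels for regression models is a stricter local check chosen for this sketch, not documented SDK behavior.

```python
from typing import Any, Dict, Optional

# Model types listed in the parameter table above.
ALLOWED_TYPES = {"binary", "multi_class", "regression"}


def build_model_kwargs(
    name: str,
    type: str,
    prediction: str,
    labels: Optional[Dict[str, int]] = None,
    description: str = "",
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect create_model arguments."""
    if type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}, got {type!r}")
    if type == "regression" and labels is not None:
        # Labels do not apply to regression models (local check, an assumption).
        raise ValueError("labels are not applicable to regression models")
    return {
        "name": name,
        "type": type,
        "prediction": prediction,
        "labels": labels,
        "description": description,
    }


# Example values are made up for illustration.
kwargs = build_model_kwargs(
    name="churn-classifier",
    type="binary",
    prediction="churn",
    labels={"stays": 0, "churns": 1},
    description="Predicts customer churn",
)

# With a configured SDK object (assumed here to be named `client`):
# model = client.create_model(**kwargs)
```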
get_model(model_id)
Fetches the model with the specified ID from the database.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
delete_model(model_id)
Deletes the model with the specified ID from the database.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
Training Datasets
log_training_dataset(model_id, non_processed, processed)
Inserts a set of dataset rows into the database. Once the rows are saved successfully, the model training pipeline is triggered, and the trained model is saved in the /models/your_model's_id folder of Whitebox's root directory.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
| non_processed | pd.DataFrame | The non-processed training dataset. |
| processed | pd.DataFrame | The processed training dataset. |
Info
The non-processed and processed dataframes must have the same length, that is, the same number of rows.
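The length requirement can be checked locally before anything is sent. The sketch below uses plain lists of row dictionaries as stand-ins for the two DataFrames so that it runs without pandas; len() on a pandas DataFrame also returns its row count, so the same check applies there. The helper name check_same_length, the sample rows, the model ID placeholder, and the client object in the final comment are hypothetical.

```python
from typing import Sized


def check_same_length(non_processed: Sized, processed: Sized) -> int:
    """Hypothetical pre-check: both datasets must have the same number of rows."""
    if len(non_processed) != len(processed):
        raise ValueError(
            f"length mismatch: non_processed has {len(non_processed)} rows, "
            f"processed has {len(processed)} rows"
        )
    return len(processed)


# Raw rows as they might look before feature engineering (made-up data).
raw_rows = [
    {"plan": "basic", "age": 31, "target": 0},
    {"plan": "pro", "age": 45, "target": 1},
]

# The same rows after a made-up encoding step: one processed row per raw row.
processed_rows = [
    {"plan_basic": 1, "plan_pro": 0, "age": 31, "target": 0},
    {"plan_basic": 0, "plan_pro": 1, "age": 45, "target": 1},
]

row_count = check_same_length(raw_rows, processed_rows)

# With pandas DataFrames and a configured SDK object (assumed to be `client`):
# client.log_training_dataset(
#     model_id="your-model-id",
#     non_processed=raw_df,
#     processed=processed_df,
# )
```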
Inferences
log_inferences(model_id, non_processed, processed, timestamps, actuals=None)
Inserts a set of inference rows into the database.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
| non_processed | pd.DataFrame | The non-processed inferences. |
| processed | pd.DataFrame | The processed inferences. |
| timestamps | pd.Series | The timestamps for each inference row in the inference dataframes. |
| actuals | pd.Series | The actuals for each inference row in the inference dataframes. Defaults to None. |
Info
The non-processed and processed dataframes, along with the timestamps and actuals series, must ALL have the same length.
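A small pre-check along these lines can catch a length mismatch before calling log_inferences. This is a sketch under assumptions: the helper name check_inference_lengths, the sample data, and the client object in the final comment are hypothetical, and plain lists stand in for the DataFrames and Series so the block runs without pandas. Because actuals is optional, it is only compared when it is provided.

```python
from datetime import datetime, timedelta
from typing import Optional, Sized


def check_inference_lengths(
    non_processed: Sized,
    processed: Sized,
    timestamps: Sized,
    actuals: Optional[Sized] = None,
) -> int:
    """Hypothetical pre-check: every provided input must have the same length."""
    lengths = {
        "non_processed": len(non_processed),
        "processed": len(processed),
        "timestamps": len(timestamps),
    }
    if actuals is not None:
        # actuals defaults to None, so it only takes part when supplied.
        lengths["actuals"] = len(actuals)
    if len(set(lengths.values())) != 1:
        raise ValueError(f"length mismatch: {lengths}")
    return lengths["processed"]


# Made-up inference data: three rows, one timestamp and one actual per row.
raw_rows = [{"plan": "basic"}, {"plan": "pro"}, {"plan": "basic"}]
processed_rows = [{"plan_pro": 0}, {"plan_pro": 1}, {"plan_pro": 0}]
start = datetime(2023, 1, 1)
timestamps = [start + timedelta(hours=i) for i in range(3)]
actuals = [0, 1, 0]

row_count = check_inference_lengths(raw_rows, processed_rows, timestamps, actuals)

# With pandas objects and a configured SDK object (assumed to be `client`):
# client.log_inferences(
#     model_id="your-model-id",
#     non_processed=raw_df,
#     processed=processed_df,
#     timestamps=timestamps_series,
#     actuals=actuals_series,
# )
```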
get_xai_row(inference_row_id)
Produces an explainability report for a specific inference row.
| Parameter | Type | Description |
|---|---|---|
| inference_row_id | str | The ID of the inference row. |
Monitors
create_model_monitor(model_id, name, status, metric, severity, email, feature=None, lower_threshold=None)
Creates a monitor for a specific metric.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
| name | str | The name of the monitor. |
| status | MonitorStatus | The status of the monitor. Possible values for MonitorStatus: active, inactive. |
| metric | MonitorMetrics | The metric that will be monitored. Possible values for MonitorMetrics: accuracy, precision, recall, f1, r_square, mean_squared_error, mean_absolute_error, data_drift, concept_drift. |
| severity | AlertSeverity | The severity of the alert the monitor produces. Possible values for AlertSeverity: low, mid, high. |
| email | str | The email to which the alert will be sent. |
| feature | str | The feature to be monitored. Defaults to None. |
| lower_threshold | float | The threshold below which an alert will be produced. Defaults to None. |
Note
Some metrics, such as data drift, don't use a threshold; for these, provide the feature to be monitored instead. In any case, feature and lower_threshold can't both be None.
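The rule in the note can be sketched as a local check that runs before create_model_monitor is called. The enums below are local stand-ins that only list the documented values; the SDK's real MonitorStatus, MonitorMetrics, and AlertSeverity classes, their import path, and their member names are not shown in this documentation, so treat these definitions as assumptions. The helper name build_monitor_kwargs, the example values, and the client object in the final comment are likewise hypothetical.

```python
from enum import Enum
from typing import Any, Dict, Optional


# Local stand-ins for the SDK enums (assumed shape, documented values only).
class MonitorStatus(str, Enum):
    active = "active"
    inactive = "inactive"


class MonitorMetrics(str, Enum):
    accuracy = "accuracy"
    precision = "precision"
    recall = "recall"
    f1 = "f1"
    r_square = "r_square"
    mean_squared_error = "mean_squared_error"
    mean_absolute_error = "mean_absolute_error"
    data_drift = "data_drift"
    concept_drift = "concept_drift"


class AlertSeverity(str, Enum):
    low = "low"
    mid = "mid"
    high = "high"


def build_monitor_kwargs(
    model_id: str,
    name: str,
    status: MonitorStatus,
    metric: MonitorMetrics,
    severity: AlertSeverity,
    email: str,
    feature: Optional[str] = None,
    lower_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: enforce that feature and lower_threshold are not both None."""
    if feature is None and lower_threshold is None:
        raise ValueError("provide at least one of feature or lower_threshold")
    return {
        "model_id": model_id,
        "name": name,
        "status": status,
        "metric": metric,
        "severity": severity,
        "email": email,
        "feature": feature,
        "lower_threshold": lower_threshold,
    }


# A threshold-based monitor: alert when accuracy falls below 0.85 (made-up value).
accuracy_monitor = build_monitor_kwargs(
    "your-model-id", "accuracy check", MonitorStatus.active,
    MonitorMetrics.accuracy, AlertSeverity.high, "alerts@example.com",
    lower_threshold=0.85,
)

# A feature-based monitor: data drift on a made-up feature named "age".
drift_monitor = build_monitor_kwargs(
    "your-model-id", "age drift", MonitorStatus.active,
    MonitorMetrics.data_drift, AlertSeverity.mid, "alerts@example.com",
    feature="age",
)

# With a configured SDK object (assumed to be named `client`):
# client.create_model_monitor(**accuracy_monitor)
```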
Metrics
get_drifting_metrics(model_id)
Fetches a model's drifting metric reports.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
get_descriptive_statistics(model_id)
Fetches a model's descriptive statistics reports.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
get_performance_metrics(model_id)
Fetches a model's performance metric reports.
| Parameter | Type | Description |
|---|---|---|
| model_id | str | The ID of the model. |
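Since the three report functions above all take the same model_id, they can be gathered in one pass. The sketch below is illustrative only: collect_reports is a hypothetical helper, and _StubClient is a stand-in that lets the block run without a Whitebox server. The shape of the real return values is not described in this documentation, so the stub simply returns empty lists.

```python
from typing import Any, Dict


def collect_reports(client: Any, model_id: str) -> Dict[str, Any]:
    """Hypothetical helper: fetch all three report types for one model."""
    return {
        "drifting_metrics": client.get_drifting_metrics(model_id),
        "descriptive_statistics": client.get_descriptive_statistics(model_id),
        "performance_metrics": client.get_performance_metrics(model_id),
    }


class _StubClient:
    """Stand-in for a configured SDK object; return values are placeholders."""

    def get_drifting_metrics(self, model_id: str) -> list:
        return []

    def get_descriptive_statistics(self, model_id: str) -> list:
        return []

    def get_performance_metrics(self, model_id: str) -> list:
        return []


reports = collect_reports(_StubClient(), "your-model-id")
for report_name in reports:
    print(report_name)
```

Replacing _StubClient() with a configured SDK object would route the same three calls to the real service.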