scikit-learn integration API reference
API reference for the Neptune integration with scikit-learn.
You can use the Neptune integration with scikit-learn to track your classifiers, regressors, and k-means clustering results.
create_regressor_summary()
Returns a scikit-learn regressor summary that includes:
- All regressor parameters
- Pickled estimator (model)
- Model performance visualizations
The regressor should be fitted before calling this function.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object. |
| X_train | ndarray | - | Training data matrix. |
| X_test | ndarray | - | Testing data matrix. |
| y_train | ndarray | - | The regression target for training. |
| y_test | ndarray | - | The regression target for testing. |
| nrows | int, optional | 1000 | Log the first `nrows` rows of test predictions. |
| log_charts | bool, optional | True | If True, calculate and log chart visualizations. This is equivalent to calling the `create_learning_curve_chart()`, `create_feature_importance_chart()`, `create_residuals_chart()`, `create_prediction_error_chart()`, and `create_cooks_distance_chart()` functions from this module. Note: Calculating visualizations is potentially expensive depending on input data and regressor, and may take some time to finish. |
Returns
`dict` with all metadata, which can be assigned to a run namespace:
run["summary"] = create_regressor_summary(...)
Example
# Create a run
import neptune
run = neptune.init_run()
# Log random forest regressor summary
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["random_forest/summary"] = npt_utils.create_regressor_summary(
rfr, X_train, X_test, y_train, y_test
)
create_classifier_summary()
Returns a scikit-learn classifier summary that includes:
- All classifier parameters
- Pickled estimator (model)
- Test prediction probabilities
- Model performance visualizations
The classifier should be fitted before calling this function.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object. |
| X_train | ndarray | - | Training data matrix. |
| X_test | ndarray | - | Testing data matrix. |
| y_train | ndarray | - | The classification target for training. |
| y_test | ndarray | - | The classification target for testing. |
| nrows | int, optional | 1000 | Log the first `nrows` rows of test predictions and prediction probabilities. |
| log_charts | bool, optional | True | If True, calculate and log chart visualizations. This is equivalent to calling the `create_classification_report_chart()`, `create_confusion_matrix_chart()`, `create_roc_auc_chart()`, `create_prediction_error_chart()`, `create_precision_recall_chart()`, and `create_class_prediction_error_chart()` functions from this module. Note: Calculating visualizations is potentially expensive depending on input data and classifier, and may take some time to finish. |
Returns
`dict` with all metadata, which can be assigned to the run namespace:
run["summary"] = create_classifier_summary(...)
Example
# Create a run
import neptune
run = neptune.init_run()
# Log random forest classifier summary
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["random_forest/summary"] = npt_utils.create_classifier_summary(
rfc, X_train, X_test, y_train, y_test
)
create_kmeans_summary()
Returns a scikit-learn k-means summary.
This method fits the k-means model to data and logs:
- All KMeans parameters
- Clustering visualizations: k-means elbow chart and silhouette coefficients chart
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | KMeans | - | KMeans object |
| X | ndarray | - | Training instances to cluster |
| nrows | int, optional | 1000 | Number of rows to log in the cluster labels |
| kwargs | - | - | KMeans parameters |
Returns
`dict` with all metadata, which can be assigned to a run namespace:
run["summary"] = create_kmeans_summary(...)
Example
# Create a run
import neptune
run = neptune.init_run()
# Log k-means summary
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
km = KMeans(n_init=11, max_iter=270)
X, y = make_blobs(n_samples=579, n_features=17, centers=7, random_state=28743)
import neptune.integrations.sklearn as npt_utils
run["kmeans/summary"] = npt_utils.create_kmeans_summary(km, X)
get_estimator_params()
Get estimator parameters.
Parameters
| Name | Type | Description |
|---|---|---|
| estimator | estimator | Scikit-learn estimator to log parameters for. |
Returns
`dict` with all parameters mapped to their values.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log estimator parameters
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
import neptune.integrations.sklearn as npt_utils
from neptune.utils import stringify_unsupported
run["estimator/params"] = stringify_unsupported(npt_utils.get_estimator_params(rfr))
get_pickled_model()
Get pickled estimator.
Parameters
| Name | Type | Description |
|---|---|---|
| estimator | estimator | Scikit-learn estimator to pickle. |
Returns
`File` value object with a pickled model that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log pickled model
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
import neptune.integrations.sklearn as npt_utils
run["estimator/pickled_model"] = npt_utils.get_pickled_model(rfr)
get_test_preds()
Get test predictions as a table.
If you pass `y_pred`, predictions are not computed from `X_test` data.
The estimator should be fitted before calling this function.
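To illustrate the `y_pred` behavior described above, here is a minimal sketch in plain Python. It is not the integration's code: `ConstantRegressor`, `preds_rows`, and the row layout are hypothetical stand-ins, chosen only to show that predictions are computed from `X_test` when `y_pred` is omitted, and that at most `nrows` rows are kept.

```python
class ConstantRegressor:
    """Hypothetical stand-in for a fitted estimator exposing predict()."""

    def __init__(self, value):
        self.value = value
        self.predict_calls = 0

    def predict(self, X):
        self.predict_calls += 1
        return [self.value for _ in X]


def preds_rows(estimator, X_test, y_test, y_pred=None, nrows=1000):
    """Build (y_true, y_pred) rows; hypothetical helper, not the library API."""
    if y_pred is None:
        # No predictions supplied, so compute them from the test data.
        y_pred = estimator.predict(X_test)
    rows = list(zip(y_test, y_pred))
    # Keep only the first `nrows` rows.
    return rows[:nrows]


model = ConstantRegressor(value=2.0)
X_test = [[0.0], [1.0], [2.0]]
y_test = [1.5, 2.0, 2.5]

# Predictions are computed by the estimator.
computed = preds_rows(model, X_test, y_test)
# Predictions are supplied, so predict() is not called again.
supplied = preds_rows(model, X_test, y_test, y_pred=[1.4, 2.1, 2.6], nrows=2)
```

Passing precomputed predictions can save time when the same predictions are reused across several logging calls.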
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| estimator | estimator | - | scikit-learn estimator to compute predictions. |
| X_test | ndarray | - | Testing data matrix. |
| y_test | ndarray | - | The regression target for testing. |
| y_pred | ndarray, optional | None | Estimator predictions on test data. |
| nrows | int, optional | 1000 | Number of rows to log. |
Returns
`File` value object with test predictions as a table that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log test predictions as a table
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["estimator/test_preds"] = npt_utils.get_test_preds(rfr, X_test, y_test)
get_test_preds_proba()
Get test prediction probabilities.
- If you pass `X_test`, prediction probabilities are computed from data.
- If you pass `y_pred_proba`, prediction probabilities are not computed from `X_test` data.
The estimator should be fitted before calling this function.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | scikit-learn classifier to compute prediction probabilities. |
| X_test | ndarray | - | Testing data matrix. |
| y_pred_proba | ndarray, optional | None | Classifier prediction probabilities on test data. |
| nrows | int, optional | 1000 | Number of rows to log. |
Returns
`File` value object with test prediction probabilities as a table that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log classifier test prediction probabilities
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["estimator/test_preds_proba"] = npt_utils.get_test_preds_proba(rfc, X_test)
get_scores()
Get estimator scores on `X`.
- If you pass `y_pred`, predictions are not computed from `X` and `y` data.
The estimator should be fitted before calling this function.
| Estimator | Logged scores |
|---|---|
| Single output regressors | Explained variance, max error, mean absolute error, R² |
| Multi output regressors | R² |
| Classifiers | Precision, recall, F-beta score, support |
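As a rough illustration of the single-output regression scores listed in the table, the sketch below computes them by hand in plain Python using their standard definitions. It is not the integration's implementation: the function name `regression_scores`, the dictionary keys, and the sample data are assumptions made for this example.

```python
def regression_scores(y_true, y_pred):
    """Compute common regression scores with plain Python (illustrative only)."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_res = sum(residuals) / n

    # Variance of the targets and of the residuals.
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n

    # Residual and total sums of squares for R².
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "explained_variance": 1 - var_res / var_true,
        "max_error": max(abs(r) for r in residuals),
        "mean_absolute_error": sum(abs(r) for r in residuals) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Made-up targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
scores = regression_scores(y_true, y_pred)
```

Explained variance and R² differ only when the residuals have a nonzero mean, which is the case for this sample data.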
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| estimator | estimator | - | scikit-learn estimator to compute scores. |
| X | ndarray | - | Data matrix. |
| y | ndarray | - | Target for testing. |
| y_pred | ndarray, optional | None | Estimator predictions on data. |
Returns
`dict` with scores.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log estimator scores
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["estimator/scores"] = npt_utils.get_scores(rfc, X, y)
create_learning_curve_chart()
Returns a learning curve chart.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object |
| X_train | ndarray | - | Training data matrix |
| y_train | ndarray | - | The regression target for training |
Returns
`File` value object with a learning curve chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a learning curve chart
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/learning_curve"] = npt_utils.create_learning_curve_chart(
rfr, X_train, y_train
)
create_feature_importance_chart()
Returns a feature importance chart.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object |
| X_train | ndarray | - | Training data matrix |
| y_train | ndarray | - | The regression target for training |
Returns
`File` value object with a feature importance chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a feature importance chart
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/feature_importance"] = npt_utils.create_feature_importance_chart(
rfr, X_train, y_train
)
create_residuals_chart()
Returns a residuals chart.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object |
| X_train | ndarray | - | Training data matrix |
| X_test | ndarray | - | Testing data matrix |
| y_train | ndarray | - | The regression target for training |
| y_test | ndarray | - | The regression target for testing |
Returns
`File` value object with a residuals chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a residuals chart
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/residuals"] = npt_utils.create_residuals_chart(
rfr, X_train, X_test, y_train, y_test
)
create_prediction_error_chart()
Returns a prediction error chart.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object |
| X_train | ndarray | - | Training data matrix |
| X_test | ndarray | - | Testing data matrix |
| y_train | ndarray | - | The regression target for training |
| y_test | ndarray | - | The regression target for testing |
Returns
`File` value object with a prediction error chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a prediction error chart
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/prediction_error"] = npt_utils.create_prediction_error_chart(
rfr, X_train, X_test, y_train, y_test
)
create_cooks_distance_chart()
Returns a Cook's distance chart.
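Cook's distance measures how much a fitted regression changes when a single observation is removed. As a hedged illustration of the quantity being charted, the sketch below computes it for a simple least-squares line in plain Python, using the standard formula D_i = e_i² / (p · MSE) · h_ii / (1 − h_ii)². The function `cooks_distance` and the data are assumptions for this example; the chart produced by the integration may be computed differently.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a simple least-squares line fit."""
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(r ** 2 for r in residuals) / (n - p)
    # Leverage of each point (diagonal of the hat matrix).
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    return [
        (r ** 2 / (p * mse)) * (h / (1 - h) ** 2)
        for r, h in zip(residuals, leverage)
    ]


# Made-up data where the last point is an outlier with high leverage.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 9.0]
distances = cooks_distance(x, y)
most_influential = distances.index(max(distances))
```

Points with a large distance relative to the rest are candidates for closer inspection, since they pull the fit more than the others.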
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| regressor | regressor | - | Fitted scikit-learn regressor object |
| X_train | ndarray | - | Training data matrix |
| y_train | ndarray | - | The regression target for training |
Returns
`File` value object with a Cook's distance chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a Cook's distance chart
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor()
rfr.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/cooks_distance"] = npt_utils.create_cooks_distance_chart(
rfr, X_train, y_train
)
create_classification_report_chart()
Returns a classification report chart.
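A classification report typically summarizes per-class precision, recall, F-score, and support. The following plain-Python sketch shows how those values can be derived from true and predicted labels. The helper `classification_report_rows`, its output layout, and the sample labels are hypothetical and only illustrate the metrics; they are not part of the integration.

```python
def classification_report_rows(y_true, y_pred, beta=1.0):
    """Per-class precision, recall, F-beta score, and support."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = beta ** 2 * precision + recall
        fbeta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "fbeta": fbeta,
            "support": tp + fn,  # number of true samples of this class
        }
    return report


# Made-up labels for three classes.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
report = classification_report_rows(y_true, y_pred)
```

With `beta=1.0` the F-beta score is the usual F1 score, the harmonic mean of precision and recall.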
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object |
| X_train | ndarray | - | Training data matrix |
| X_test | ndarray | - | Testing data matrix |
| y_train | ndarray | - | The classification target for training |
| y_test | ndarray | - | The classification target for testing |
Returns
`File` value object with a classification report chart that you can log to the run.
Example
# Create a run
import neptune
run = neptune.init_run()
# Log a classification report chart
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/cls_report"] = npt_utils.create_classification_report_chart(
rfc, X_train, X_test, y_train, y_test
)
create_confusion_matrix_chart()
Returns a confusion matrix chart.
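For readers unfamiliar with the underlying data, a confusion matrix counts how often each true class was predicted as each class. The sketch below builds one in plain Python, assuming the common convention that rows are true labels and columns are predicted labels. The function and the sample labels are illustrative assumptions, not the integration's code.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Made-up labels for three classes.
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
```

Correct predictions land on the diagonal, so off-diagonal cells show which classes are confused with each other.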
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object. |
| X_train | ndarray | - | Training data matrix. |
| X_test | ndarray | - | Testing data matrix. |
| y_train | ndarray | - | The classification target for training. |
| y_test | ndarray | - | The classification target for testing. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the chart:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/confusion_matrix"] = npt_utils.create_confusion_matrix_chart(
rfc, X_train, X_test, y_train, y_test
)
create_roc_auc_chart()
Returns a ROC-AUC chart.
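As background on the metric behind this chart, the area under the ROC curve for a binary problem equals the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score. The plain-Python sketch below computes it that way. The function `roc_auc` and the sample data are assumptions for illustration and are not how the integration draws the chart.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up binary labels and predicted scores for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
```

The pairwise form is quadratic in the number of samples, so it suits small examples rather than large test sets.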
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object. |
| X_train | ndarray | - | Training data matrix. |
| X_test | ndarray | - | Testing data matrix. |
| y_train | ndarray | - | The classification target for training. |
| y_test | ndarray | - | The classification target for testing. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the chart:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/roc_auc"] = npt_utils.create_roc_auc_chart(
rfc, X_train, X_test, y_train, y_test
)
create_precision_recall_chart()
Returns a precision-recall chart.
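A precision-recall chart plots precision against recall as the decision threshold on the predicted probability varies. The sketch below computes those points in plain Python for a binary problem. The helper `precision_recall_points` and the sample data are hypothetical, and the sketch assumes at least one positive sample.

```python
def precision_recall_points(y_true, y_score):
    """Return (threshold, precision, recall) for each distinct score."""
    points = []
    total_pos = sum(y_true)
    for threshold in sorted(set(y_score)):
        # Predict positive when the score reaches the threshold.
        predicted = [s >= threshold for s in y_score]
        tp = sum(1 for p, t in zip(predicted, y_true) if p and t == 1)
        fp = sum(1 for p, t in zip(predicted, y_true) if p and t == 0)
        points.append((threshold, tp / (tp + fp), tp / total_pos))
    return points


# Made-up binary labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
points = precision_recall_points(y_true, y_score)
```

Raising the threshold generally trades recall for precision, which is the trade-off the chart makes visible.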
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object. |
| X_test | ndarray | - | Testing data matrix. |
| y_test | ndarray | - | The classification target for testing. |
| y_pred_proba | ndarray, optional | None | Classifier prediction probabilities on test data. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the chart:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/precision_recall"] = npt_utils.create_precision_recall_chart(
rfc, X_test, y_test
)
create_class_prediction_error_chart()
Returns a class prediction error chart.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| classifier | classifier | - | Fitted scikit-learn classifier object. |
| X_train | ndarray | - | Training data matrix. |
| X_test | ndarray | - | Testing data matrix. |
| y_train | ndarray | - | The classification target for training. |
| y_test | ndarray | - | The classification target for testing. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the chart:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)
import neptune.integrations.sklearn as npt_utils
run["visuals/class_pred_error"] = npt_utils.create_class_prediction_error_chart(
rfc, X_train, X_test, y_train, y_test
)
get_cluster_labels()
Returns the index of the cluster that each sample belongs to.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | KMeans | - | KMeans object. |
| X | ndarray | - | Training instances to cluster. |
| nrows | int, optional | 1000 | Number of rows to log. |
| kwargs | - | - | KMeans parameters. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the labels:
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
km = KMeans(n_init=11, max_iter=270)
X, y = make_blobs(n_samples=579, n_features=17, centers=7, random_state=28743)
import neptune.integrations.sklearn as npt_utils
run["kmeans/cluster_labels"] = npt_utils.get_cluster_labels(km, X)
create_kelbow_chart()
Returns the K-elbow chart for the KMeans clusterer.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | KMeans | - | KMeans object. |
| X | ndarray | - | Training instances to cluster. |
| kwargs | - | - | KMeans parameters. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the chart:
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
km = KMeans(n_init=11, max_iter=270)
X, y = make_blobs(n_samples=579, n_features=17, centers=7, random_state=28743)
import neptune.integrations.sklearn as npt_utils
run["kmeans/kelbow"] = npt_utils.create_kelbow_chart(km, X)
create_silhouette_chart()
Returns the silhouette coefficient charts for the KMeans clusterer.
Charts are computed for j = 2, 3, ..., n_clusters.
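To make the range above concrete, and to show what a silhouette coefficient measures, here is a small plain-Python sketch. It lists the cluster counts j = 2, ..., `n_clusters` and computes the mean silhouette coefficient for one-dimensional points using the standard definition. The function `silhouette_score_1d` and the data are assumptions for illustration, the sketch requires at least two clusters, and it is not how the integration produces its charts.

```python
def silhouette_score_1d(points, labels):
    """Mean silhouette coefficient for 1-D points with given cluster labels."""
    scores = []
    for i, (xi, li) in enumerate(zip(points, labels)):
        # Distances to the other members of the same cluster.
        same = [
            abs(xi - xj)
            for j, (xj, lj) in enumerate(zip(points, labels))
            if lj == li and j != i
        ]
        if not same:
            scores.append(0.0)  # singleton clusters score 0 by convention
            continue
        a = sum(same) / len(same)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(xi - xj) for xj, lj in zip(points, labels) if lj == other)
            / labels.count(other)
            for other in set(labels)
            if other != li
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


n_clusters = 5
cluster_counts = list(range(2, n_clusters + 1))  # j = 2, 3, ..., n_clusters

# Made-up data: two well-separated groups.
points = [1.0, 2.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
score = silhouette_score_1d(points, labels)
```

Values close to 1 indicate compact, well-separated clusters, while values near 0 indicate overlapping clusters.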
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | KMeans | - | KMeans object. |
| X | ndarray | - | Training instances to cluster. |
| kwargs | - | - | KMeans parameters. |
Returns
`File` value object that you can log to the run.
Example
Create a run:
import neptune
run = neptune.init_run()
Log the charts:
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
km = KMeans(n_init=11, max_iter=270)
X, y = make_blobs(n_samples=579, n_features=17, centers=7, random_state=28743)
import neptune.integrations.sklearn as npt_utils
run["kmeans/silhouette"] = npt_utils.create_silhouette_chart(km, X, n_clusters=12)
See also
neptune-sklearn on GitHub
Related Documentation
This page is originally sourced from the legacy docs.