XGBoost integration API reference

API reference for the callback and utilities of the Neptune-XGBoost integration.

You can use the Neptune integration with XGBoost to capture model training metadata with NeptuneCallback.

NeptuneCallback

Neptune callback for logging metadata during XGBoost model training.

Prerequisites

This callback requires xgboost>=1.3.0.

The callback logs the following:

  • Metrics
  • The pickled model
  • Visualizations (feature importances and trees)
  • best_score and best_iteration, if early stopping is activated

The callback works with the xgboost.train() and xgboost.cv() functions, and with model.fit() from the scikit-learn API.

Metrics are logged for every dataset in the evals list and for every metric specified.

Example: With evals = [(dtrain, "train"), (dval, "valid")] and "eval_metric": ["mae", "rmse"], four metrics are created:

  1. "train/mae"
  2. "train/rmse"
  3. "valid/mae"
  4. "valid/rmse"
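The naming pattern above can be reproduced with a short sketch. The helper expected_metric_names is hypothetical and not part of the integration; it only illustrates how the dataset names from evals combine with the metric names. Placeholder strings stand in for the DMatrix objects.

```python
from itertools import product


def expected_metric_names(evals, eval_metrics):
    """Return one "<dataset>/<metric>" name per dataset and metric pair."""
    # Each evals entry is a (data, name) tuple; only the name matters here.
    dataset_names = [name for _, name in evals]
    return [f"{dataset}/{metric}" for dataset, metric in product(dataset_names, eval_metrics)]


# Placeholder strings stand in for real xgboost.DMatrix objects.
evals = [("dtrain-placeholder", "train"), ("dval-placeholder", "valid")]
names = expected_metric_names(evals, ["mae", "rmse"])
print(names)
```

Two datasets and two metrics give four names, matching the list above.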

Parameters

Name | Type | Default | Description
run | Run or Handler | - | An existing run reference, as returned by neptune.init_run(), or a namespace handler.
base_namespace | str, optional | "training" | Namespace under which all metadata logged by the Neptune callback will be stored.
log_model | bool | True | Whether to log the model as a pickled file at the end of training.
log_importance | bool | True | Whether to log feature importance charts at the end of training.
max_num_features | int | 10 | Maximum number of top features to show on the importance charts. Applies when log_importance is set to True. If None or <1, all features are displayed. For details, see xgboost.plot_importance().
log_tree | list of int | None | Indexes of target trees to log as charts. Requires the Graphviz library to be installed. For details, see xgboost.to_graphviz().
tree_figsize | int | 30 | Controls the size of the visualized tree image. Increase this if you work with large trees. Applies when log_tree is not None.
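As an illustration of the max_num_features rule, here is a minimal sketch. The helper select_top_features is hypothetical and the importance scores are made up; the actual charts are drawn by xgboost.plot_importance(), and this code only mirrors the "None or <1 means all features" behavior described for the parameter.

```python
def select_top_features(importances, max_num_features=10):
    """Return (feature, score) pairs sorted by score, highest first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    # None or a value below 1 means "keep every feature".
    if max_num_features is None or max_num_features < 1:
        return ranked
    return ranked[:max_num_features]


# Made-up importance scores, for illustration only.
scores = {"age": 12.0, "income": 30.0, "height": 4.0, "city": 7.5}
top_two = select_top_features(scores, max_num_features=2)
everything = select_top_features(scores, max_num_features=None)
print(top_two)
print(len(everything))
```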


Examples

Create a Neptune run:

import neptune

run = neptune.init_run()

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
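Before creating a run, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library; the helper missing_neptune_vars is hypothetical and not part of the Neptune API.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Set these environment variables first:", ", ".join(missing))
```

Passing a plain dict as environ makes the check easy to try without touching the real environment.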

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # (1)
    project="ml-team/classification",  # (2)
)

  1. In the bottom-left corner, expand the user menu and select Get my API token.
  2. You can copy the path from the project details (→ Details & privacy).

If you haven't registered, you can log anonymously to a public project:

api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"

Make sure not to publish sensitive data through your code!

Create a Neptune callback and pass it to xgb.train():

import xgboost as xgb
from neptune.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

xgb.train(..., callbacks=[neptune_callback])
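For a fuller picture, the following sketch shows one way the pieces might fit together end to end. It is an assumption-laden illustration, not the integration's official example: the function name train_with_neptune, the synthetic data, and the training parameters are all made up, and the imports sit inside the function so the sketch can be loaded without neptune, xgboost, or numpy installed.

```python
def train_with_neptune(num_boost_round=20):
    """Train a small regressor on synthetic data and log it to Neptune."""
    # Imported here so the sketch loads even if the packages are missing.
    import neptune
    import numpy as np
    import xgboost as xgb
    from neptune.integrations.xgboost import NeptuneCallback

    # Synthetic regression data, for illustration only.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)
    dtrain = xgb.DMatrix(X[:150], label=y[:150])
    dval = xgb.DMatrix(X[150:], label=y[150:])

    # Assumes NEPTUNE_API_TOKEN and NEPTUNE_PROJECT are set in the environment.
    run = neptune.init_run()
    neptune_callback = NeptuneCallback(run=run)

    params = {"objective": "reg:squarederror", "eval_metric": ["mae", "rmse"]}
    booster = xgb.train(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        evals=[(dtrain, "train"), (dval, "valid")],
        callbacks=[neptune_callback],
    )

    run.stop()  # Close the run once training has finished.
    return booster
```

Calling train_with_neptune() requires valid Neptune credentials and network access. With the default callback settings, the feature importance charts may additionally need a plotting dependency to be installed.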

When creating the callback, you can specify what you want to log and where:

neptune_callback = NeptuneCallback(
    run=run,
    base_namespace="experiment",
    log_model=False,
    log_tree=[0, 1, 2, 3],
)

See also

neptune-xgboost repo on GitHub



This page is originally sourced from the legacy docs.