XGBoost integration API reference

API reference for the callback and utilities of the Neptune-XGBoost integration.

You can use the Neptune integration with XGBoost to capture model training metadata with NeptuneCallback.

NeptuneCallback

Neptune callback for logging metadata during XGBoost model training.

This callback requires xgboost>=1.3.0.

The callback logs the following:

Metrics
The pickled model
Visualizations (feature importances and trees)
If early stopping is activated, best_score and best_iteration are also logged.

The callback works with the xgboost.train() and xgboost.cv() functions, and with model.fit() from the scikit-learn API.

Metrics are logged for every dataset in the evals list and for every metric specified.

Example: With evals = [(dtrain, "train"), (dval, "valid")] and "eval_metric": ["mae", "rmse"], four metrics are created, one for each combination of dataset and metric: mae and rmse for train, and mae and rmse for valid.
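The count of four follows from pairing every dataset name with every metric. A minimal sketch of that enumeration is shown here; the "dataset/metric" key layout is an assumption made for illustration, and the exact field paths the callback creates under base_namespace may differ.

```python
from itertools import product

# Names given as the second element of each (DMatrix, name) pair in evals.
dataset_names = ["train", "valid"]
# Metrics listed under "eval_metric" in the training parameters.
eval_metrics = ["mae", "rmse"]

# One logged metric per (dataset, metric) pair.
# The "dataset/metric" format is illustrative, not the callback's guaranteed layout.
metric_keys = [
    f"{dataset}/{metric}"
    for dataset, metric in product(dataset_names, eval_metrics)
]
print(metric_keys)
```

Adding a third dataset or a third metric to either list would raise the number of logged series to six.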

Name	Type	Default	Description
run	Run or Handler	-	An existing run reference, as returned by neptune.init_run(), or a namespace handler.
base_namespace	str, optional	"training"	Namespace under which all metadata logged by the Neptune callback will be stored.
log_model	bool	True	Whether to log the model as a pickled file at the end of training.
log_importance	bool	True	Whether to log feature importance charts at the end of training.
max_num_features	int	10	Maximum number of top features to log on the importance charts. Works when log_importance is set to True. If None or <1, all features are displayed. For details, see xgboost.plot_importance().
log_tree	list of int	None	Indexes of target trees to log as charts. Requires the Graphviz library to be installed. For details, see xgboost.to_graphviz().
tree_figsize	int	30	Controls the size of the visualized tree image. Increase this if you work with large trees. Works when log_tree is not None.
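The max_num_features rule ("If None or <1, all features will be displayed") can be illustrated with a small standalone function. This is only a sketch of the documented rule, not the integration's or XGBoost's actual implementation; the function name top_features and the sample importance scores are hypothetical.

```python
def top_features(importances, max_num_features=10):
    """Return (feature, score) pairs sorted by score, highest first.

    Mirrors the documented rule: None or a value below 1 keeps every feature.
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    if max_num_features is None or max_num_features < 1:
        return ranked
    return ranked[:max_num_features]


# Hypothetical importance scores, for illustration only.
scores = {"age": 12.0, "income": 30.0, "tenure": 7.0}

print(top_features(scores, max_num_features=2))
print(top_features(scores, max_num_features=None))
```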

Create a Neptune run:

import neptune

run = neptune.init_run()

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"

export NEPTUNE_PROJECT="ml-team/classification"
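Before creating a run, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library; the helper name missing_neptune_vars is hypothetical and not part of the Neptune API.

```python
import os

# Environment variables named in the export commands above.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Neptune credentials found in the environment.")
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.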

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",
    project="ml-team/classification",
)

If you haven't registered, you can log anonymously to a public project:

api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"

Make sure not to publish sensitive data through your code!

Create a Neptune callback and pass it to xgb.train():

import xgboost as xgb
from neptune.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

xgb.train(..., callbacks=[neptune_callback])
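The elided arguments in the xgb.train() call above might be filled in along the following lines. This is a sketch under stated assumptions: the helper names make_params and train_with_neptune, the parameter values, and the number of boosting rounds are all illustrative choices, not taken from the integration's documentation. The xgboost and neptune imports are deferred into the training function so that the parameter helper can be used on its own.

```python
def make_params():
    """Illustrative training parameters; eval_metric matches the earlier example."""
    return {
        "objective": "reg:squarederror",
        "eval_metric": ["mae", "rmse"],
        "max_depth": 3,
    }


def train_with_neptune(run, X_train, y_train, X_val, y_val, num_boost_round=20):
    """Train a booster and log metadata to an existing Neptune run."""
    # Deferred imports: only needed when training is actually started.
    import xgboost as xgb
    from neptune.integrations.xgboost import NeptuneCallback

    dtrain = xgb.DMatrix(X_train, label=y_train)
    dval = xgb.DMatrix(X_val, label=y_val)

    neptune_callback = NeptuneCallback(run=run)

    # Each (DMatrix, name) pair in evals is evaluated with every metric
    # in eval_metric, and the callback logs the results.
    return xgb.train(
        params=make_params(),
        dtrain=dtrain,
        num_boost_round=num_boost_round,
        evals=[(dtrain, "train"), (dval, "valid")],
        callbacks=[neptune_callback],
    )
```

With a run created by neptune.init_run() and NumPy arrays for the features and labels, calling train_with_neptune(run, X_train, y_train, X_val, y_val) returns the trained booster; calling run.stop() afterwards closes the run.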

When creating the callback, you can specify what you want to log and where:

neptune_callback = NeptuneCallback(
    run=run,
    base_namespace="experiment",
    log_model=False,
    log_tree=[0, 1, 2, 3],
)