LightGBM integration API reference

API reference for the NeptuneCallback class of the Neptune-LightGBM integration.

You can use NeptuneCallback to capture model training metadata and log the model summary after training.

NeptuneCallback

Neptune callback for logging metadata during LightGBM model training.

The callback logs parameters, evaluation results, and info about the train_set:

  • feature names
  • number of data points (num_rows)
  • number of features (num_features)

Evaluation results are logged separately for each validation set in valid_sets. For example, with "metric": "logloss" and valid_names=["train", "valid"], two logs are created: train/logloss and valid/logloss.
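
For illustration, here is a minimal sketch of a training call that produces those two series (lgb_train, lgb_valid, and neptune_callback are assumed to exist; the callback is created as shown in the Example section below):

import lightgbm as lgb

params = {"objective": "binary", "metric": "logloss"}

# Two named validation sets, so the callback logs one series per set:
# train/logloss and valid/logloss.
gbm = lgb.train(
    params,
    train_set=lgb_train,
    valid_sets=[lgb_train, lgb_valid],
    valid_names=["train", "valid"],
    callbacks=[neptune_callback],
)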

The callback works with the lgbm.train() and lgbm.cv() functions, and with the scikit-learn API model.fit().
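
With the scikit-learn API, you pass the callback to fit() in the same way (a minimal sketch; X_train, y_train, X_valid, y_valid, and neptune_callback are assumed to exist):

from lightgbm import LGBMClassifier

model = LGBMClassifier()
# The callback also captures evaluation results from the scikit-learn interface.
model.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    callbacks=[neptune_callback],
)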

Parameters

  • run (Run or Handler, optional; default: None): Existing run reference, as returned by neptune.init_run(), or a namespace handler.
  • base_namespace (str, optional; default: "experiment"): Namespace under which all metadata logged by the Neptune callback will be stored.

Example

Create a Neptune run:

import neptune

run = neptune.init_run()

Instantiate the callback and pass it to the training function:

import lightgbm as lgb
from neptune.integrations.lightgbm import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)
gbm = lgb.train(params, ..., callbacks=[neptune_callback])
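
To store the callback's metadata somewhere other than the default namespace, you can set the base_namespace parameter described above, or pass a namespace handler as the run argument (the namespace names below are only illustrative):

# Write all callback metadata under "lgbm_training" instead of the default namespace
neptune_callback = NeptuneCallback(run=run, base_namespace="lgbm_training")

# Or pass a namespace handler, so the metadata is nested under that namespace
neptune_callback = NeptuneCallback(run=run["lgbm"])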

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # (1)
    project="ml-team/classification",  # (2)
)

  1. In the bottom-left corner, expand the user menu and select Get my API token.
  2. You can copy the path from the project details ( → Details & privacy).

If you haven't registered, you can log anonymously to a public project:

run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/quickstarts",
)

Make sure not to publish sensitive data through your code!

create_booster_summary()

Create a model summary after training that can be assigned to the run namespace.

Tip

To have all the information in a single run, you can log the summary to the same run that you used for logging model training.
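
For example, reusing the run and callback from the NeptuneCallback example above (a minimal sketch; params and lgb_train are assumed to exist):

from neptune.integrations.lightgbm import create_booster_summary

# Train with the callback, then attach the post-training summary to the same run
gbm = lgb.train(params, train_set=lgb_train, callbacks=[neptune_callback])
run["lgbm_summary"] = create_booster_summary(booster=gbm)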

Parameters

  • booster (lightgbm.Booster or lightgbm.LGBMModel): The trained LightGBM model.
  • log_importances (bool, default: True): Whether to log feature importance charts.
  • max_num_features (int, default: 10): Max number of top features to log on the importance charts. Works when log_importances is set to True. If None or <1, all features will be displayed. See lightgbm.plot_importance for details.
  • list_trees (list of int, default: None): Indices of the target trees to visualize. Works when log_trees is set to True.
  • log_trees_as_dataframe (bool, default: False): Whether to parse the model and log trees in CSV format. Works only for Booster objects. See lightgbm.Booster.trees_to_dataframe for details.
  • log_pickled_booster (bool, default: True): Whether to log the model as a pickled file.
  • log_trees (bool, default: False): Whether to log visualized trees. This requires the Graphviz library to be installed.
  • tree_figsize (int, default: 30): Controls the size of the visualized tree image. Increase this in case you work with large trees. Works when log_trees is set to True.
  • log_confusion_matrix (bool, default: False): Whether to log the confusion matrix. If set to True, you need to pass y_true and y_pred.
  • y_true (numpy.array, default: None): True labels on the test set. Needed only if log_confusion_matrix is set to True.
  • y_pred (numpy.array, default: None): Predictions on the test set. Needed only if log_confusion_matrix is set to True.
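
For example, to show the top 20 features on the importance charts and also log the parsed trees as a dataframe (a minimal sketch based on the parameters above; gbm is assumed to be a trained Booster):

run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    max_num_features=20,          # top 20 features on the importance charts
    log_trees_as_dataframe=True,  # Booster objects only
)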


Returns

dict with all metadata, which you can assign to the Neptune run:

run["booster_summary"] = create_booster_summary(...)

Examples

Initialize a Neptune run:

import neptune

run = neptune.init_run(project="workspace-name/project-name")  # (1)

  1. The full project name. For example, "ml-team/classification". You can copy the name from the project details ( → Details & privacy) or find a pre-filled project string in Experiments → Create a new run.

Train a LightGBM model and log the booster summary to Neptune:

from neptune.integrations.lightgbm import create_booster_summary

gbm = lgb.train(params, ...)
run["lgbm_summary"] = create_booster_summary(booster=gbm)

You can customize what to log:

run["lgbm_summary"] = create_booster_summary(
 booster=gbm,
 log_trees=True,
 list_trees=[0, 1, 2, 3, 4],
 log_confusion_matrix=True,
 y_pred=y_pred,
 y_true=y_test,
)

In order to log a confusion matrix, the predicted labels and ground truth are required:

import numpy as np

# Convert predicted class probabilities into class labels
y_pred = np.argmax(gbm.predict(X_test), axis=1)

run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    log_confusion_matrix=True,
    y_pred=y_pred,
    y_true=y_test,
)

See also

neptune-lightgbm repo on GitHub



This page is originally sourced from the legacy docs.