# LightGBM integration API reference
API reference for the NeptuneCallback class of the Neptune-LightGBM integration.
You can use `NeptuneCallback` to capture model training metadata and log the model summary after training.
## NeptuneCallback
Neptune callback for logging metadata during LightGBM model training.
The callback logs parameters, evaluation results, and info about the `train_set`:

- feature names
- number of data points (`num_rows`)
- number of features (`num_features`)
Evaluation results are logged separately for every dataset in `valid_sets`. For example, with `"metric": "logloss"` and `valid_names=["train", "valid"]`, two logs are created: `train/logloss` and `valid/logloss`.
The callback works with the `lgbm.train()` and `lgbm.cv()` functions, and with the scikit-learn API `model.fit()`.
### Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `run` | `Run` or `Handler`, optional | `None` | Existing run reference, as returned by `neptune.init_run()`, or a namespace handler. |
| `base_namespace` | `str`, optional | `"experiment"` | Namespace under which all metadata logged by the Neptune callback is stored. |
### Example
Create a Neptune run:
```python
import neptune

run = neptune.init_run()
```
Instantiate the callback and pass it to the training function:
```python
import lightgbm as lgb
from neptune.integrations.lightgbm import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)
gbm = lgb.train(params, ..., callbacks=[neptune_callback])
```
As a best practice, you should save your Neptune API token and project name as environment variables:
```bash
export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
```
Alternatively, you can pass the information when using a function that takes `api_token` and `project` as arguments:
```python
run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # your API token
    project="ml-team/classification",  # your full project name
)
```

- To find your API token: in the bottom-left corner of the Neptune app, expand the user menu and select **Get my API token**.
- To find the project name: you can copy the path from the project details (**→ Details & privacy**).
If you haven't registered, you can log anonymously to a public project:

```python
api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"
```
Make sure not to publish sensitive data through your code!
## create_booster_summary()
Create a model summary after training that can be assigned to the run namespace.
Tip: To have all the information in a single run, you can log the summary to the same run that you used for logging the model training.
### Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `booster` | `lightgbm.Booster` or `lightgbm.LGBMModel` | - | The trained LightGBM model. |
| `log_importances` | `bool` | `True` | Whether to log feature importance charts. |
| `max_num_features` | `int` | `10` | Max number of top features to log on the importance charts. Works when `log_importances` is set to `True`. If `None` or `<1`, all features are displayed. See `lightgbm.plot_importance` for details. |
| `list_trees` | `list` of `int` | `None` | Indices of the target trees to visualize. Works when `log_trees` is set to `True`. |
| `log_trees_as_dataframe` | `bool` | `False` | Whether to parse the model and log trees in CSV format. Works only for `Booster` objects. See `lightgbm.Booster.trees_to_dataframe` for details. |
| `log_pickled_booster` | `bool` | `True` | Whether to log the model as a pickled file. |
| `log_trees` | `bool` | `False` | Whether to log visualized trees. Requires the Graphviz library to be installed. |
| `tree_figsize` | `int` | `30` | Controls the size of the visualized tree image. Increase it when working with large trees. Works when `log_trees` is set to `True`. |
| `log_confusion_matrix` | `bool` | `False` | Whether to log the confusion matrix. If set to `True`, you must also pass `y_true` and `y_pred`. |
| `y_true` | `numpy.array` | `None` | True labels on the test set. Needed only if `log_confusion_matrix` is set to `True`. |
| `y_pred` | `numpy.array` | `None` | Predictions on the test set. Needed only if `log_confusion_matrix` is set to `True`. |
### Returns
`dict` with all metadata, which you can assign to the Neptune run:

```python
run["booster_summary"] = create_booster_summary(...)
```
### Examples
Initialize a Neptune run:
```python
import neptune

run = neptune.init_run(project="workspace-name/project-name")  # your full project name
```

- The full project name looks like `"ml-team/classification"`. You can copy it from the project details (**→ Details & privacy**), or find a pre-filled `project` string under **Experiments → Create a new run**.
Train a LightGBM model and log the booster summary to Neptune:
```python
import lightgbm as lgb
from neptune.integrations.lightgbm import create_booster_summary

gbm = lgb.train(params, ...)
run["lgbm_summary"] = create_booster_summary(booster=gbm)
```
You can customize what to log:
```python
run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    log_trees=True,
    list_trees=[0, 1, 2, 3, 4],
    log_confusion_matrix=True,
    y_pred=y_pred,
    y_true=y_test,
)
```
To log a confusion matrix, the predicted labels and the ground truth are required:
```python
import numpy as np

y_pred = np.argmax(gbm.predict(X_test), axis=1)

run["lgbm_summary"] = create_booster_summary(
    booster=gbm,
    log_confusion_matrix=True,
    y_pred=y_pred,
    y_true=y_test,
)
```
## See also

- neptune-lightgbm repo on GitHub