Create a DALEX explainer for h3sdm workflows

Creates a DALEX explainer for a species distribution model fitted with h3sdm_fit_model(). The function prepares the response and predictor variables and keeps every column used during model training (including h3_address and coordinates) in the explainer data. The resulting explainer can be used for feature importance, residual diagnostics, and other DALEX analyses.

Usage

h3sdm_explain(model, data, response = "presence", label = "h3sdm workflow")

Arguments

model: A fitted workflow returned by h3sdm_fit_model().
data: A data.frame or sf object containing the original predictors and the response variable. If data is an sf object, its geometry column is dropped automatically.
response: Character string giving the name of the response column in data. That column must be a binary factor or a numeric 0/1 vector. Defaults to "presence".
label: Character string specifying a label for the explainer. Defaults to "h3sdm workflow".
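
The data and response requirements above can be checked before calling h3sdm_explain(). The following is a minimal sketch, assuming the sf package is installed. The point data and column names are hypothetical, and sf::st_drop_geometry() is used only to illustrate what dropping the geometry amounts to; it is not confirmed to be the call h3sdm_explain() makes internally.

```r
library(sf)

# Hypothetical point data; coordinates and column names are illustrative only
pts <- st_as_sf(
  data.frame(
    lon = c(-84.1, -83.9, -84.0),
    lat = c(9.9, 10.1, 10.0),
    x1 = c(0.2, 1.1, -0.4),
    presence = c(1, 0, 1)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Dropping the geometry leaves a plain data.frame of predictors and response
tab <- st_drop_geometry(pts)

# The response column should be a binary factor or a numeric 0/1 vector
response_ok <- is.factor(tab$presence) || all(tab$presence %in% c(0, 1))
stopifnot(response_ok)
```

Running a check like this first makes a malformed response column fail early, with a clear error, rather than inside the explainer.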

Value

An object of class explainer from the DALEX package, ready to be used with feature_importance(), model_performance(), predict_parts(), and other DALEX functions.
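
As a rough illustration of how an explainer object is passed to other DALEX functions, the sketch below builds one directly with DALEX::explain() on a plain glm. The glm is only a stand-in so the block runs without h3sdm; an explainer returned by h3sdm_explain() would be passed to model_performance() and predict_parts() in the same way. The simulated data and the "stand-in glm" label are assumptions made for this example.

```r
library(DALEX)

set.seed(1)

# Simulated predictors and a 0/1 response (illustrative data only)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
dat$presence <- rbinom(50, 1, plogis(dat$x1))

# Stand-in model and explainer; h3sdm_explain() would take this role
fit <- glm(presence ~ x1 + x2, data = dat, family = binomial())
explainer <- explain(
  fit,
  data = dat[, c("x1", "x2")],
  y = dat$presence,
  label = "stand-in glm",
  verbose = FALSE
)

# Global performance measures for the fitted model
perf <- model_performance(explainer)

# Break-down attribution for a single observation
parts <- predict_parts(explainer, new_observation = dat[1, c("x1", "x2")])
```

Printing perf or parts, or calling plot() on either, shows the corresponding DALEX summary.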

Examples

# \donttest{
library(h3sdm)
library(DALEX)
#> Welcome to DALEX (version: 2.5.3).
#> Find examples and detailed introduction at: http://ema.drwhy.ai/
library(parsnip)

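# Simulated predictors and a binary presence/absence response
dat <- data.frame(
  x1 = rnorm(20),
  x2 = rnorm(20),
  presence = factor(sample(0:1, 20, replace = TRUE))
)

# Fit a logistic regression with parsnip
model <- logistic_reg() |>
  fit(presence ~ x1 + x2, data = dat)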

explainer <- h3sdm_explain(model, data = dat, response = "presence")
#> Preparation of a new explainer is initiated
#>   -> model label       :  h3sdm workflow 
#>   -> data              :  20  rows  2  cols 
#>   -> target variable   :  20  values 
#>   -> predict function  :  custom_predict 
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package parsnip , ver. 1.3.3 , task classification (  default  ) 
#>   -> predicted values  :  numerical, min =  0.2165252 , mean =  0.5 , max =  0.7312219  
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -0.6924869 , mean =  -1.839084e-14 , max =  0.5992426  
#>   A new explainer has been created!  
feature_importance(explainer)
#>       variable mean_dropout_loss          label
#> 1 _full_model_             0.390 h3sdm workflow
#> 2           x1             0.406 h3sdm workflow
#> 3           x2             0.554 h3sdm workflow
#> 4   _baseline_             0.480 h3sdm workflow
# }