Prepares an sf object for use in a Species Distribution Model (SDM) workflow with the 'mgcv' GAM engine in the 'tidymodels' ecosystem. The function extracts centroid coordinates and assigns a role to every variable, including the response variable and the spatial coordinates.

Usage

h3sdm_recipe_gam(data, response_col = "presence")

Arguments

data

An sf object containing the response variable, environmental predictors, and geometry (e.g., H3 hexagon polygons).

response_col

character. Name of the column to use as the outcome (response variable). Defaults to "presence" for presence/absence models. Use "count" when working with count data generated by h3sdm_count_from_records().

Value

A recipe object of class h3sdm_recipe_gam, ready to be chained with additional preprocessing steps.
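
As a rough sketch of what chaining further preprocessing might look like, the example below uses a plain recipes::recipe() as a stand-in for the object returned by h3sdm_recipe_gam(). This assumes the returned object behaves like an ordinary recipe; the data frame and its values are hypothetical.

```r
library(recipes)

# Hypothetical stand-in data; in practice `rec` would come from h3sdm_recipe_gam()
set.seed(1)
df <- data.frame(
  presence     = factor(sample(0:1, 30, replace = TRUE)),
  bio1_temp    = runif(30, 15, 30),
  bio12_precip = runif(30, 500, 3000),
  x            = runif(30, -84.5, -83.5),
  y            = runif(30, 9.5, 10.5)
)

# Stand-in recipe: `presence` is the outcome, everything else a predictor
rec <- recipe(presence ~ ., data = df)

# Chain extra steps: scale the environmental predictors but leave the
# coordinates x and y untouched, then drop any zero-variance predictors
rec_scaled <- rec |>
  step_normalize(bio1_temp, bio12_precip) |>
  step_zv(all_predictors())

# Estimate the steps and apply them to the training data
prepped <- prep(rec_scaled, training = df)
baked   <- bake(prepped, new_data = NULL)
```

Leaving x and y unscaled here is a design choice for the sketch, not a requirement of the package: it keeps the coordinates in their original units.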

Details

Assigned Roles:

  • outcome: the column specified in response_col.

  • id: "h3_address" (cell identifier, not used for modeling).

  • predictor: all other variables, including the coordinates x and y, which feed the GAM spatial smooth term s(x, y, bs = "tp").
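
The role assignment above can be approximated with sf and recipes alone. The following is a minimal sketch under stated assumptions, not the package's actual implementation: it assumes the coordinates are stored in columns named x and y, uses point geometries as a stand-in for H3 polygons, and works on hypothetical toy data.

```r
library(sf)
library(recipes)

# Hypothetical toy data: points standing in for H3 cell geometries
set.seed(1)
n <- 10
cells <- sf::st_as_sf(
  data.frame(
    h3_address = paste0("hex_", seq_len(n)),
    presence   = factor(sample(0:1, n, replace = TRUE)),
    bio1_temp  = runif(n, 15, 30),
    lon        = runif(n, -84.5, -83.5),
    lat        = runif(n, 9.5, 10.5)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Extract centroid coordinates, then drop the geometry column
xy <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(cells)))
df <- sf::st_drop_geometry(cells)
df$x <- xy[, "X"]
df$y <- xy[, "Y"]

# One role per column: outcome, id, or predictor
roles <- ifelse(
  names(df) == "presence", "outcome",
  ifelse(names(df) == "h3_address", "id", "predictor")
)

rec <- recipes::recipe(df, vars = names(df), roles = roles)

# Inspect the variable/role table
summary(rec)
```

Because h3_address carries the id role, it stays in the data for bookkeeping but is not treated as a predictor.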

Examples

# \donttest{
  library(sf)
  library(recipes)

  set.seed(42)
  n <- 20

  pts <- sf::st_as_sf(
    data.frame(
      h3_address   = paste0("hex_", seq_len(n)),
      presence     = sample(0:1, n, replace = TRUE),
      count        = sample(0:9, n, replace = TRUE),
      bio1_temp    = runif(n, 15, 30),
      bio12_precip = runif(n, 500, 3000)
    ),
    geometry = sf::st_sfc(
      lapply(seq_len(n), function(i) {
        sf::st_point(c(runif(1, -84.5, -83.5), runif(1, 9.5, 10.5)))
      }),
      crs = 4326
    )
  )

  # Presence/absence model (default)
  gam_rec <- h3sdm_recipe_gam(pts)

  # Count-based model (stored under a separate name so the first recipe is kept)
  gam_rec_count <- h3sdm_recipe_gam(pts, response_col = "count")
# }
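
For context, here is a hedged sketch of how a recipe with x and y predictors might be paired with an 'mgcv' GAM specification in a tidymodels workflow. It uses a plain recipes::recipe() on hypothetical data rather than h3sdm_recipe_gam(), and the model formula (including bio1_temp as a linear term) is an illustrative assumption. Only the smooth term s(x, y, bs = "tp") comes from the documentation above.

```r
library(recipes)
library(parsnip)
library(workflows)

# Hypothetical data with enough rows for a 2-D thin plate smooth
set.seed(1)
n <- 200
df <- data.frame(
  presence  = factor(rbinom(n, 1, 0.5)),
  bio1_temp = runif(n, 15, 30),
  x         = runif(n, -84.5, -83.5),
  y         = runif(n, 9.5, 10.5)
)

# Stand-in recipe; in practice this would be the h3sdm_recipe_gam() result
rec <- recipe(presence ~ ., data = df)

# GAM specification using the mgcv engine
gam_spec <- gen_additive_mod(mode = "classification") |>
  set_engine("mgcv")

# The smooth terms are declared in the model formula passed to add_model()
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(gam_spec, formula = presence ~ s(x, y, bs = "tp") + bio1_temp)

gam_fit <- fit(wf, data = df)
```

The recipe decides which columns reach the model, while the formula given to add_model() decides how they enter it, which is why x and y need the predictor role for the spatial smooth to be usable.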