This vignette provides a comprehensive guide to using kerasnip to define sequential Keras models within the tidymodels ecosystem. kerasnip bridges the gap between the imperative, layer-by-layer construction of Keras models and the declarative, specification-based approach of tidymodels.
Here, we will focus on create_keras_sequential_spec(), which is ideal for models where layers form a plain stack, with each layer having exactly one input tensor and one output tensor.
We’ll start by loading the necessary packages:
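The set of packages below is an assumption based on the functions used in this vignette: we take keras3 to be the Keras backend, since it supplies keras_model_sequential() and the layer_*() functions, while tidymodels loads parsnip, dplyr, purrr, and tibble.
library(kerasnip)
library(tidymodels) # parsnip, dplyr, purrr, tibble, ...
library(keras3)     # assumed backend: keras_model_sequential(), layer_dense(), ...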
create_keras_sequential_spec()
A Sequential model in Keras is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor. kerasnip’s create_keras_sequential_spec() function is designed to define such models in a tidymodels-compatible way.
Instead of building the model layer-by-layer imperatively, you define a named, ordered list of R functions called layer_blocks. Each layer_block function takes a Keras model object as its first argument and returns the modified model. kerasnip then uses these blocks to construct the full Keras Sequential model.
For models with more complex, non-linear topologies (e.g., multiple inputs/outputs, residual connections, or multi-branch models), you should use create_keras_functional_spec().
kerasnip Sequential Model Specification
Let’s define a simple sequential model with three dense layers. First, we define our layer_blocks:
# The first block must initialize the model. `input_shape`
# is passed automatically.
input_block <- function(model, input_shape) {
  keras_model_sequential(input_shape = input_shape)
}

# A reusable block for hidden layers. `units` will become a tunable parameter.
hidden_block <- function(model, units = 32, activation = "relu") {
  model |> layer_dense(units = units, activation = activation)
}

# The output block. `num_classes` is passed automatically for classification.
output_block <- function(model, num_classes, activation = "softmax") {
  model |> layer_dense(units = num_classes, activation = activation)
}
Now, we use create_keras_sequential_spec() to generate our parsnip model specification function. We’ll name our model my_simple_mlp.
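The call below is a sketch of that step: two reusable hidden blocks plus the output block give the three dense layers, and the block names hidden_1 and hidden_2 are illustrative choices rather than names fixed by kerasnip.
# Sketch: register my_simple_mlp() as a parsnip model specification.
create_keras_sequential_spec(
  model_name = "my_simple_mlp",
  layer_blocks = list(
    input = input_block,
    hidden_1 = hidden_block,
    hidden_2 = hidden_block,
    output = output_block
  ),
  mode = "classification"
)

# Each block argument becomes a <block>_<argument> parameter on the spec,
# mirroring the conv1_filters pattern in the CNN example below:
mlp_spec <- my_simple_mlp(hidden_1_units = 64, hidden_2_units = 32)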
compile_keras_grid()
In the original Keras guide, a common workflow is to incrementally add layers and call summary() to inspect the architecture. With kerasnip, the model is defined declaratively, so we can’t inspect it layer-by-layer in the same way.
However, kerasnip provides a powerful equivalent: compile_keras_grid(). This function checks whether your layer_blocks define a valid Keras model and returns the compiled model structure, all without running a full training cycle. This is perfect for debugging your architecture.
Let’s see this in action with a CNN architecture:
# Define CNN layer blocks
cnn_input_block <- function(model, input_shape) {
  keras_model_sequential(input_shape = input_shape)
}

cnn_conv_block <- function(
  model,
  filters = 32,
  kernel_size = 3,
  activation = "relu"
) {
  model |>
    layer_conv_2d(
      filters = filters,
      kernel_size = kernel_size,
      activation = activation
    )
}

cnn_pool_block <- function(model, pool_size = 2) {
  model |> layer_max_pooling_2d(pool_size = pool_size)
}

cnn_flatten_block <- function(model) {
  model |> layer_flatten()
}

cnn_output_block <- function(model, num_classes, activation = "softmax") {
  model |> layer_dense(units = num_classes, activation = activation)
}
# Create the kerasnip spec function
create_keras_sequential_spec(
  model_name = "my_cnn",
  layer_blocks = list(
    input = cnn_input_block,
    conv1 = cnn_conv_block,
    pool1 = cnn_pool_block,
    flatten = cnn_flatten_block,
    output = cnn_output_block
  ),
  mode = "classification"
)
# Create a spec instance for a 28x28x1 image
cnn_spec <- my_cnn(
  conv1_filters = 32, conv1_kernel_size = 5,
  compile_loss = "categorical_crossentropy",
  compile_optimizer = "adam"
)
# Prepare dummy data with the correct shape.
# We create a list of 28x28x1 arrays.
x_dummy_list <- lapply(
  1:10,
  function(i) array(runif(28 * 28 * 1), dim = c(28, 28, 1))
)
x_dummy_df <- tibble::tibble(x = x_dummy_list)

y_dummy <- factor(sample(0:9, 10, replace = TRUE), levels = 0:9)
y_dummy_df <- tibble::tibble(y = y_dummy)
# Use compile_keras_grid to get the model summary
compilation_results <- compile_keras_grid(
  spec = cnn_spec,
  grid = tibble::tibble(),
  x = x_dummy_df,
  y = y_dummy_df
)

# Print the summary
compilation_results |>
  select(compiled_model) |>
  pull() |>
  pluck(1) |>
  summary()
#> Model: "sequential_83"
#> ┌───────────────────────────────────┬──────────────────────────┬───────────────┐
#> │ Layer (type)                      │ Output Shape             │       Param # │
#> ├───────────────────────────────────┼──────────────────────────┼───────────────┤
#> │ conv2d (Conv2D)                   │ (None, 24, 24, 32)       │           832 │
#> ├───────────────────────────────────┼──────────────────────────┼───────────────┤
#> │ max_pooling2d (MaxPooling2D)      │ (None, 12, 12, 32)       │             0 │
#> ├───────────────────────────────────┼──────────────────────────┼───────────────┤
#> │ flatten (Flatten)                 │ (None, 4608)             │             0 │
#> ├───────────────────────────────────┼──────────────────────────┼───────────────┤
#> │ dense_255 (Dense)                 │ (None, 10)               │        46,090 │
#> └───────────────────────────────────┴──────────────────────────┴───────────────┘
#> Total params: 46,922 (183.29 KB)
#> Trainable params: 46,922 (183.29 KB)
#> Non-trainable params: 0 (0.00 B)