| Title: | Bayesian Neural Network with 'Stan' |
|---|---|
| Description: | Offers a flexible formula-based interface for building and training Bayesian Neural Networks powered by 'Stan'. The package supports modeling complex relationships while providing rigorous uncertainty quantification via posterior distributions. With features such as user-chosen priors, clear predictions, and support for regression, binary, and multi-class classification, it is well suited to applications in clinical trials, finance, and other fields requiring robust Bayesian inference and decision-making. References: Neal (1996) <doi:10.1007/978-1-4612-0745-0>. |
| Authors: | Swarnendu Chatterjee [aut, cre, cph] |
| Maintainer: | Swarnendu Chatterjee <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-06-07 19:44:24 UTC |
| Source: | https://github.com/swarnendu-stat/bnns |
This is a generic function for fitting Bayesian Neural Network (BNN) models. It dispatches to methods based on the class of the input data.
bnns(
  formula, data, L = 1, nodes = rep(2, L), act_fn = rep(2, L), out_act_fn = 1,
  algorithm = c("NUTS", "HMC"), iter = 1000, warmup = 200, thin = 1,
  chains = 2, cores = 2, seed = 123, prior_weights = NULL, prior_bias = NULL,
  prior_sigma = NULL, verbose = FALSE, refresh = max(iter/10, 1),
  normalize = TRUE, backend = c("rstan", "cmdstanr"), use_gpu = FALSE,
  opencl_ids = c(0, 0), ...
)
formula |
A symbolic description of the model to be fitted. The formula should specify the response variable and predictors (e.g., y ~ -1 + x1 + x2). |
data |
A data frame containing the variables in the model. |
L |
An integer specifying the number of hidden layers in the neural network. Default is 1. |
nodes |
An integer or vector specifying the number of nodes in each hidden layer. If a single value is provided, it is applied to all layers. Default is rep(2, L), i.e. 2 nodes per layer. |
act_fn |
An integer or vector specifying the activation function(s) for the hidden layers. Options are:
|
out_act_fn |
An integer or character string specifying the activation function for the output layer. Options are:
|
algorithm |
A character string specifying the MCMC algorithm. Options are "NUTS" (the default) and "HMC". |
iter |
An integer specifying the total number of iterations for the Stan sampler. Default is 1000. |
warmup |
An integer specifying the number of warmup iterations for the Stan sampler. Default is 200. |
thin |
An integer specifying the thinning interval for Stan samples. Default is 1. |
chains |
An integer specifying the number of Markov chains. Default is 2. |
cores |
An integer specifying the number of CPU cores to use for parallel sampling. Default is 2. |
seed |
An integer specifying the random seed for reproducibility. Default is 123. |
prior_weights |
A list specifying the prior distribution for the weights in the neural network. The list must include two components:
For the
|
prior_bias |
A list specifying the prior distribution for the biases in the neural network. The list must include two components:
If
|
prior_sigma |
A list specifying the prior distribution for the
If
|
verbose |
TRUE or FALSE: flag indicating whether to print intermediate output from Stan on the console, which might be helpful for model debugging. |
refresh |
refresh (integer) can be used to control how often the progress of the sampling is reported (i.e. show the progress every refresh iterations). By default, refresh = max(iter/10, 1). The progress indicator is turned off if refresh <= 0. |
normalize |
Logical. If |
backend |
A character string specifying the Stan backend to use. Options are "rstan" (the default) and "cmdstanr". |
use_gpu |
Logical. If |
opencl_ids |
A vector of two integers specifying the OpenCL platform and device IDs. Default is c(0, 0). |
... |
Currently not in use. |
The function serves as a generic interface to different methods of fitting Bayesian Neural Networks. The specific method dispatched depends on the class of the input arguments, allowing for flexibility in the types of inputs supported.
The result of the method dispatched by the class of the input data. Typically, this would be an object of class "bnns" containing the fitted model and associated information.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Oxford University Press.
Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M.A., Guo, J., Li, P. and Riddell, A., 2017. Stan: A probabilistic programming language. Journal of Statistical Software, 76.
Neal, R.M., 2012. Bayesian Learning for Neural Networks (Vol. 118). Springer Science & Business Media.
# Example usage with formula interface:
data <- data.frame(x1 = runif(10), x2 = runif(10), y = rnorm(10))
model <- bnns(y ~ -1 + x1 + x2,
  data = data, L = 1, nodes = 2, act_fn = 1,
  iter = 1e1, warmup = 5, chains = 1
)
# See the documentation for bnns.default for more details on the default implementation.
These functions provide dials parameter objects for tuning
the hyperparameters of Bayesian Neural Networks.
L(range = c(1L, 5L), trans = NULL)
warmup(range = c(100L, 1000L), trans = NULL)
chains(range = c(1L, 4L), trans = NULL)
iter(range = c(500L, 2000L), trans = NULL)
nodes(range = c(1L, 64L), trans = NULL)
act_fn(values = c("tanh", "sigmoid", "softplus", "relu", "linear"))
range |
A two-element vector holding the defaults for the smallest and largest possible values. |
trans |
A |
values |
A character vector of possible values. |
A quant_param or qual_param object from the dials package.
Fits a Bayesian Neural Network (BNN) model using a formula interface. The function parses the formula and data to create the input feature matrix and target vector, then fits the model using bnns.default.
## Default S3 method:
bnns(
  formula, data, L = 1, nodes = rep(2, L), act_fn = rep(2, L), out_act_fn = 1,
  algorithm = c("NUTS", "HMC"), iter = 1000, warmup = 200, thin = 1,
  chains = 2, cores = 2, seed = 123, prior_weights = NULL, prior_bias = NULL,
  prior_sigma = NULL, verbose = FALSE, refresh = max(iter/10, 1),
  normalize = TRUE, backend = c("rstan", "cmdstanr"), use_gpu = FALSE,
  opencl_ids = c(0, 0), ...
)
formula |
A symbolic description of the model to be fitted. The formula should specify the response variable and predictors (e.g., y ~ -1 + x1 + x2). |
data |
A data frame containing the variables in the model. |
L |
An integer specifying the number of hidden layers in the neural network. Default is 1. |
nodes |
An integer or vector specifying the number of nodes in each hidden layer. If a single value is provided, it is applied to all layers. Default is rep(2, L), i.e. 2 nodes per layer. |
act_fn |
An integer or vector specifying the activation function(s) for the hidden layers. Options are:
|
out_act_fn |
An integer or character string specifying the activation function for the output layer. Options are:
|
algorithm |
A character string specifying the MCMC algorithm. Options are "NUTS" (the default) and "HMC". |
iter |
An integer specifying the total number of iterations for the Stan sampler. Default is 1000. |
warmup |
An integer specifying the number of warmup iterations for the Stan sampler. Default is 200. |
thin |
An integer specifying the thinning interval for Stan samples. Default is 1. |
chains |
An integer specifying the number of Markov chains. Default is 2. |
cores |
An integer specifying the number of CPU cores to use for parallel sampling. Default is 2. |
seed |
An integer specifying the random seed for reproducibility. Default is 123. |
prior_weights |
A list specifying the prior distribution for the weights in the neural network. The list must include two components:
For the
|
prior_bias |
A list specifying the prior distribution for the biases in the neural network. The list must include two components:
If
|
prior_sigma |
A list specifying the prior distribution for the
If
|
verbose |
TRUE or FALSE: flag indicating whether to print intermediate output from Stan on the console, which might be helpful for model debugging. |
refresh |
refresh (integer) can be used to control how often the progress of the sampling is reported (i.e. show the progress every refresh iterations). By default, refresh = max(iter/10, 1). The progress indicator is turned off if refresh <= 0. |
normalize |
Logical. If |
backend |
A character string specifying the Stan backend to use. Options are "rstan" (the default) and "cmdstanr". |
use_gpu |
Logical. If |
opencl_ids |
A vector of two integers specifying the OpenCL platform and device IDs. Default is c(0, 0). |
... |
Currently not in use. |
The function uses the provided formula and data to generate the design matrix for the predictors and the response vector. It then calls helper function bnns_train to fit the Bayesian Neural Network model.
An object of class "bnns" containing the fitted model and associated information, including:
fit: The fitted Stan model object.
data: A list containing the processed training data.
call: The matched function call.
formula: The formula used for the model.
# Example usage:
data <- data.frame(x1 = runif(10), x2 = runif(10), y = rnorm(10))
model <- bnns(y ~ -1 + x1 + x2,
  data = data, L = 1, nodes = 2, act_fn = 3,
  iter = 1e1, warmup = 5, chains = 1
)
Load a fitted bnns model from disk
load_bnns(file)
file |
A character string specifying the path to the saved model. |
A fitted bnns object.
Leave-One-Out Cross-Validation (LOO) for bnns models
## S3 method for class 'bnns'
loo(x, ...)
x |
A fitted bnns object. |
... |
Additional arguments passed to |
A loo object containing model comparison metrics.
Evaluates the performance of a binary classification model using a confusion matrix and accuracy.
measure_bin(obs, pred, cut = 0.5)
obs |
A numeric or integer vector of observed binary class labels (0 or 1). |
pred |
A numeric vector of predicted probabilities for the positive class. |
cut |
A numeric threshold (between 0 and 1) to classify predictions into binary labels. |
A list containing:
conf_mat: A confusion matrix comparing observed and predicted class labels.
accuracy: The proportion of correct predictions.
ROC: ROC generated using pROC::roc.
AUC: Area under the ROC curve.
obs <- c(1, 0, 1, 1, 0)
pred <- c(0.9, 0.4, 0.8, 0.7, 0.3)
cut <- 0.5
measure_bin(obs, pred, cut)
# Returns: list(conf_mat = <confusion matrix>, accuracy = 1, ROC = <ROC>, AUC = 1)
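The confusion matrix and accuracy described above can be reproduced in base R. The following is a minimal sketch, not the package's implementation; the use of `>=` at the threshold is an assumption about how ties at `cut` are handled, and the variable names are illustrative.

```r
# Base-R sketch of the confusion matrix and accuracy that measure_bin() reports.
obs  <- c(1, 0, 1, 1, 0)
pred <- c(0.9, 0.4, 0.8, 0.7, 0.3)
cut  <- 0.5

# Classify at the threshold; ">=" is an assumption about tie handling.
pred_label <- as.integer(pred >= cut)

# Fix the factor levels so both classes appear even if one is never predicted.
conf_mat <- table(
  observed  = factor(obs, levels = c(0, 1)),
  predicted = factor(pred_label, levels = c(0, 1))
)

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

Fixing the factor levels keeps the table 2 x 2 even when every prediction falls on one side of the threshold.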
Evaluates the performance of a multi-class classification model using log loss and multiclass AUC.
measure_cat(obs, pred)
obs |
A factor vector of observed class labels. Each level represents a unique class. |
pred |
A numeric matrix of predicted probabilities, where each row corresponds to an observation,
and each column corresponds to a class. The number of columns must match the number of levels in obs. |
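The log loss is calculated as:
$$\text{Log Loss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log(p_{ic})$$
where $y_{ic}$ is 1 if observation $i$ belongs to class $c$ (and 0 otherwise), $p_{ic}$ is the
predicted probability for that class, $N$ is the number of observations, and $C$ is the number of classes.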
The AUC is computed using the pROC::multiclass.roc function, which provides an overall measure
of model performance for multiclass classification.
A list containing:
log_loss: The negative log-likelihood averaged across observations.
ROC: ROC generated using pROC::roc.
AUC: The multiclass Area Under the Curve (AUC) as computed by pROC::multiclass.roc.
library(pROC)
obs <- factor(c("A", "B", "C"), levels = LETTERS[1:3])
pred <- matrix(
  c(
    0.8, 0.1, 0.1,
    0.2, 0.6, 0.2,
    0.7, 0.2, 0.1
  ),
  nrow = 3, byrow = TRUE
)
measure_cat(obs, pred)
# Returns: list(log_loss = 1.012185, ROC = <ROC>, AUC = 0.75)
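The log loss reported in the example can be checked by hand. The sketch below computes it in base R from the definition (negative log-likelihood averaged across observations); it is an illustration rather than the package's code, and the one-hot matrix `y` is a helper introduced here.

```r
# Base-R sketch of the multi-class log loss; not the package source.
obs <- factor(c("A", "B", "C"), levels = LETTERS[1:3])
pred <- matrix(
  c(0.8, 0.1, 0.1,
    0.2, 0.6, 0.2,
    0.7, 0.2, 0.1),
  nrow = 3, byrow = TRUE
)

# One-hot indicator matrix: y[i, c] is 1 when observation i belongs to class c.
y <- diag(nlevels(obs))[as.integer(obs), , drop = FALSE]

# Average over observations of the negative log-probability of the true class.
# Note: a predicted probability of exactly 0 would need clipping before log().
log_loss <- -mean(rowSums(y * log(pred)))
print(round(log_loss, 6))
# 1.012185
```

The third observation dominates the result: its true class "C" receives a probability of only 0.1.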
Evaluates the performance of a continuous response model using RMSE and MAE.
measure_cont(obs, pred)
obs |
A numeric vector of observed (true) values. |
pred |
A numeric vector of predicted values. |
A list containing:
rmse: Root Mean Squared Error.
mae: Mean Absolute Error.
obs <- c(3.2, 4.1, 5.6)
pred <- c(3.0, 4.3, 5.5)
measure_cont(obs, pred)
# Returns: list(rmse = 0.1732051, mae = 0.1666667)
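Both metrics follow directly from the residuals. As a minimal base-R sketch (not the package's implementation), the values in the example can be reproduced as follows:

```r
# Base-R sketch of RMSE and MAE for the example above.
obs  <- c(3.2, 4.1, 5.6)
pred <- c(3.0, 4.3, 5.5)

err  <- obs - pred            # residuals: 0.2, -0.2, 0.1
rmse <- sqrt(mean(err^2))     # square, average, then take the root
mae  <- mean(abs(err))        # average absolute residual

print(round(c(rmse = rmse, mae = mae), 7))
#      rmse       mae
# 0.1732051 0.1666667
```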
This helper function lists the available OpenCL platforms and devices
on your system. It is useful for determining the correct opencl_ids
to pass to bnns() when using GPU acceleration.
opencl_diagnostics()
The function first checks if the clinfo system command is available.
If not, it falls back to looking for the OpenCL R package to retrieve the
platforms and devices.
Invoked for its side effect of printing OpenCL diagnostic information.
Generates Markov Chain Monte Carlo (MCMC) trace plots, posterior density plots, Posterior Predictive Checks (PPC), or predicted probability distributions for the fitted model.
## S3 method for class 'bnns'
plot(
  x,
  type = c("trace", "density", "posterior_predictive", "pred_prob"),
  pars = NULL,
  ...
)
x |
A fitted bnns object. |
type |
Character string indicating the type of plot.
Options are "trace", "density", "posterior_predictive", and "pred_prob". |
pars |
A character vector of parameter names to include in the plot.
By default, this focuses on the output layer ( |
... |
Additional arguments passed to |
A ggplot object containing the requested diagnostic plots.
Predictions from a fitted Bayesian Neural Network
## S3 method for class 'bnns'
predict(
  object,
  newdata = NULL,
  type = c("samples", "mean", "median", "quantile", "prob", "class"),
  quantiles = c(0.025, 0.975),
  ...
)
object |
A fitted bnns object. |
newdata |
A data frame containing new data for prediction. If not provided, the predictions will be generated using the training data. |
type |
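Character string indicating the type of prediction.
Options are "samples", "mean", "median", "quantile", "prob", and "class". |
quantiles |
Numeric vector of probabilities used when type = "quantile". Default is c(0.025, 0.975). |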
... |
Additional arguments passed to internal prediction methods. |
For type = "samples": A matrix (regression/binary) or 3D array (multiclass) of posterior predictions.
For type = "mean" or "median": A vector or matrix of aggregated predictions. For classification tasks, type = "mean" returns the posterior mean class probabilities.
For type = "quantile": A matrix or array of quantiles.
For type = "prob": A matrix of class probabilities (for classification models).
For type = "class": A vector of predicted class labels (for classification models).
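To make the relationship between the prediction types concrete, the sketch below summarises a simulated matrix of posterior draws in base R. The `draws` matrix and its layout (one row per observation, one column per draw) are assumptions made for illustration; the object returned by `type = "samples"` may be oriented differently.

```r
# Sketch: how "mean", "median" and "quantile" summaries relate to posterior
# samples. `draws` is simulated here and its layout is an assumption.
set.seed(1)
n_obs   <- 5
n_draws <- 200

# Row i holds draws centred on i (matrix() fills column by column).
draws <- matrix(rnorm(n_obs * n_draws, mean = rep(1:n_obs, times = n_draws)),
                nrow = n_obs)

post_mean   <- rowMeans(draws)                    # one value per observation
post_median <- apply(draws, 1, median)            # one value per observation
post_quant  <- t(apply(draws, 1, quantile,
                       probs = c(0.025, 0.975)))  # n_obs x 2 interval bounds

print(cbind(mean = post_mean, median = post_median, post_quant))
```

Each summary collapses the draw dimension, which is why the aggregated types return one row (or element) per observation.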
"bnns" Objects
Displays a summary of a fitted Bayesian Neural Network (BNN) model, including the function call and the Stan fit details.
## S3 method for class 'bnns'
print(x, ...)
x |
An object of class "bnns". |
... |
Additional arguments (currently not used). |
The function is called for its side effects and does not return a value. It prints the following:
The function call used to generate the "bnns" object.
A summary of the Stan fit object stored in x$fit.
# Example usage:
data <- data.frame(x1 = runif(10), x2 = runif(10), y = rnorm(10))
model <- bnns(y ~ -1 + x1 + x2,
  data = data, L = 1, nodes = 2, act_fn = 2,
  iter = 1e1, warmup = 5, chains = 1
)
print(model)
relu transformation
relu(x)
x |
A numeric vector or matrix on which relu transformation is going to be applied. |
A numeric vector or matrix after relu transformation.
relu(matrix(1:4, nrow = 2))
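Because the example above uses only positive inputs, it does not show the clipping. The following base-R sketch of the standard ReLU definition, max(0, x), uses a matrix with negative entries; `relu_sketch` is a hypothetical stand-in, not the package's function.

```r
# Sketch of the standard ReLU definition, applied elementwise.
relu_sketch <- function(x) {
  # pmax() keeps the dim attribute of its first argument, so matrices stay matrices.
  pmax(x, 0)
}

m <- matrix(c(-2, -1, 1, 2), nrow = 2)
out <- relu_sketch(m)
print(out)
#      [,1] [,2]
# [1,]    0    1
# [2,]    0    2
```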
Save a fitted bnns model to disk
save_bnns(object, file)
object |
A fitted bnns object. |
file |
A character string specifying the path where the model should be saved (usually ending in .rds). |
sigmoid transformation
sigmoid(x)
x |
A numeric vector or matrix on which sigmoid transformation is going to be applied. |
A numeric vector or matrix after sigmoid transformation.
sigmoid(matrix(1:4, nrow = 2))
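For reference, the standard logistic sigmoid is 1 / (1 + exp(-x)). The base-R sketch below applies that definition elementwise; `sigmoid_sketch` is a hypothetical name used for illustration and is not taken from the package source.

```r
# Sketch of the standard logistic sigmoid, applied elementwise.
sigmoid_sketch <- function(x) {
  1 / (1 + exp(-x))
}

m <- matrix(c(-2, 0, 0, 2), nrow = 2)
s <- sigmoid_sketch(m)
print(s)

# Outputs lie strictly between 0 and 1, and sigmoid(0) is exactly 0.5.
```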
This function applies the softmax transformation along the third dimension of a 3D array. The softmax function converts raw scores into probabilities such that they sum to 1 for each slice along the third dimension.
softmax_3d(x)
x |
A 3D array. The input array on which the softmax function will be applied. |
The softmax transformation is computed as:
$$\text{softmax}(x_{ijk}) = \frac{\exp(x_{ijk})}{\sum_{l} \exp(x_{ijl})}$$
This is applied for each pair of indices (i, j) across the third dimension (k).
The function processes the input array slice-by-slice for the first two dimensions
(i, j), normalizing the values along the third dimension (k) for each slice.
A 3D array of the same dimensions as x, where the values along the
third dimension are transformed using the softmax function.
# Example: Apply softmax to a 3D array
x <- array(runif(24), dim = c(2, 3, 4)) # Random 3D array (2x3x4)
softmax_result <- softmax_3d(x)
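The normalisation along the third dimension can be sketched in base R with `apply()` and `sweep()`. This is an illustrative version under stated assumptions, not the package's code: `softmax_3d_sketch` is a hypothetical name, and subtracting the per-slice maximum is a numerical-stability step added here that leaves the result unchanged.

```r
# Sketch: softmax along the third dimension of a 3D array.
softmax_3d_sketch <- function(x) {
  stopifnot(length(dim(x)) == 3)
  # Maximum over k for every (i, j); subtracting it avoids overflow in exp().
  m <- apply(x, c(1, 2), max)
  e <- exp(sweep(x, c(1, 2), m))      # sweep() subtracts by default
  s <- apply(e, c(1, 2), sum)         # normalising constant per (i, j)
  sweep(e, c(1, 2), s, "/")
}

x <- array(seq(-1, 1, length.out = 24), dim = c(2, 3, 4))
p <- softmax_3d_sketch(x)

# Every (i, j) slice now sums to 1 across the third dimension.
print(apply(p, c(1, 2), sum))
```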
softplus transformation
softplus(x)
x |
A numeric vector or matrix on which softplus transformation is going to be applied. |
A numeric vector or matrix after softplus transformation.
softplus(matrix(1:4, nrow = 2))
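The standard softplus function is log(1 + exp(x)), a smooth approximation to ReLU. The base-R sketch below implements that definition; `softplus_sketch` is a hypothetical name, and the package's own function may treat extreme inputs differently.

```r
# Sketch of the standard softplus definition, applied elementwise.
softplus_sketch <- function(x) {
  # log1p(y) computes log(1 + y) accurately when y is small.
  # exp(x) overflows for very large x, which this sketch does not guard against.
  log1p(exp(x))
}

m <- matrix(c(-2, 0, 0, 2), nrow = 2)
sp <- softplus_sketch(m)
print(sp)

# softplus(0) equals log(2), and every output is strictly positive.
```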
Provides a comprehensive summary of a fitted Bayesian Neural Network (BNN) model, including details about the model call, data, network architecture, posterior distributions, and model fitting information.
## S3 method for class 'bnns'
summary(object, ...)
object |
An object of class "bnns". |
... |
Additional arguments (currently unused). |
The function prints the following information:
Call: The original function call used to fit the model.
Data Summary: Number of observations and features in the training data.
Network Architecture: Structure of the BNN including the number of hidden layers, nodes per layer, and activation functions.
Posterior Summary: Summarized posterior distributions of key parameters (e.g., weights, biases, and noise parameter).
Model Fit Information: Bayesian sampling details, including the number of iterations, warmup period, thinning, and chains.
Notes: Remarks and warnings, such as checks for convergence diagnostics.
A list (returned invisibly) containing the following elements:
"Number of observations": The number of observations in the training data.
"Number of features": The number of features in the training data.
"Number of hidden layers": The number of hidden layers in the neural network.
"Nodes per layer": A comma-separated string representing the number of nodes in each hidden layer.
"Activation functions": A comma-separated string representing the activation functions used in each hidden layer.
"Output activation function": The activation function used in the output layer.
"Stanfit Summary": A summary of the Stan model, including key parameter posterior distributions.
"Iterations": The total number of iterations used for sampling in the Bayesian model.
"Warmup": The number of iterations used as warmup in the Bayesian model.
"Thinning": The thinning interval used in the Bayesian model.
"Chains": The number of Markov chains used in the Bayesian model.
"Performance": Predictive performance metrics, which vary based on the output activation function.
The function also prints the summary to the console.
# Fit a Bayesian Neural Network
data <- data.frame(x1 = runif(10), x2 = runif(10), y = rnorm(10))
model <- bnns(y ~ -1 + x1 + x2,
  data = data, L = 1, nodes = 2, act_fn = 2,
  iter = 1e1, warmup = 5, chains = 1
)
# Get a summary of the model
summary(model)
Watanabe-Akaike Information Criterion (WAIC) for bnns models
## S3 method for class 'bnns'
waic(x, ...)
x |
A fitted bnns object. |
... |
Additional arguments passed to |
A waic object containing model comparison metrics.
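As background on what such an object summarises, the sketch below computes WAIC from a pointwise log-likelihood matrix using the standard definition (log pointwise predictive density minus the effective number of parameters). The toy normal-mean model and the `log_lik` matrix are assumptions made for illustration; this is not necessarily the code path the method itself uses.

```r
# Sketch: WAIC from a draws-by-observations log-likelihood matrix (toy model).
set.seed(42)
y <- rnorm(20)

# Hypothetical posterior draws for the mean of a normal model with known sd = 1.
mu_draws <- rnorm(500, mean = mean(y), sd = 1 / sqrt(length(y)))

# log_lik[s, i]: log density of observation i under draw s (500 x 20).
log_lik <- sapply(y, function(yi) dnorm(yi, mean = mu_draws, sd = 1, log = TRUE))

lppd      <- sum(log(colMeans(exp(log_lik))))   # log pointwise predictive density
p_waic    <- sum(apply(log_lik, 2, var))        # effective number of parameters
elpd_waic <- lppd - p_waic
waic      <- -2 * elpd_waic                     # deviance scale; lower is better

print(c(elpd_waic = elpd_waic, p_waic = p_waic, waic = waic))
```

With a single free parameter in the toy model, `p_waic` should come out close to 1.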