| Title: | Reinforcement Learning Tools for Two-Alternative Forced Choice Tasks |
|---|---|
| Description: | Tools for building Rescorla-Wagner Models for Two-Alternative Forced Choice tasks, commonly employed in psychological research. Most concepts and ideas within this R package are referenced from Sutton and Barto (2018) <ISBN:9780262039246>. The package allows for the intuitive definition of RL models using simple if-else statements and three basic models built into this R package are referenced from Niv et al. (2012) <doi:10.1523/JNEUROSCI.5498-10.2012>. Our approach to constructing and evaluating these computational models is informed by the guidelines proposed in Wilson & Collins (2019) <doi:10.7554/eLife.49547>. Example datasets included with the package are sourced from the work of Mason et al. (2024) <doi:10.3758/s13423-023-02415-x>. |
| Authors: | YuKi [aut, cre] (ORCID: <https://orcid.org/0009-0000-1378-1318>) |
| Maintainer: | YuKi <[email protected]> |
| License: | GPL-3 |
| Version: | 0.9.11 |
| Built: | 2026-05-12 08:50:20 UTC |
| Source: | https://github.com/yuki-961004/binaryrl |
This function is designed to fit the optimal parameters of black-box
functions (models) to real-world data. Provided that the black-box
function adheres to the specified interface
(demo:
TD,
RSTD,
Utility
)
, this function can employ the various optimization algorithms detailed
below to find the best- fitting parameters for your model.
The function provides several optimization algorithms:
1. L-BFGS-B (from stats::optim)
2. Simulated Annealing (GenSA::GenSA)
3. Genetic Algorithm (GA::ga)
4. Differential Evolution (DEoptim::DEoptim)
5. Particle Swarm Optimization (pso::psoptim)
6. Bayesian Optimization (mlrMBO::mbo)
7. Covariance Matrix Adapting Evolutionary Strategy (cmaes::cma_es)
8. Nonlinear Optimization (nloptr::nloptr)
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
fit_p( estimate = "MLE", policy = "off", data, id = NULL, n_trials = NULL, funcs = NULL, model_name = c("TD", "RSTD", "Utility"), fit_model = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), priors = NULL, lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), initial_params = NA, initial_size = 50, tolerance = 0.001, seed = 123, iteration_i = 10, iteration_g = 0, nc = 1, algorithm )fit_p( estimate = "MLE", policy = "off", data, id = NULL, n_trials = NULL, funcs = NULL, model_name = c("TD", "RSTD", "Utility"), fit_model = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), priors = NULL, lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), initial_params = NA, initial_size = 50, tolerance = 0.001, seed = 123, iteration_i = 10, iteration_g = 0, nc = 1, algorithm )
estimate |
[string] Estimation method. Can be either
default: |
policy |
[string] Specifies the learning policy to be used.
This determines how the model updates action values based on observed or
simulated choices. It can be either
default: |
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[CharacterVector] A vector specifying the subject ID(s) for which parameters should be fitted. The function will process only the subjects provided in this vector. To fit all subjects, you can either explicitly set the argument as
It is strongly recommended to avoid using simple numeric sequences like
default: |
n_trials |
[integer] Represents the total number of trials a single subject experienced
in the experiment. If this parameter is kept at its default value
of default: |
funcs |
[CharacterVector] A character vector containing the names of all user-defined functions
required for the computation. When parallel computation is enabled
(i.e., Therefore, if you have created your own reinforcement learning model
that modifies the package's default six default functions
(default functions:
|
model_name |
[List] The name of fit modals e.g. |
fit_model |
[List] A collection of functions applied to fit models to the data. e.g. |
priors |
[List] A list specifying the prior distributions for the model parameters.
This argument is mandatory when using 1. Static MAP Estimation (Non-Hierarchical) This approach is used when you have a strong, pre-defined belief about the parameter priors and do not want the model to update them iteratively.
2. Hierarchical Bayesian Estimation via EM This approach is used to let the model learn the group-level (hierarchical) prior distributions directly from the data.
default: |
lower |
[List] The lower bounds for model fit models e.g. |
upper |
[List] The upper bounds for model fit models e.g. |
initial_params |
[NumericVector] Initial values for the free parameters that the optimization algorithm will
search from. These are primarily relevant when using algorithms that require
an explicit starting point, such as default: |
initial_size |
[integer] This parameter corresponds to the population size in genetic
algorithms ( default: |
tolerance |
[double] Convergence threshold for MAP estimation. If the change in log posterior probability between iterations is smaller than this value, the algorithm is considered to have converged and the program will stop. default: |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
iteration_i |
[integer] The number of iterations the optimization algorithm will perform when searching for the best-fitting parameters during the fitting phase. A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time. default: |
iteration_g |
[integer] The maximum number of iterations for the Expectation-Maximization (EM)
based MAP estimation. The algorithm will stop once this iteration
count is reached, even if the change in the log-posterior value has
not yet fallen below the default: |
nc |
[integer] Number of cores to use for parallel processing. Since fitting optimal parameters for each subject is an independent task, parallel computation can significantly speed up the fitting process:
|
algorithm |
[string] Choose an algorithm package from
In addition, any algorithm from the e.g. |
The optimal parameters found by the algorithm for each subject,
along with the model fit calculated using these parameters.
This is returned as an object of class binaryRL containing results
for all subjects with all models.
While both fit_p and rcv_d utilize the same underlying
optimize_para function to find optimal parameters, they play
distinct and sequential roles in the modeling pipeline.
The key differences are as follows:
Purpose and Data Source: rcv_d should always be
performed before fit_p. Its primary role is to validate a
model's stability by fitting it to synthetic data generated by the
model itself. This process, known as parameter recovery, ensures the
model is well-behaved. In contrast, fit_p is used in the
subsequent stage to fit the validated model to real experimental
data.
Estimation Method: rcv_d does not include an
estimate argument. This is because the synthetic data is
generated from known "true" parameters, which are drawn from
pre-defined distributions (typically uniform for most parameters and
exponential for the inverse temperature). Since the ground truth is
known, a hierarchical estimation (MAP) is not applicable. The
fit_p function, however, requires this argument to handle
real data where the true parameters are unknown.
Policy Setting: In fit_p, the policy
setting has different effects: "on-policy" is better for learning
choice patterns, while "off-policy" yields more accurate parameter
estimates. For rcv_d, the process defaults to an "off-policy"
approach because its main objectives are to verify if the true
parameters can be accurately recovered and to assess whether competing
models are distinguishable, tasks for which off-policy estimation is
more suitable.
## Not run: comparison <- binaryRL::fit_p( data = binaryRL::Mason_2024_G2, #+-----------------------------------------------------------------------------+# #|----------------------------- black-box function ----------------------------|# funcs = c("your_funcs"), estimate = c("MLE", "MAP"), policy = c("off", "on"), model_name = c("TD", "RSTD", "Utility"), #|-------------------------------- fit models ---------------------------------|# fit_model = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), priors = list( list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ), list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ), list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, gamma = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ) ), #|---------------------------------- bound ------------------------------------|# lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 10), c(1, 1, 10), c(1, 1, 10)), #|----------------------------- iteration number ------------------------------|# iteration_i = 10, iteration_g = 10, #|-------------------------------- algorithms ---------------------------------|# nc = 1, # <nc > 1>: parallel computation across subjects # Base R Optimization #algorithm = "L-BFGS-B" # Gradient-Based (stats) #|-----------------------------------------------------------------------------|# # Specialized External Optimization #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) #|-----------------------------------------------------------------------------|# # Optimization Library (nloptr) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") #|-------------------------------- algorithms ---------------------------------|# ################################################################################# ) result <- dplyr::bind_rows(comparison) # Ensure the output directory exists before writing if (!dir.exists("../OUTPUT")) { dir.create("../OUTPUT", recursive = TRUE) } write.csv(result, "../OUTPUT/result_comparison.csv", row.names = FALSE) ## End(Not run)## Not run: comparison <- binaryRL::fit_p( data = binaryRL::Mason_2024_G2, #+-----------------------------------------------------------------------------+# #|----------------------------- black-box function ----------------------------|# funcs = c("your_funcs"), estimate = c("MLE", "MAP"), policy = c("off", "on"), model_name = c("TD", "RSTD", "Utility"), #|-------------------------------- fit models ---------------------------------|# fit_model = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), priors = list( list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ), list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ), list( eta = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, gamma = function(x) {stats::dunif(x, min = 0, max = 1, log = TRUE)}, tau = function(x) {stats::dexp(x, rate = 1, log = TRUE)} ) ), #|---------------------------------- bound ------------------------------------|# lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 10), c(1, 1, 10), c(1, 1, 10)), #|----------------------------- iteration number ------------------------------|# iteration_i = 10, iteration_g = 10, #|-------------------------------- algorithms ---------------------------------|# nc = 1, # <nc > 1>: parallel computation across subjects # Base R Optimization #algorithm = "L-BFGS-B" # Gradient-Based (stats) #|-----------------------------------------------------------------------------|# # Specialized External Optimization #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) #|-----------------------------------------------------------------------------|# # Optimization Library (nloptr) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") #|-------------------------------- algorithms ---------------------------------|# ################################################################################# ) result <- dplyr::bind_rows(comparison) # Ensure the output directory exists before writing if (!dir.exists("../OUTPUT")) { dir.create("../OUTPUT", recursive = TRUE) } write.csv(result, "../OUTPUT/result_comparison.csv", row.names = FALSE) ## End(Not run)
The exploration strategy parameters are
threshold, epsilon, and lambda.
Epsilon-first: Used when only threshold is set.
Subjects choose randomly for trials less than threshold and by value
for trials greater than 'threshold.
Epsilon-greedy: Used if threshold is default
(1) and epsilon is set. Subjects explore with probability
epsilon throughout the experiment.
Epsilon-decreasing: Used if threshold is
default (1), and lambda is set. In this strategy, the probability of
random choice (exploration) decreases as trials increase. The
parameter lambda controls the rate at which this probability
declines with each trial.
func_epsilon( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, threshold = 1, epsilon = NA, lambda = NA, alpha, beta )func_epsilon( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, threshold = 1, epsilon = NA, lambda = NA, alpha, beta )
i |
The current row number. |
L_freq |
The frequency of left option appearance |
R_freq |
The frequency of right option appearance |
L_pick |
The number of times left option was picked |
R_pick |
The number of times left option was picked |
L_value |
The value of the left option |
R_value |
The value of the right option |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model.
|
threshold |
[integer]
Controls the initial exploration phase in the epsilon-first strategy.
This is the number of early trials where the subject makes purely random
choices, as they haven't yet learned the options' values. For example,
|
epsilon |
[numeric]
A parameter used in the epsilon-greedy exploration strategy. It
defines the probability of making a completely random choice, as opposed
to choosing based on the relative values of the left and right options.
For example, if
|
lambda |
[vector]
A numeric value that controls the decay rate of exploration probability
in the epsilon-decreasing strategy. A higher
|
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
A numeric value, either 0 or 1. 0 indicates no exploration (choice based on value), and 1 indicates exploration (random choice) for that trial.
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_epsilon <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Free Parameters threshold = 1, epsilon = NA, lambda = NA, # Extra parameters alpha, beta ){ set.seed(i) # Epsilon-First: random choosing before a certain trial number if (i <= threshold) { try <- 1 } else if (i > threshold & is.na(epsilon) & is.na(lambda)) { try <- 0 # Epsilon-Greedy: random choosing throughout the experiment with probability epsilon } else if (i > threshold & !(is.na(epsilon)) & is.na(lambda)){ try <- sample( c(1, 0), prob = c(epsilon, 1 - epsilon), size = 1 ) # Epsilon-Decreasing: probability of random choosing decreases as trials increase } else if (i > threshold & is.na(epsilon) & !(is.na(lambda))) { try <- sample( c(1, 0), prob = c( 1 / (1 + lambda * i), lambda * i / (1 + lambda * i) ), size = 1 ) } else { try <- "ERROR" } return(try) } ## End(Not run)## Not run: func_epsilon <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Free Parameters threshold = 1, epsilon = NA, lambda = NA, # Extra parameters alpha, beta ){ set.seed(i) # Epsilon-First: random choosing before a certain trial number if (i <= threshold) { try <- 1 } else if (i > threshold & is.na(epsilon) & is.na(lambda)) { try <- 0 # Epsilon-Greedy: random choosing throughout the experiment with probability epsilon } else if (i > threshold & !(is.na(epsilon)) & is.na(lambda)){ try <- sample( c(1, 0), prob = c(epsilon, 1 - epsilon), size = 1 ) # Epsilon-Decreasing: probability of random choosing decreases as trials increase } else if (i > threshold & is.na(epsilon) & !(is.na(lambda))) { try <- sample( c(1, 0), prob = c( 1 / (1 + lambda * i), lambda * i / (1 + lambda * i) ), size = 1 ) } else { try <- "ERROR" } return(try) } ## End(Not run)
The structure of eta depends on the model type:
Temporal Difference (TD) model:
eta is a single numeric value representing the learning rate.
Risk-Sensitive Temporal Difference (RSTD) model:
eta is a numeric vector of length two, where
eta[1] represents the learning rate for "good" outcomes, which
means the reward is higher than the expected value.
eta[2] represents the learning rate for "bad" outcomes, which
means the reward is lower than the expected value.
func_eta( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, value, utility, reward, occurrence, eta, alpha, beta )func_eta( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, value, utility, reward, occurrence, eta, alpha, beta )
i |
The current row number. |
L_freq |
The frequency of left option appearance |
R_freq |
The frequency of right option appearance |
L_pick |
The number of times left option was picked |
R_pick |
The number of times left option was picked |
L_value |
The value of the left option |
R_value |
The value of the right option |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model.
|
value |
The expected value of the stimulus in the subject's mind at this point in time. |
utility |
The subjective value that the subject assigns to the objective reward. |
reward |
The objective reward received by the subject after selecting a stimulus. |
occurrence |
The number of times the same stimulus has been chosen. |
eta |
[vector]
Parameters used in the Learning Rate Function,
The structure of
|
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
learning rate eta
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_eta <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Expected value for this stimulus value, # Subjective utility utility, # Reward observed after choice reward, # Occurrence count for this stimulus occurrence, # Free Parameter eta, # Extra parameters alpha, beta ){ ################################# [ TD ] #################################### if (length(eta) == 1) { eta <- as.numeric(eta) } ################################ [ RSTD ] ################################### else if (length(eta) == 2 & utility < value) { eta <- eta[1] } else if (length(eta) == 2 & utility >= value) { eta <- eta[2] } ################################ [ ERROR ] ################################## else { eta <- "ERROR" # Error check } return(eta) } ## End(Not run)## Not run: func_eta <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Expected value for this stimulus value, # Subjective utility utility, # Reward observed after choice reward, # Occurrence count for this stimulus occurrence, # Free Parameter eta, # Extra parameters alpha, beta ){ ################################# [ TD ] #################################### if (length(eta) == 1) { eta <- as.numeric(eta) } ################################ [ RSTD ] ################################### else if (length(eta) == 2 & utility < value) { eta <- eta[1] } else if (length(eta) == 2 & utility >= value) { eta <- eta[2] } ################################ [ ERROR ] ################################## else { eta <- "ERROR" # Error check } return(eta) } ## End(Not run)
This function represents an exponent used in calculating utility from reward. Its application varies depending on the specific model:
Stevens' Power Law:
Here, utility is calculated by raising the reward to the power
of gamma. This describes how the subjective value (utility) of a
reward changes non-linearly with its objective magnitude.
Kahneman's Prospect Theory: This theory applies exponents differently for gains and losses, and introduces a loss aversion coefficient:
For positive rewards (gains), utility is the reward
raised to the power of gamma[1].
For negative rewards (losses), utility is calculated
by first multiplying the reward by beta, and then raising
this product to the power of gamma[2]. Here, beta acts
as a loss aversion parameter, accounting for the greater psychological
impact of losses compared to equivalent gains.
func_gamma( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, value, utility, reward, occurrence, gamma = 1, alpha, beta )func_gamma( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, value, utility, reward, occurrence, gamma = 1, alpha, beta )
i |
The current row number. |
L_freq |
The frequency of left option appearance |
R_freq |
The frequency of right option appearance |
L_pick |
The number of times left option was picked |
R_pick |
The number of times left option was picked |
L_value |
The value of the left option |
R_value |
The value of the right option |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model.
|
value |
The expected value of the stimulus in the subject's mind at this point in time. |
utility |
The subjective value that the subject assigns to the objective reward. |
reward |
The objective reward received by the subject after selecting a stimulus. |
occurrence |
The number of times the same stimulus has been chosen. |
gamma |
[vector]
This parameter represents the exponent in utility functions,
|
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
Discount rate and utility
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_gamma <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Expected value for this stimulus value, # Subjective utility utility, # Reward observed after choice reward, # Occurrence count for this stimulus occurrence, # Free Parameter gamma = 1, # Extra parameters alpha, beta ){ ############################## [ Utility ] ################################## if (length(gamma) == 1) { gamma <- as.numeric(gamma) utility <- sign(reward) * (abs(reward) ^ gamma) } ############################### [ Error ] ################################### else { utility <- "ERROR" } return(list(gamma, utility)) } ## End(Not run)## Not run: func_gamma <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Expected value for this stimulus value, # Subjective utility utility, # Reward observed after choice reward, # Occurrence count for this stimulus occurrence, # Free Parameter gamma = 1, # Extra parameters alpha, beta ){ ############################## [ Utility ] ################################## if (length(gamma) == 1) { gamma <- as.numeric(gamma) utility <- sign(reward) * (abs(reward) ^ gamma) } ############################### [ Error ] ################################### else { utility <- "ERROR" } return(list(gamma, utility)) } ## End(Not run)
This loss function reflects the similarity between human choices
and RL model predictions. If a human selects the left option and the
RL model predicts a high probability for the left option, then
approaches 0, causing the first term to approach 0.
Since the human chose the left option, becomes 0, making the
second term naturally zero. Therefore, the more consistent the RL model's
prediction is with human choice, the closer this LL value is to 0.
Conversely, it approaches negative infinity.
func_logl( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, L_dir, R_dir, L_prob, R_prob, var1 = NA, var2 = NA, LR, try, value, utility, reward, occurrence, alpha, beta )func_logl( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, L_dir, R_dir, L_prob, R_prob, var1 = NA, var2 = NA, LR, try, value, utility, reward, occurrence, alpha, beta )
i |
The current row number. |
L_freq |
The frequency of left option appearance |
R_freq |
The frequency of right option appearance |
L_pick |
The number of times left option was picked |
R_pick |
The number of times left option was picked |
L_value |
The value of the left option |
R_value |
The value of the right option |
L_dir |
Whether the participant chose the left option. |
R_dir |
Whether the participant chose the right option. |
L_prob |
The probability that the model assigns to choosing the left option. |
R_prob |
The probability that the model assigns to choosing the left option. |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model.
|
LR |
Are you calculating the probability for the left option or the right option? |
try |
If the choice was random, the value is 1; If the choice was based on value, the value is 0. |
value |
The expected value of the stimulus in the subject's mind at this point in time. |
utility |
The subjective value that the subject assigns to the objective reward. |
reward |
The objective reward received by the subject after selecting a stimulus. |
occurrence |
The number of times the same stimulus has been chosen. |
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
log-likelihood
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_logl <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # L_dir, R_dir, # L_prob, R_prob, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Is it a random choosing trial? try, # Extra parameters alpha, beta ){ logl <- switch( EXPR = LR, "L" = L_dir * log(L_prob), "R" = R_dir * log(R_prob) ) } ## End(Not run)## Not run: func_logl <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # L_dir, R_dir, # L_prob, R_prob, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Is it a random choosing trial? try, # Extra parameters alpha, beta ){ logl <- switch( EXPR = LR, "L" = L_dir * log(L_prob), "R" = R_dir * log(R_prob) ) } ## End(Not run)
Unlike epsilon-greedy, which explores indiscriminately, UCB is a more
intelligent exploration strategy. It biases the value of each action based
on how often it has been selected. For options chosen fewer times, or those
with high uncertainty, a larger "uncertainty bonus" is added to their
estimated value. This increases their selection probability, effectively
encouraging the exploration of potentially optimal, yet unexplored actions.
A higher pi indicates a greater bias toward giving less-chosen
options.
func_pi( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1, var2, LR, pi, alpha, beta )func_pi( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1, var2, LR, pi, alpha, beta )
i |
The current row number. |
L_freq |
The frequency of left option appearance |
R_freq |
The frequency of right option appearance |
L_pick |
The number of times left option was picked |
R_pick |
The number of times left option was picked |
L_value |
The value of the left option |
R_value |
The value of the right option |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model.
|
LR |
Are you calculating the probability for the left option or the right option? |
pi |
[vector]
Parameter used in the Upper-Confidence-Bound (UCB) action selection
formula.
|
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
The probability of choosing this option
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_tau <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Free parameter pi = 0.1, # Extra parameters alpha, beta ){ ############################ [ at least 1 ] ################################# if (is.na(x = pi)) { if (L_pick == 0 & R_pick == 0) { bias <- 0 } else if (LR == "L" & L_pick == 0 & R_pick > 0) { bias <- 1e+4 } else if (LR == "R" & R_pick == 0 & L_pick > 0) { bias <- 1e+4 } else { bias <- 0 } } ############################ [ bias value ] ################################# else if (!(is.na(x = pi)) & LR == "L") { bias <- pi * sqrt(log(L_pick + exp(1)) / (L_pick + 1e-10)) } else if (!(is.na(x = pi)) & LR == "R") { bias <- pi * sqrt(log(R_pick + exp(1)) / (R_pick + 1e-10)) } ############################## [ error ] #################################### else { bias <- "ERROR" } return(bias) } ## End(Not run)## Not run: func_tau <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Free parameter pi = 0.1, # Extra parameters alpha, beta ){ ############################ [ at least 1 ] ################################# if (is.na(x = pi)) { if (L_pick == 0 & R_pick == 0) { bias <- 0 } else if (LR == "L" & L_pick == 0 & R_pick > 0) { bias <- 1e+4 } else if (LR == "R" & R_pick == 0 & L_pick > 0) { bias <- 1e+4 } else { bias <- 0 } } ############################ [ bias value ] ################################# else if (!(is.na(x = pi)) & LR == "L") { bias <- pi * sqrt(log(L_pick + exp(1)) / (L_pick + 1e-10)) } else if (!(is.na(x = pi)) & LR == "R") { bias <- pi * sqrt(log(R_pick + exp(1)) / (R_pick + 1e-10)) } ############################## [ error ] #################################### else { bias <- "ERROR" } return(bias) } ## End(Not run)
The softmax function describes a probabilistic choice rule. It implies
that options with higher subjective values are chosen with a greater
probability, rather than deterministic. This probability of choosing
the higher-valued option increases with the parameter tau.
A higher tau indicates greater sensitivity to value differences, making
choices more deterministic.
func_tau( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, LR, try, tau, lapse, alpha, beta )func_tau( i, L_freq, R_freq, L_pick, R_pick, L_value, R_value, var1 = NA, var2 = NA, LR, try, tau, lapse, alpha, beta )
i |
[numeric] The current row number. |
L_freq |
[numeric] The frequency of left option appearance |
R_freq |
[numeric] The frequency of right option appearance |
L_pick |
[numeric] The number of times left option was picked |
R_pick |
[numeric] The number of times left option was picked |
L_value |
[numeric] The value of the left option with bias (if pi != 0) |
R_value |
[numeric] The value of the right option with bias (if pi != 0) |
var1 |
[character] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model.
|
var2 |
[character] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model. default: |
LR |
[character] Are you calculating the probability for the left option or the right option?
|
try |
[numeric] If the choice was random, the value is 1; If the choice was based on value, the value is 0. |
tau |
[vector] Parameters used in the Soft-Max Function.
|
lapse |
[numeric] A numeric value between 0 and 1, representing the lapse rate. You can interpret this parameter as the probability of the agent "slipping" or making a random choice, irrespective of the learned action values. This accounts for moments of inattention or motor errors. In this sense, it represents the minimum probability with which any given option will be selected. It is a free parameter that acknowledges that individuals do not always make decisions with full concentration throughout an experiment. From a modeling perspective, the lapse rate is crucial for preventing the
log-likelihood calculation from returning
This ensures each option has a minimum selection probability of 1 percent in TAFC tasks. |
alpha |
[vector] Extra parameters that may be used in functions. |
beta |
[vector] Extra parameters that may be used in functions. |
The probability of choosing this option
When customizing these functions, please ensure that you do not modify
the arguments. Instead, only modify the if-else statements or
the internal logic to adapt the function to your needs.
## Not run: func_tau <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Is it a random choosing trial? try, # Free parameters tau, # Extra parameters alpha, beta ){ ############################### [ random ] ################################## if (try == 1) { prob <- 0.5 } ############################# [ greedy-max ] ################################ else if (try == 0 & LR == "L" & is.na(tau)) { if (L_value == R_value) { prob <- 0.5 } else if (L_value > R_value) { prob <- 1 } else if (L_value < R_value) { prob <- 0 } } else if (try == 0 & LR == "R" & is.na(tau)) { if (L_value == R_value) { prob <- 0.5 } else if (R_value > L_value) { prob <- 1 } else if (R_value < L_value) { prob <- 0 } } ############################### [ soft-max ] ################################ else if (try == 0 & LR == "L" & !(is.na(tau))) { prob <- 1 / (1 + exp(-(L_value - R_value) * tau)) } else if (try == 0 & LR == "R" & !(is.na(tau))) { prob <- 1 / (1 + exp(-(R_value - L_value) * tau)) } ################################ [ error ] ################################## else { prob <- "ERROR" } ################################ [ lapse ] ################################## prob <- (1 - lapse) * prob + (lapse / 2) return(prob) } ## End(Not run)## Not run: func_tau <- function( # Trial number i, # Number of times this option has appeared L_freq, R_freq, # Number of times this option has been chosen L_pick, R_pick, # Current value of this option L_value, R_value, # Extra variables var1 = NA, var2 = NA, # Whether calculating probability for left or right choice LR, # Is it a random choosing trial? try, # Free parameters tau, # Extra parameters alpha, beta ){ ############################### [ random ] ################################## if (try == 1) { prob <- 0.5 } ############################# [ greedy-max ] ################################ else if (try == 0 & LR == "L" & is.na(tau)) { if (L_value == R_value) { prob <- 0.5 } else if (L_value > R_value) { prob <- 1 } else if (L_value < R_value) { prob <- 0 } } else if (try == 0 & LR == "R" & is.na(tau)) { if (L_value == R_value) { prob <- 0.5 } else if (R_value > L_value) { prob <- 1 } else if (R_value < L_value) { prob <- 0 } } ############################### [ soft-max ] ################################ else if (try == 0 & LR == "L" & !(is.na(tau))) { prob <- 1 / (1 + exp(-(L_value - R_value) * tau)) } else if (try == 0 & LR == "R" & !(is.na(tau))) { prob <- 1 / (1 + exp(-(R_value - L_value) * tau)) } ################################ [ error ] ################################## else { prob <- "ERROR" } ################################ [ lapse ] ################################## prob <- (1 - lapse) * prob + (lapse / 2) return(prob) } ## End(Not run)
This dataset originates from Experiment 1 of Mason et al. (2024), titled "Rare and extreme outcomes in risky choice" (doi:10.3758/s13423-023-02415-x). The raw data is publicly available on the Open Science Framework (OSF) at https://osf.io/hy3q4/. For the purposes of this package, we've performed basic cleaning and preprocessing of the original dataset.
A data frame with 45000 rows and 11 columns:
Subject ID, an integer (total of 143).
Block number, an integer (1 to 6).
Trial number, an integer (1 to 60).
Left choice, a character indicating the option presented. The possible options are:
A: 100% gain 4.
B: 90% gain 0 and 10% gain 40.
C: 100% lose 4.
D: 90% lose 0 and 10% lose 40.
Right choice, a character indicating the option presented. The possible options are:
A: 100% gain 4.
B: 90% gain 0 and 10% gain 40.
C: 100% lose 4.
D: 90% lose 0 and 10% lose 40.
Reward associated with the left choice.
Reward associated with the right choice.
The chosen option, either L_choice or R_choice.
Type of frame, a character string (e.g., "Gain", "Loss", "Catch").
The participant's net worth at the end of each trial.
The participant's reaction time (in milliseconds) for each trial.
# Load the Mason_2024_G1 dataset data(binaryRL::Mason_2024_G1) head(binaryRL::Mason_2024_G1)# Load the Mason_2024_G1 dataset data(binaryRL::Mason_2024_G1) head(binaryRL::Mason_2024_G1)
This dataset originates from Experiment 2 of Mason et al. (2024), titled "Rare and extreme outcomes in risky choice" (doi:10.3758/s13423-023-02415-x). The raw data is publicly available on the Open Science Framework (OSF) at https://osf.io/hy3q4/. For the purposes of this package, we've performed basic cleaning and preprocessing of the original dataset.
A data frame with 45000 rows and 11 columns:
Subject ID, an integer (total of 143).
Block number, an integer (1 to 6).
Trial number, an integer (1 to 60).
Left choice, a character indicating the option presented. The possible options are:
A: 100% gain 36.
B: 90% gain 40 and 10% gain 0.
C: 100% lose 36.
D: 90% lose 40 and 10% lose 0.
Right choice, a character indicating the option presented. The possible options are:
A: 100% gain 36.
B: 90% gain 40 and 10% gain 0.
C: 100% lose 36.
D: 90% lose 40 and 10% lose 0.
Reward associated with the left choice.
Reward associated with the right choice.
The chosen option, either L_choice or R_choice.
Type of frame, a character string (e.g., "Gain", "Loss", "Catch").
The participant's net worth at the end of each trial.
The participant's reaction time (in milliseconds) for each trial.
# Load the Mason_2024_G2 dataset data(binaryRL::Mason_2024_G2) head(binaryRL::Mason_2024_G2)# Load the Mason_2024_G2 dataset data(binaryRL::Mason_2024_G2) head(binaryRL::Mason_2024_G2)
This is an internal helper function for fit_p. Its primary purpose
is to provide a unified interface for users to interact with various
optimization algorithm packages. It adapts the inputs and outputs
to be compatible with eight distinct algorithms, ensuring a seamless
experience regardless of the underlying solver used.
The function provides several optimization algorithms:
1. L-BFGS-B (from stats::optim)
2. Simulated Annealing (GenSA::GenSA)
3. Genetic Algorithm (GA::ga)
4. Differential Evolution (DEoptim::DEoptim)
5. Particle Swarm Optimization (pso::psoptim)
6. Bayesian Optimization (mlrMBO::mbo)
7. Covariance Matrix Adapting Evolutionary Strategy (cmaes::cma_es)
8. Nonlinear Optimization (nloptr::nloptr)
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
optimize_para( estimate = "MLE", policy = "off", data, id, n_trials, n_params, obj_func, lower, upper, priors = NULL, initial_params = NA, initial_size = 50, iteration = 10, seed = 123, algorithm )optimize_para( estimate = "MLE", policy = "off", data, id, n_trials, n_params, obj_func, lower, upper, priors = NULL, initial_params = NA, initial_size = 50, iteration = 10, seed = 123, algorithm )
estimate |
[string] Estimation method. Can be either
default: |
policy |
[character] Specifies the learning policy to be used.
This determines how the model updates action values based on observed or
simulated choices. It can be either
default: |
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[character] Specifies the ID of the subject whose optimal parameters will be fitted. This parameter accepts either string or numeric values. The provided ID must correspond to an existing subject identifier within the raw dataset provided to the function. |
n_trials |
[integer] The total number of trials in your experiment. |
n_params |
[integer] The number of free parameters in your model. |
obj_func |
[function] The objective function that the optimization algorithm package accepts.
This function must strictly take only one argument, |
lower |
[vector] Lower bounds of free parameters |
upper |
[vector] Upper bounds of free parameters |
priors |
[List] A list specifying the prior distributions for the model parameters.
This argument is mandatory when using 1. Static MAP Estimation (Non-Hierarchical) This approach is used when you have a strong, pre-defined belief about the parameter priors and do not want the model to update them iteratively.
2. Hierarchical Bayesian Estimation via EM This approach is used to let the model learn the group-level (hierarchical) prior distributions directly from the data.
default: |
initial_params |
[vector] Initial values for the free parameters that the optimization algorithm will
search from. These are primarily relevant when using algorithms that require
an explicit starting point, such as default: |
initial_size |
[integer] This parameter corresponds to the population size in genetic
algorithms ( default: |
iteration |
[integer] The number of iterations the optimization algorithm will perform when searching for the best-fitting parameters during the fitting phase. A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time. |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
algorithm |
[character] Choose an algorithm package from
In addition, any algorithm from the |
the result of binaryRL with optimal parameters
## Not run: binaryRL.res <- binaryRL::optimize_para( data = binaryRL::Mason_2024_G2, id = 1, obj_func = binaryRL::RSTD, n_params = 3, n_trials = 360, lower = c(0, 0, 0), upper = c(1, 1, 1), iteration = 10, seed = 123, #algorithm = "L-BFGS-B" # Gradient-Based (stats) #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") ) summary(binaryRL.res) ## End(Not run)## Not run: binaryRL.res <- binaryRL::optimize_para( data = binaryRL::Mason_2024_G2, id = 1, obj_func = binaryRL::RSTD, n_params = 3, n_trials = 360, lower = c(0, 0, 0), upper = c(1, 1, 1), iteration = 10, seed = 123, #algorithm = "L-BFGS-B" # Gradient-Based (stats) #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") ) summary(binaryRL.res) ## End(Not run)
This function is designed for model and parameter recovery of user-created
(black-box) models, provided they conform to the specified interface.
(demo:
TD,
RSTD,
Utility
).
The process involves generating synthetic datasets. First, parameters
are randomly sampled within a defined range. These parameters are then
used to simulate artificial datasets.
Subsequently, all candidate models are used to fit these simulated datasets. Model recoverability is assessed if a synthetic dataset generated by Model A is consistently best fitted by Model A itself.
Furthermore, the function allows users to evaluate parameter recoverability. If, for instance, a synthetic dataset generated by Model A was based on parameters like 0.3 and 0.7, and Model A then recovers optimal parameters close to 0.3 and 0.7 from this data, it indicates that the parameters of Model A are recoverable.
The function provides several optimization algorithms:
1. L-BFGS-B (from stats::optim)
2. Simulated Annealing (GenSA::GenSA)
3. Genetic Algorithm (GA::ga)
4. Differential Evolution (DEoptim::DEoptim)
5. Particle Swarm Optimization (pso::psoptim)
6. Bayesian Optimization (mlrMBO::mbo)
7. Covariance Matrix Adapting Evolutionary Strategy (cmaes::cma_es)
8. Nonlinear Optimization (nloptr::nloptr)
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
rcv_d( estimate = "MLE", policy = "off", data, id = NULL, n_trials = NULL, funcs = NULL, model_names = c("TD", "RSTD", "Utility"), simulate_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), rfun = NULL, fit_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), dfun = NULL, lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), initial_params = NA, initial_size = 50, tolerance = 0.001, seed = 123, iteration_s = 100, iteration_f = 100, nc = 1, algorithm )rcv_d( estimate = "MLE", policy = "off", data, id = NULL, n_trials = NULL, funcs = NULL, model_names = c("TD", "RSTD", "Utility"), simulate_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), rfun = NULL, fit_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), dfun = NULL, lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), initial_params = NA, initial_size = 50, tolerance = 0.001, seed = 123, iteration_s = 100, iteration_f = 100, nc = 1, algorithm )
estimate |
[string] Estimation method. Can be either
default: |
policy |
[string] Specifies the learning policy to be used.
This determines how the model updates action values based on observed or
simulated choices. It can be either
default: |
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[CharacterVector] Specifies which subject's data to use. In parameter and model recovery analyses, the specific subject ID is often irrelevant. Although the experimental trial order might have some randomness for each subject, the sequence of reward feedback is typically pseudo-random. The default value for this argument is default: |
n_trials |
[integer] Represents the total number of trials a single subject experienced
in the experiment. If this parameter is kept at its default value
of default: |
funcs |
[CharacterVector] A character vector containing the names of all user-defined functions
required for the computation. When parallel computation is enabled
(i.e., Therefore, if you have created your own reinforcement learning model
that modifies the package's default six default functions
(default functions:
|
model_names |
[List] The names of fit modals e.g. |
simulate_models |
[List] A list of functions used to simulate data from different models. Each function in the list should represent one model. e.g. |
rfun |
[List] A nested list of functions used to generate random parameter values for simulation. The top-level elements of the list should be named according to the models. Each of these elements must be a named list of functions, where each name corresponds to a model parameter and its value is the random number generation function. e.g., |
fit_models |
[List] A list of functions used to fit different models to the data. Each function in the list should represent one model. e.g. |
dfun |
[List] A nested list that defines the probability density/mass functions (PDF/PMF) for each model's parameters. The top-level names of the list must match the model names. Each element must be another named list, where each name corresponds to a model parameter and its value is the probability density function. e.g., |
lower |
[List] The lower bounds of models' free parameters. e.g. |
upper |
[List] The upper bounds of models' free parameters. e.g. |
initial_params |
[NumericVector] Initial values for the free parameters that the optimization algorithm will
search from. These are primarily relevant when using algorithms that require
an explicit starting point, such as default: |
initial_size |
[integer] This parameter corresponds to the population size in genetic
algorithms ( default: |
tolerance |
[double] Convergence threshold for MAP estimation. If the change in log posterior probability between iterations is smaller than this value, the algorithm is considered to have converged and the program will stop. default: |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
iteration_s |
[integer] This parameter determines how many simulated datasets are created for subsequent model and parameter recovery analyses. default: |
iteration_f |
[NumericVector] The number of iterations for the optimization algorithm. The required format depends on the estimation method used.
A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time. default: |
nc |
[integer] Number of cores to use for parallel processing. Since fitting optimal parameters for each subject is an independent task, parallel computation can significantly speed up the fitting process:
|
algorithm |
[string] Choose an algorithm package from
In addition, any algorithm from the e.g. |
A list where each element is a data.frame. Each data.frame within this list records the results of fitting synthetic data (generated by Model A) with Model B.
While both fit_p and rcv_d utilize the same underlying
optimize_para function to find optimal parameters, they play
distinct and sequential roles in the modeling pipeline.
The key differences are as follows:
Purpose and Data Source: rcv_d should always be
performed before fit_p. Its primary role is to validate a
model's stability by fitting it to synthetic data generated by the
model itself. This process, known as parameter recovery, ensures the
model is well-behaved. In contrast, fit_p is used in the
subsequent stage to fit the validated model to real experimental
data.
Estimation Method: rcv_d does not include an
estimate argument. This is because the synthetic data is
generated from known "true" parameters, which are drawn from
pre-defined distributions (typically uniform for most parameters and
exponential for the inverse temperature). Since the ground truth is
known, a hierarchical estimation (MAP) is not applicable. The
fit_p function, however, requires this argument to handle
real data where the true parameters are unknown.
Policy Setting: In fit_p, the policy
setting has different effects: "on-policy" is better for learning
choice patterns, while "off-policy" yields more accurate parameter
estimates. For rcv_d, the process defaults to an "off-policy"
approach because its main objectives are to verify if the true
parameters can be accurately recovered and to assess whether competing
models are distinguishable, tasks for which off-policy estimation is
more suitable.
## Not run: recovery <- binaryRL::rcv_d( data = binaryRL::Mason_2024_G2, #+-----------------------------------------------------------------------------+# #|----------------------------- black-box function ----------------------------|# funcs = c("your_funcs"), estimate = c("MLE", "MAP"), policy = c("off", "on"), model_names = c("TD", "RSTD", "Utility"), #|------------------------------- simulate models -----------------------------|# simulate_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), rfun = list( list( eta = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), list( etan = function() { stats::runif(n = 1, min = 0, max = 1) }, etap = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), list( eta = function() { stats::runif(n = 1, min = 0, max = 1) }, gamma = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), ), #|---------------------------------- fit models -------------------------------|# fit_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), dfun = list( list( eta = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), list( etan = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, etap = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), list( eta = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, gamma = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), ), #|---------------------------------- bound ------------------------------------|# lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), #|----------------------------- interation number -----------------------------|# iteration_s = 100, iteration_f = c(100, 10), #|-------------------------------- algorithms ---------------------------------|# nc = 1, # <nc > 1>: parallel computation across subjects # Base R Optimization #algorithm = "L-BFGS-B" # Gradient-Based (stats) #|-----------------------------------------------------------------------------|# # Specialized External Optimization #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) #|-----------------------------------------------------------------------------|# # Optimization Library (nloptr) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") #|-------------------------------- algorithms ---------------------------------|# #+#############################################################################+# ) result <- dplyr::bind_rows(recovery) %>% dplyr::select(simulate_model, fit_model, iteration, everything()) # Ensure the output directory exists if (!dir.exists("../OUTPUT")) { dir.create("../OUTPUT", recursive = TRUE) } write.csv(result, file = "../OUTPUT/result_recovery.csv", row.names = FALSE) ## End(Not run)## Not run: recovery <- binaryRL::rcv_d( data = binaryRL::Mason_2024_G2, #+-----------------------------------------------------------------------------+# #|----------------------------- black-box function ----------------------------|# funcs = c("your_funcs"), estimate = c("MLE", "MAP"), policy = c("off", "on"), model_names = c("TD", "RSTD", "Utility"), #|------------------------------- simulate models -----------------------------|# simulate_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), rfun = list( list( eta = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), list( etan = function() { stats::runif(n = 1, min = 0, max = 1) }, etap = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), list( eta = function() { stats::runif(n = 1, min = 0, max = 1) }, gamma = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), ), #|---------------------------------- fit models -------------------------------|# fit_models = list(binaryRL::TD, binaryRL::RSTD, binaryRL::Utility), dfun = list( list( eta = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), list( etan = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, etap = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), list( eta = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, gamma = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), ), #|---------------------------------- bound ------------------------------------|# lower = list(c(0, 0), c(0, 0, 0), c(0, 0, 0)), upper = list(c(1, 5), c(1, 1, 5), c(1, 1, 5)), #|----------------------------- interation number -----------------------------|# iteration_s = 100, iteration_f = c(100, 10), #|-------------------------------- algorithms ---------------------------------|# nc = 1, # <nc > 1>: parallel computation across subjects # Base R Optimization #algorithm = "L-BFGS-B" # Gradient-Based (stats) #|-----------------------------------------------------------------------------|# # Specialized External Optimization #algorithm = "GenSA" # Simulated Annealing (GenSA) #algorithm = "GA" # Genetic Algorithm (GA) #algorithm = "DEoptim" # Differential Evolution (DEoptim) #algorithm = "PSO" # Particle Swarm Optimization (pso) #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO) #algorithm = "CMA-ES" # Covariance Matrix Adapting (cmaes) #|-----------------------------------------------------------------------------|# # Optimization Library (nloptr) algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") #|-------------------------------- algorithms ---------------------------------|# #+#############################################################################+# ) result <- dplyr::bind_rows(recovery) %>% dplyr::select(simulate_model, fit_model, iteration, everything()) # Ensure the output directory exists if (!dir.exists("../OUTPUT")) { dir.create("../OUTPUT", recursive = TRUE) } write.csv(result, file = "../OUTPUT/result_recovery.csv", row.names = FALSE) ## End(Not run)
This function processes the synthetic datasets generated by
simulate_list(). For each of these simulated datasets, it then
fits every model specified within the fit_model list. In essence,
it iteratively calls the optimize_para() function for each
generated object.
The fitting procedure is analogous to that performed by fit_p,
and it similarly leverages parallel computation across subjects
to significantly accelerate the parameter estimation process.
recovery_data( policy, estimate, list, id = 1, n_trials, n_params, funcs = NULL, model_name, fit_model, dfun, lower, upper, initial_params = NA, initial_size = 50, tolerance, seed = 123, iteration, nc = 1, algorithm )recovery_data( policy, estimate, list, id = 1, n_trials, n_params, funcs = NULL, model_name, fit_model, dfun, lower, upper, initial_params = NA, initial_size = 50, tolerance, seed = 123, iteration, nc = 1, algorithm )
policy |
[character] Specifies the learning policy to be used.
This determines how the model updates action values based on observed or
simulated choices. It can be either
default: |
estimate |
[string] Estimation method. Can be either
default: |
list |
[list] A list generated by function |
id |
[vector] Specifies which subject's data to use. In parameter and model recovery analyses, the specific subject ID is often irrelevant. Although the experimental trial order might have some randomness for each subject, the sequence of reward feedback is typically pseudo-random. The default value for this argument is default: |
n_trials |
[integer] The total number of trials in your experiment. |
n_params |
[integer] The number of free parameters in your model. |
funcs |
[character] A character vector containing the names of all user-defined functions required for the computation. When parallel computation is enabled (i.e., 'nc > 1'), user-defined models and their custom functions might not be automatically accessible within the parallel environment. Therefore, if you have created your own reinforcement learning model
that modifies the package's default four default functions
(default functions:
|
model_name |
[character] The name of modal |
fit_model |
[function] fit model object function |
dfun |
[List] A nested list that defines the probability density/mass functions (PDF/PMF) for each model's parameters. The top-level names of the list must match the model names. Each element must be another named list, where each name corresponds to a model parameter and its value is the probability density function. e.g., |
lower |
[List] The lower bounds of model's free parameters. e.g. |
upper |
[List] The upper bounds of model's free parameters. e.g. |
initial_params |
[numeric] Initial values for the free parameters that the optimization algorithm will
search from. These are primarily relevant when using algorithms that require
an explicit starting point, such as default: |
initial_size |
[integer] This parameter corresponds to the population size in genetic
algorithms ( default: |
tolerance |
[double] Convergence threshold for MAP estimation. If the change in log posterior probability between iterations is smaller than this value, the algorithm is considered to have converged and the program will stop. default: |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
iteration |
[integer] The number of iterations the optimization algorithm will perform when searching for the best-fitting parameters during the fitting phase. A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time. |
nc |
[integer] Number of cores to use for parallel processing. Since fitting optimal parameters for each subject is an independent task, parallel computation can significantly speed up the fitting process:
default: |
algorithm |
[character]
Choose an algorithm package from
In addition, any algorithm from the |
a data frame for parameter recovery and model recovery
## Not run: df_recovery <- recovery_data( list = list_simulated, policy = "off", estimate = "MAP", model_name = "RSTD", fit_model = binaryRL::RSTD, dfun = list( etan = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, etap = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), lower = c(0, 0, 0), upper = c(1, 1, 10), iteration = c(10, 3), nc = 10, algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") ) ## End(Not run)## Not run: df_recovery <- recovery_data( list = list_simulated, policy = "off", estimate = "MAP", model_name = "RSTD", fit_model = binaryRL::RSTD, dfun = list( etan = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, etap = function(x) { stats::dunif(x, min = 0, max = 1, log = TRUE) }, tau = function(x) { stats::dexp(x, rate = 1, log = TRUE) } ), lower = c(0, 0, 0), upper = c(1, 1, 10), iteration = c(10, 3), nc = 10, algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA") ) ## End(Not run)
After completing Step 3 using fit_p() to obtain the optimal parameters
for each subject and saving the resulting CSV locally, this function
allows you to load that result dataset. It then applies these optimal
parameters back into the reinforcement learning model, effectively
simulating how the "robot" (the model) would make its choices.
Based on this generated dataset, you can then analyze the robot's data in the same manner as you would analyze human behavioral data. If a particular model's fitted data can successfully reproduce the experimental effects observed in human subjects, it strongly suggests that this model is a good and valid representation of the process.
rpl_e( data, id = NULL, result, model, model_name, param_prefix = "param_", n_trials = NULL )rpl_e( data, id = NULL, result, model, model_name, param_prefix = "param_", n_trials = NULL )
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[CharacterVector] A vector specifying the subject ID(s) for which parameters should be fitted. The function will process only the subjects provided in this vector. To fit all subjects, you can either explicitly set the argument as
It is strongly recommended to avoid using simple numeric sequences like
default: |
result |
[data.frame] Output data generated by the |
model |
[Function] A model function to be applied in evaluating the experimental effect. |
model_name |
[string] A character string specifying the name of the model to extract from the result. |
param_prefix |
[string] A prefix string used to identify parameter columns in the result data default: |
n_trials |
[integer] Represents the total number of trials a single subject experienced
in the experiment. If this parameter is kept at its default value
of default: |
A list, where each element is a data.frame representing one subject's
results. Each data.frame includes the value update history for each option,
the learning rate (eta), utility function (gamma), and other
relevant information used in each update.
## Not run: list <- list() list[[1]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::TD, model_name = "TD" ) ) list[[2]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::RSTD, model_name = "RSTD" ) ) list[[3]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::Utility, model_name = "Utility" ) ) ## End(Not run)## Not run: list <- list() list[[1]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::TD, model_name = "TD" ) ) list[[2]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::RSTD, model_name = "RSTD" ) ) list[[3]] <- dplyr::bind_rows( binaryRL::rpl_e( data = binaryRL::Mason_2024_G2, result = read.csv("../OUTPUT/result_comparison.csv"), model = binaryRL::Utility, model_name = "Utility" ) ) ## End(Not run)
RSTD(params)RSTD(params)
params |
[vector] algorithm packages accept only one argument |
loss [numeric]
algorithm packages accept only one return
## Not run: RSTD <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1], params[2]), tau = c(params[3]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)## Not run: RSTD <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1], params[2]), tau = c(params[3]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)
This function is designed to construct and customize reinforcement learning models.
Items for model construction:
Data Input and Specification: You must provide the raw
dataset for analysis. Crucially, you need to inform the
run_m function about the corresponding column
names within your dataset
(e.g.,
Mason_2024_G1,
Mason_2024_G2
).
This is a game, so it's critical that your dataset includes rewards
for both the human-chosen option and the unchosen options.
Customizable RL Models: This function allows you to define and adjust the number of free parameters to create various reinforcement learning models.
Value Function:
Learning Rate:
By adjusting the number of eta, you can construct basic
reinforcement learning models such as Temporal Difference (TD)
and Risk Sensitive Temporal Difference (RSTD).
You can also directly adjust func_eta to
define your own custom learning rate function.
Utility Function: You can directly adjust the form
of func_gamma to incorporate the
principles of Kahneman's Prospect Theory. Currently,
the built-in func_gamma only takes the
form of a power function, consistent with Stevens' Power Law.
Exploration-Exploitation Trade-off:
Initial Values: This involves setting the initial expected value for each option when it hasn't been chosen yet. A higher initial value encourages exploration.
Epsilon: Adjusting the threshold,
epsilon and lambda parameters can lead to
exploration strategies such as epsilon-first, epsilon-greedy,
or epsilon-decreasing.
Upper-Confidence-Bound: By adjusting pi,
it controls the degree of exploration by scaling the uncertainty
bonus given to less-explored options.
Soft-Max: By adjusting the inverse temperature
parameter tau, this controls the agent's sensitivity to
value differences. A higher value of tau means greater emphasis
on value differences, leading to more exploitation. A smaller
value of tau indicates a greater tendency towards exploration.
Objective Function Format for Optimization: Once your
model is defined in run_m, it must be structured
as an objective function that accepts params as input and returns
a loss value (typically logL). This format ensures compatibility
with the algorithm package, which uses it to estimate optimal
parameters. For an example of a standard objective function format, see
TD,
RSTD,
Utility.
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
run_m( name = NA, mode = c("simulate", "fit", "replay"), policy = c("on", "off"), data, id, n_params, n_trials, gamma = 1, eta, initial_value = NA_real_, threshold = 1, epsilon = NA, lambda = NA, pi = NA, tau = NA, lapse = 0.02, alpha = NA, beta = NA, priors = NULL, util_func = func_gamma, rate_func = func_eta, expl_func = func_epsilon, bias_func = func_pi, prob_func = func_tau, loss_func = func_logl, sub = "Subject", time_line = c("Block", "Trial"), L_choice = "L_choice", R_choice = "R_choice", L_reward = "L_reward", R_reward = "R_reward", sub_choose = "Sub_Choose", rob_choose = "Rob_Choose", raw_cols = NULL, var1 = NA_character_, var2 = NA_character_, seed = 123, digits_1 = 2, digits_2 = 5, engine = "cpp" )run_m( name = NA, mode = c("simulate", "fit", "replay"), policy = c("on", "off"), data, id, n_params, n_trials, gamma = 1, eta, initial_value = NA_real_, threshold = 1, epsilon = NA, lambda = NA, pi = NA, tau = NA, lapse = 0.02, alpha = NA, beta = NA, priors = NULL, util_func = func_gamma, rate_func = func_eta, expl_func = func_epsilon, bias_func = func_pi, prob_func = func_tau, loss_func = func_logl, sub = "Subject", time_line = c("Block", "Trial"), L_choice = "L_choice", R_choice = "R_choice", L_reward = "L_reward", R_reward = "R_reward", sub_choose = "Sub_Choose", rob_choose = "Rob_Choose", raw_cols = NULL, var1 = NA_character_, var2 = NA_character_, seed = 123, digits_1 = 2, digits_2 = 5, engine = "cpp" )
name |
[string] The name of your RL model |
mode |
[string] This parameter controls the function's operational mode. It has three possible values, each typically associated with a specific function:
In most cases, you won't need to modify this parameter directly, as suitable default values are set for different contexts. |
policy |
[string] Specifies the learning policy to be used.
This determines how the model updates action values based on observed or
simulated choices. It can be either
|
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[string] Which subject is going to be analyzed. The value should correspond to an entry in the "sub" column, which must contain the subject IDs. e.g. |
n_params |
[integer] The number of free parameters in your model. |
n_trials |
[integer] The total number of trials in your experiment. |
gamma |
[NumericVector] Note: This should not be confused with the discount rate parameter
(also named gamma) found in Temporal Difference (TD) models.
Rescorla-Wagner model does not include a discount rate.
Here,
default: |
eta |
[NumericVector] Parameters used in the Learning Rate Function,
The structure of
TD: RSTD: |
initial_value |
[double] Subject's initial expected value for each stimulus's reward. If this value
is not set default: |
threshold |
[integer] Controls the initial exploration phase in the epsilon-first strategy.
This is the number of early trials where the subject makes purely random
choices, as they haven't yet learned the options' values. For example,
default: epsilon-first: |
epsilon |
[NumericVector] A parameter used in the epsilon-greedy exploration strategy. It
defines the probability of making a completely random choice, as opposed
to choosing based on the relative values of the left and right options.
For example, if
epsilon-greedy: |
lambda |
[NumericVector] A numeric value that controls the decay rate of exploration probability
in the epsilon-decreasing strategy. A higher
epsilon-decreasing: |
pi |
[NumericVector] Parameter used in the Upper-Confidence-Bound (UCB) action selection
formula.
default: |
tau |
[NumericVector] Parameters used in the Soft-Max Function.
default |
lapse |
[double] A numeric value between 0 and 1, representing the lapse rate. You can interpret this parameter as the probability of the agent "slipping" or making a random choice, irrespective of the learned action values. This accounts for moments of inattention or motor errors. In this sense, it represents the minimum probability with which any given option will be selected. It is a free parameter that acknowledges that individuals do not always make decisions with full concentration throughout an experiment. From a modeling perspective, the lapse rate is crucial for preventing the
log-likelihood calculation from returning
default: This ensures each option has a minimum selection probability of 1 percent in TAFC tasks. |
alpha |
[NumericVector] Extra parameters that may be used in functions. |
beta |
[NumericVector] Extra parameters that may be used in functions. |
priors |
[list] A list specifying the prior distributions for the model parameters.
This argument is mandatory when using default: |
util_func |
[Function] Utility Function see |
rate_func |
[Function] Learning Rate Function see |
expl_func |
[Function] Exploration Strategy Function see |
bias_func |
[Function] Upper-Confidence-Bound see |
prob_func |
[Function] Soft-Max Function see |
loss_func |
[Function] Loss Function see |
sub |
[string] Column name of subject ID e.g. |
time_line |
[CharacterVector] A vector specifying the name of the column that the sequence of the experiment. This argument defines how the experiment is structured, such as whether it is organized by "Block" with breaks in between, and multiple trials within each block. default: |
L_choice |
[string] Column name of left choice. default: |
R_choice |
[string] Column name of right choice. default: |
L_reward |
[string] Column name of the reward of left choice default: |
R_reward |
[string] Column name of the reward of right choice default: |
sub_choose |
[string] Column name of choices made by the subject. default: |
rob_choose |
[string] Column name of choices made by the model, which you could ignore. default: |
raw_cols |
[CharacterVector] Defaults to |
var1 |
[string] Column name of extra variable 1. If your model uses more than just reward and expected value, and you need other information, such as whether the choice frame is Gain or Loss, then you can input the 'Frame' column as var1 into the model. default: |
var2 |
[string] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, into your model. default: |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
digits_1 |
[integer] The number of decimal places to retain for columns related to value function default: |
digits_2 |
[integer] The number of decimal places to retain for columns related to select function. default: |
engine |
[string] - - default: |
A list of class binaryRL containing the results of the model fitting.
data <- binaryRL::Mason_2024_G2 binaryRL.res <- binaryRL::run_m( mode = "replay", data = data, id = 18, eta = c(0.321, 0.765), tau = 0.5, n_params = 3, n_trials = 360 ) summary(binaryRL.res)data <- binaryRL::Mason_2024_G2 binaryRL.res <- binaryRL::run_m( mode = "replay", data = data, id = 18, eta = c(0.321, 0.765), tau = 0.5, n_params = 3, n_trials = 360 ) summary(binaryRL.res)
This function generates random input parameters for a model based on
user-specified distributions. For example, if the first parameter,
eta, is set to follow a uniform distribution from 0 to 1, its
values will be randomly sampled from U(0, 1).
You can also specify parameters to be drawn from a normal distribution.
For example,
eta = function() { stats::rnorm(n = 1, mean = 0.5, sd = 0.1) }
.
Make sure the last parameter, which typically represents the inverse
temperature parameter in the soft-max function, is sampled from an
exponential distribution.
simulate_list( data, id = 1, n_params, n_trials, obj_func, rfun, iteration = 10, seed = 123 )simulate_list( data, id = 1, n_params, n_trials, obj_func, rfun, iteration = 10, seed = 123 )
data |
[data.frame] This data should include the following mandatory columns:
|
id |
[vector] Specifies which subject's data to use. In parameter and model recovery analyses, the specific subject ID is often irrelevant. Although the experimental trial order might have some randomness for each subject, the sequence of reward feedback is typically pseudo-random. The default value for this argument is default: |
n_params |
[integer] The number of free parameters in your model. |
n_trials |
[integer] The total number of trials in your experiment. |
obj_func |
[function] The objective function that the optimization algorithm package accepts.
This function must strictly take only one argument, |
rfun |
[List] A nested list of functions used to generate random parameter values for simulation. The top-level elements of the list should be named according to the models. Each of these elements must be a named list of functions, where each name corresponds to a model parameter and its value is the random number generation function. e.g., |
iteration |
[integer] This parameter determines how many simulated datasets are created for subsequent model and parameter recovery analyses. default: |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: |
a list with fake data generated by random free parameters
## Not run: list_simulated <- binaryRL::simulate_list( data = binaryRL::Mason_2024_G2, obj_func = binaryRL::RSTD, n_params = 3, n_trials = 360, rfun = list( etan = function() { stats::runif(n = 1, min = 0, max = 1) }, etap = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), iteration = 10 ) ## End(Not run)## Not run: list_simulated <- binaryRL::simulate_list( data = binaryRL::Mason_2024_G2, obj_func = binaryRL::RSTD, n_params = 3, n_trials = 360, rfun = list( etan = function() { stats::runif(n = 1, min = 0, max = 1) }, etap = function() { stats::runif(n = 1, min = 0, max = 1) }, tau = function() { stats::rexp(n = 1, rate = 1) } ), iteration = 10 ) ## End(Not run)
S3method summary
## S3 method for class 'binaryRL' summary(object, ...)## S3 method for class 'binaryRL' summary(object, ...)
object |
binaryRL result |
... |
others |
summary
TD(params)TD(params)
params |
[vector] algorithm packages accept only one argument |
loss [numeric]
algorithm packages accept only one return
## Not run: TD <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1]), tau = c(params[2]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)## Not run: TD <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1]), tau = c(params[2]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)
Utility(params)Utility(params)
params |
[vector] algorithm packages accept only one argument |
loss [numeric]
algorithm packages accept only one return
## Not run: Utility <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1]), gamma = c(params[2]), tau = c(params[3]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)## Not run: Utility <- function(params) { res <- binaryRL::run_m( data = data, id = id, eta = c(params[1]), gamma = c(params[2]), tau = c(params[3]), priors = priors, n_params = n_params, n_trials = n_trials, mode = mode, policy = policy ) assign(x = "binaryRL.res", value = res, envir = binaryRL.env) loss <- switch(EXPR = estimate, "MLE" = -res$ll, "MAP" = -res$lpo) switch(EXPR = mode, "fit" = loss, "simulate" = res, "replay" = res) } ## End(Not run)