Title: | Builds General Noise SVRs |
---|---|
Description: | Builds general noise SVR models using the Naive Online R Minimization Algorithm (NORMA), an optimization method based on classical stochastic gradient descent that is suitable for computing SVR models in an online setting. |
Authors: | Jesus Prada [aut,cre] |
Maintainer: | Jesus Prada <[email protected]> |
License: | GPL-2 |
Version: | 0.1 |
Built: | 2024-11-08 02:55:51 UTC |
Source: | https://github.com/cran/NORMA |
Computes the predictor function of a general noise SVR based on NORMA optimization.
f(point, t, x, alpha, beta, f_0, kernel = function(x, y, gamma) { exp(-gamma * (norm(x - y, type = "2")^2)) }, gamma, no_beta)
point |
|
t |
time parameter value indicating the iteration we want to consider. |
x |
|
alpha |
|
beta |
|
f_0 |
initial hypothesis. |
kernel |
kernel function to use. |
gamma |
Gaussian kernel parameter gamma. |
no_beta |
|
Returns a numeric representing the prediction value.
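A predictor of this kind is a kernel expansion over the training points. The following base-R sketch illustrates that idea only; it is not the package's f(). The helper name kernel_expansion, the single coefficient vector alpha_t and the scalar offset b are assumptions made for illustration, and the way f() uses t, f_0 and no_beta is not reproduced here.

```r
# Hypothetical helper (not part of the NORMA package): evaluates
# sum_i alpha_t[i] * kernel(x[i, ], point) + b for one query point.
kernel_expansion <- function(point, x, alpha_t, b = 0, kernel, gamma = 0) {
  # Kernel value between each training row and the query point
  k_vals <- apply(x, 1, function(row) kernel(row, point, gamma))
  sum(alpha_t * k_vals) + b
}

# Linear kernel written with sum() so that it returns a plain numeric
lin <- function(x, y, gamma = 0) sum(x * y)

x_train <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)
pred <- kernel_expansion(c(1, 2, 3), x_train, alpha_t = c(1, 0.5), kernel = lin)
print(pred)  # 1 * 14 + 0.5 * 32 = 30
```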
Jesus Prada, [email protected]
The scientific paper with the theoretical background for NORMA optimization is:
Kivinen J., Smola A. J., Williamson R. C.: Online learning with kernels. IEEE Transactions on Signal Processing, vol. 52, pp. 2165-2176 (2004).
http://realm.sics.se/papers/KivSmoWil04(1).pdf
f(c(1,2,3),2,matrix(c(1,2,3,4,5,6),nrow=2,ncol=3,byrow=TRUE), matrix(c(1,2,3,4,5,6),nrow=2,ncol=3,byrow=TRUE), c(1,2),0,function(x,y,gamma=0){x%*%y},0.1,FALSE)
ILF_cost_der computes the ILF derivative value at a given point.
zero_laplace_cost_der computes the value at a given point of the loss function derivative corresponding to a zero-mean Laplace distribution.
general_laplace_cost_der computes the value at a given point of the loss function derivative corresponding to a general Laplace distribution.
zero_gaussian_cost_der computes the value at a given point of the loss function derivative corresponding to a zero-mean Gaussian distribution.
general_gaussian_cost_der computes the value at a given point of the loss function derivative corresponding to a general Gaussian distribution.
beta_cost_der computes the value at a given point of the loss function derivative corresponding to a Beta distribution.
weibull_cost_der computes the value at a given point of the loss function derivative corresponding to a Weibull distribution.
moge_cost_der computes the value at a given point of the loss function derivative corresponding to a MOGE distribution.
ILF_cost_der(phi, epsilon = 0.1, nu = 0)
zero_laplace_cost_der(phi, sigma)
general_laplace_cost_der(phi, sigma, mu)
zero_gaussian_cost_der(phi, sigma_cuad)
general_gaussian_cost_der(phi, sigma_cuad, mu)
beta_cost_der(phi, alpha, beta)
weibull_cost_der(phi, lambda, kappa)
moge_cost_der(phi, lambda, alpha, theta)
phi |
point to use as argument of the loss function derivative. |
epsilon |
width of the insensitive band. |
nu |
parameter to control the value of epsilon. |
sigma |
scale parameter of the Laplace distribution. |
mu |
location or mean parameter of the Laplace or Gaussian distribution, respectively. |
sigma_cuad |
variance parameter of the Gaussian distribution. |
alpha |
shape1 parameter of the Beta distribution or second parameter of the MOGE distribution. |
beta |
shape2 parameter of the Beta distribution. |
lambda |
lambda scale parameter of the Weibull distribution or first parameter of the MOGE distribution. |
kappa |
shape parameter of the Weibull distribution. |
theta |
third parameter of the MOGE distribution. |
See also 'References'.
Returns a numeric representing the derivative value at a given point.
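For the Gaussian and Laplace cases, a loss derivative of this kind can be obtained by differentiating the negative log-likelihood of the residual. The sketch below checks that derivation numerically. All function names are hypothetical and written for illustration; the Gaussian expression matches the cost_der passed in the NORMA example, (phi - mu)/sigma_cuad, while the Laplace expression is derived here under the assumption that the same convention applies. It is not the package's implementation.

```r
# Hypothetical analytic derivatives of the negative log-likelihood
gaussian_der_sketch <- function(phi, sigma_cuad, mu = 0) (phi - mu) / sigma_cuad
laplace_der_sketch  <- function(phi, sigma, mu = 0) sign(phi - mu) / sigma

# Negative log-likelihoods of a single residual phi
gaussian_nll <- function(phi, sigma_cuad, mu = 0) {
  -dnorm(phi, mean = mu, sd = sqrt(sigma_cuad), log = TRUE)
}
laplace_nll <- function(phi, sigma, mu = 0) {
  log(2 * sigma) + abs(phi - mu) / sigma
}

# Central finite difference used as an independent check
central_diff <- function(fn, phi, h = 1e-6, ...) {
  (fn(phi + h, ...) - fn(phi - h, ...)) / (2 * h)
}

num_gauss   <- central_diff(gaussian_nll, 1, sigma_cuad = 2, mu = 0.5)
num_laplace <- central_diff(laplace_nll, 1, sigma = 2, mu = 0.5)
ana_gauss   <- gaussian_der_sketch(1, sigma_cuad = 2, mu = 0.5)  # 0.25
ana_laplace <- laplace_der_sketch(1, sigma = 2, mu = 0.5)        # 0.5
```

The Laplace derivative is undefined at phi = mu, so the numerical check is only meaningful away from that point.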
Jesus Prada, [email protected]
The scientific paper with the theoretical background for this package is:
Prada, Jesus, and Jose Ramon Dorronsoro. "SVRs and Uncertainty Estimates in Wind Energy Prediction." Advances in Computational Intelligence. Springer International Publishing, 2015. 564-577.
http://link.springer.com/chapter/10.1007/978-3-319-19222-2_47
# ILF derivative value at point phi=1 with default epsilon.
ILF_cost_der(1)
# ILF derivative value at point phi=1 with epsilon=2.
ILF_cost_der(1,2)
# Zero-mean Laplace loss function derivative value at point phi=1 with sigma=1.
zero_laplace_cost_der(1,1)
# General Laplace loss function derivative value at point phi=1 with mu=0 and sigma=1.
general_laplace_cost_der(1,1,0)
# Zero-mean Gaussian loss function derivative value at point phi=1 with sigma_cuad=1.
zero_gaussian_cost_der(1,1)
# General Gaussian loss function derivative value at point phi=1 with mu=0 and sigma_cuad=1.
general_gaussian_cost_der(1,1,0)
# Beta loss function derivative value at point phi=1 with alpha=2 and beta=3.
beta_cost_der(1,2,3)
# Weibull loss function derivative value at point phi=1 with lambda=2 and kappa=3.
weibull_cost_der(1,2,3)
# MOGE loss function derivative value at point phi=1 with lambda=2, alpha=3 and theta=4.
moge_cost_der(1,2,3,4)
linear_kernel computes the linear kernel between two given vectors, x and y.
gaussian_kernel computes the Gaussian kernel between two given vectors, x and y.
linear_kernel(x, y, gamma = 0)
gaussian_kernel(x, y, gamma)
x |
|
y |
|
gamma |
Gaussian kernel parameter gamma. |
Linear kernel: $k(x, y) = x^{T} y$
Gaussian kernel: $k(x, y) = \exp(-\gamma \, \lVert x - y \rVert^{2})$
Returns a numeric representing the kernel value.
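The two kernels can be written in a few lines of base R. This is an illustrative sketch rather than the package source: the function names ending in _sketch are hypothetical, and the Gaussian form follows the default kernel shown in the usage of f, exp(-gamma * ||x - y||^2).

```r
# Hypothetical re-implementations for illustration only
linear_kernel_sketch <- function(x, y, gamma = 0) {
  sum(x * y)                       # inner product; gamma is unused
}
gaussian_kernel_sketch <- function(x, y, gamma) {
  exp(-gamma * sum((x - y)^2))     # squared Euclidean distance in the exponent
}

k_lin   <- linear_kernel_sketch(c(1, 2, 3), c(2, 3, 4))         # 2 + 6 + 12 = 20
k_gauss <- gaussian_kernel_sketch(c(1, 2, 3), c(2, 3, 4), 0.1)  # exp(-0.3)
```

Keeping the unused gamma argument in the linear kernel gives both kernels the same three-argument signature, so either one can be passed where a kernel function is expected.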
Jesus Prada, [email protected]
# Linear kernel value between point x=c(1,2,3) and point y=c(2,3,4).
linear_kernel(c(1,2,3),c(2,3,4))
# Gaussian kernel value between point x=c(1,2,3) and point y=c(2,3,4) with gamma=0.1.
gaussian_kernel(c(1,2,3),c(2,3,4),0.1)
mle_parameters computes the optimal parameters via MLE of a given distribution.
zero_laplace_mle computes the optimal parameters via MLE assuming a zero-mean Laplace as noise distribution.
general_laplace_mle computes the optimal parameters via MLE assuming a general Laplace as noise distribution.
zero_gaussian_mle computes the optimal parameters via MLE assuming a zero-mean Gaussian as noise distribution.
general_gaussian_mle computes the optimal parameters via MLE assuming a general Gaussian as noise distribution.
beta_mle computes the optimal parameters via MLE assuming a Beta as noise distribution.
weibull_mle computes the optimal parameters via MLE assuming a Weibull as noise distribution.
moge_mle computes the optimal parameters via MLE assuming a MOGE as noise distribution.
mle_parameters(phi, dist = "nm", ...)
zero_laplace_mle(phi)
general_laplace_mle(phi)
zero_gaussian_mle(phi)
general_gaussian_mle(phi)
beta_mle(phi, m1 = mean(phi, na.rm = T), m2 = mean(phi^2, na.rm = T), alpha_0 = (m1 * (m1 - m2))/(m2 - m1^2), beta_0 = (alpha_0 * (1 - m1)/m1))
weibull_mle(phi, k_0 = 1)
moge_mle(phi, lambda_0 = 1, alpha_0 = 1, theta_0 = 1)
phi |
a vector with residual values used to estimate the parameters. |
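dist |
assumed distribution for the noise in the data. Possible values to take: "l" (zero-mean Laplace), "lm" (general Laplace), "n" (zero-mean Gaussian), "nm" (general Gaussian), "b" (Beta), "w" (Weibull), "moge" (MOGE). |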
... |
additional arguments to be passed to the low level functions (see below). |
m1 |
first moment of the residuals. Used to compute the default alpha_0 and beta_0. |
m2 |
second moment of the residuals. Used to compute the default alpha_0. |
alpha_0 |
initial value for Newton-Raphson method for the parameter alpha. |
beta_0 |
initial value for Newton-Raphson method for the parameter beta. |
k_0 |
initial value for Newton-Raphson method for the parameter kappa. |
lambda_0 |
initial value for Newton-Raphson method for the parameter lambda. |
theta_0 |
initial value for Newton-Raphson method for the parameter theta. See also 'Details' and multiroot. |
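For the zero-mean Laplace distribution the optimal MLE parameters are
$\hat{\mu} = 0$, $\hat{\sigma} = \frac{1}{N} \sum_{i=1}^{N} |\phi_i|$, where $\{\phi_i\}_{i=1}^{N}$ are the residuals passed as argument.
For the general Laplace distribution the optimal MLE parameters are
$\hat{\mu} = \mathrm{median}(\phi_i)$, $\hat{\sigma} = \frac{1}{N} \sum_{i=1}^{N} |\phi_i - \hat{\mu}|$, where $\{\phi_i\}_{i=1}^{N}$ are the residuals passed as argument.
For the zero-mean Gaussian distribution the optimal MLE parameters are
$\hat{\mu} = 0$, $\hat{\sigma}^{2} = \frac{1}{N} \sum_{i=1}^{N} \phi_i^{2}$, where $\{\phi_i\}_{i=1}^{N}$ are the residuals passed as argument.
For the general Gaussian distribution the optimal MLE parameters are
$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} \phi_i$, $\hat{\sigma}^{2} = \frac{1}{N} \sum_{i=1}^{N} (\phi_i - \hat{\mu})^{2}$, where $\{\phi_i\}_{i=1}^{N}$ are the residuals passed as argument.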
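The closed-form Laplace and Gaussian estimates are the standard maximum likelihood results (mean absolute deviation for the Laplace scale, median for its location, and sample mean and uncorrected variance for the Gaussian). A minimal base-R sketch follows; the function closed_form_mle and the layout of its result are hypothetical and are not the package's mle_parameters.

```r
# Hypothetical helper: closed-form MLE for the four Laplace/Gaussian cases
closed_form_mle <- function(phi) {
  mu_laplace <- median(phi)   # MLE of the Laplace location
  mu_gauss   <- mean(phi)     # MLE of the Gaussian mean
  list(
    zero_laplace     = list(sigma = mean(abs(phi))),
    general_laplace  = list(mu = mu_laplace,
                            sigma = mean(abs(phi - mu_laplace))),
    zero_gaussian    = list(sigma_cuad = mean(phi^2)),
    general_gaussian = list(mu = mu_gauss,
                            sigma_cuad = mean((phi - mu_gauss)^2))
  )
}

est <- closed_form_mle(c(-1, 0, 2, 3))
# est$general_laplace:  mu = 1, sigma = 1.5
# est$general_gaussian: mu = 1, sigma_cuad = 2.5
```

Note that the Gaussian variance estimate divides by N rather than N - 1, which is what maximum likelihood gives and differs from R's var().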
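For the Beta distribution the values of parameters $\alpha$ and $\beta$ are estimated using the Newton-Raphson method.
For the Weibull distribution the value of parameter $\kappa$ is estimated using the Newton-Raphson method, and then the estimated value of $\lambda$ is computed using the following closed form that depends on $\kappa$:
$\hat{\lambda} = \left( \frac{1}{N} \sum_{i=1}^{N} \phi_i^{\hat{\kappa}} \right)^{1/\hat{\kappa}}$
For the MOGE distribution the values of parameters $\lambda$, $\alpha$ and $\theta$ are estimated using the Newton-Raphson method.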
See also 'References'.
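The two-step Weibull estimation described above (a root search for the shape, then a closed form for the scale) can be sketched as follows. This is an assumption-laden illustration, not the package's weibull_mle: the helper name is hypothetical, and base R's uniroot is swapped in for the Newton-Raphson iteration, solving the standard profile score equation for the shape.

```r
# Hypothetical helper: Weibull MLE with a bracketing root search for kappa
weibull_mle_sketch <- function(phi, interval = c(0.1, 20)) {
  stopifnot(all(phi > 0))          # Weibull support is strictly positive
  log_phi <- log(phi)
  # Profile score for the shape; its root is the MLE of kappa
  score <- function(k) {
    sum(phi^k * log_phi) / sum(phi^k) - 1 / k - mean(log_phi)
  }
  kappa  <- uniroot(score, interval = interval, tol = 1e-10)$root
  lambda <- mean(phi^kappa)^(1 / kappa)   # closed form given kappa
  list(lambda = lambda, kappa = kappa)
}

set.seed(1)
sample_phi  <- rweibull(5000, shape = 2, scale = 3)
weibull_est <- weibull_mle_sketch(sample_phi)
```

With 5000 simulated residuals the estimates should land close to the true shape 2 and scale 3; the search interval is an arbitrary choice that would need widening for more extreme shapes.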
mle_parameters returns a list with the estimated parameters. Depending on the distribution, these parameters will be one or more of the following:
scale parameter of the Laplace distribution.
location or mean parameter of the Laplace or Gaussian distribution, respectively.
variance parameter of the Gaussian distribution.
shape1 parameter of the Beta distribution or second parameter of the MOGE distribution.
shape2 parameter of the Beta distribution.
shape parameter of the Weibull distribution.
lambda scale parameter of the Weibull distribution or first parameter of the MOGE distribution.
third parameter of the MOGE distribution.
Jesus Prada, [email protected]
The scientific paper with the theoretical background for this package is:
Prada, Jesus, and Jose Ramon Dorronsoro. "SVRs and Uncertainty Estimates in Wind Energy Prediction." Advances in Computational Intelligence. Springer International Publishing, 2015. 564-577.
http://link.springer.com/chapter/10.1007/978-3-319-19222-2_47
# Estimate optimal parameters using default distribution ("nm").
mle_parameters(rnorm(100))
# Estimate optimal parameters using "lm" distribution.
mle_parameters(rnorm(100),dist="lm")
# Equivalent to mle_parameters(rnorm(100),dist="l")
zero_laplace_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="lm")
general_laplace_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="n")
zero_gaussian_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="nm")
general_gaussian_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="b")
beta_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="w")
weibull_mle(rnorm(100))
# Equivalent to mle_parameters(rnorm(100),dist="moge")
moge_mle(rnorm(100))
Computes general noise SVR based on NORMA optimization.
NORMA(x, y, f_0 = 0, beta_0 = 0, lambda = 0, rate = function(t) { 1 }, kernel = linear_kernel, cost_der = ILF_cost_der, cost_name = "ILF_cost_der", gamma = 1, max_iterations = nrow(x), stopping_threshold = 0, trace = TRUE, no_beta = TRUE, fixed_epsilon = TRUE, ...)
x |
|
y |
|
f_0 |
initial hypothesis. |
beta_0 |
initial value for the offset b. |
lambda |
NORMA optimization parameter lambda. |
rate |
learning rate for NORMA optimization. Must be a function with one argument. |
kernel |
kernel function to use. Must be a function with three arguments, such as gaussian_kernel(x, y, gamma). |
cost_der |
Loss function derivative to use. See also ILF_cost_der. Must be "ILF_cost_der" when ILF derivative is used. |
cost_name |
|
gamma |
Gaussian kernel parameter gamma. |
max_iterations |
maximum number of NORMA iterations computed. |
stopping_threshold |
value indicating when to stop NORMA optimization. See also 'Details'. |
trace |
|
no_beta |
|
fixed_epsilon |
|
... |
additional arguments to be passed to the low level functions. |
Optimization will stop when the sum of the differences between all training predicted values of the present iteration and those of the previous iteration does not exceed stopping_threshold.
Returns a list containing:
a matrix representing the alpha parameters of NORMA optimization in each iteration, one per row;
a numeric representing the beta parameter of NORMA optimization in each iteration;
the number of NORMA iterations performed.
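The online update behind this kind of optimization can be sketched in base R. The code below follows the general stochastic gradient step on a kernel expansion from Kivinen et al. (shrink the existing coefficients, then add one coefficient for the new point); it is not the package's NORMA(). The function name norma_sketch, the constant learning rate eta, the residual sign convention phi = y - f(x), the absence of an offset, and the single pass over the data are all assumptions made for illustration.

```r
# Hypothetical sketch of a NORMA-style online update (no offset term)
norma_sketch <- function(x, y, eta = 0.5, lambda = 0,
                         kernel = function(u, v) sum(u * v),
                         cost_der = function(phi) phi) {
  n <- nrow(x)
  alpha <- numeric(n)                      # one coefficient per training point
  for (t in seq_len(n)) {
    # Prediction at x_t using only the points seen before step t
    f_t <- 0
    for (i in seq_len(t - 1)) {
      f_t <- f_t + alpha[i] * kernel(x[i, ], x[t, ])
    }
    phi <- y[t] - f_t                      # residual (sign convention assumed)
    alpha <- (1 - eta * lambda) * alpha    # regularization shrinks old terms
    alpha[t] <- eta * cost_der(phi)        # gradient step adds the new term
  }
  alpha
}

# Noise-free linear data y = 2 * x, squared loss (cost_der = identity)
x_train <- matrix(seq(-1, 1, length.out = 50), ncol = 1)
y_train <- 2 * x_train[, 1]
alpha   <- norma_sketch(x_train, y_train)

# Prediction at the point 1 with the linear kernel
pred_at_1 <- sum(alpha * x_train[, 1] * 1)
```

On this toy problem the prediction at 1 approaches the true value 2 after a single pass. With lambda greater than 0 the older coefficients decay geometrically, which is what keeps the expansion bounded in an online setting.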
Jesus Prada, [email protected]
The scientific paper with the theoretical background for NORMA optimization is:
Kivinen J., Smola A. J., Williamson R. C.: Online learning with kernels. IEEE Transactions on Signal Processing, vol. 52, pp. 2165-2176 (2004).
http://realm.sics.se/papers/KivSmoWil04(1).pdf
NORMA(x=matrix(rnorm(10),nrow=10,ncol=1,byrow=TRUE),y=rnorm(10),kernel=function(x,y,gamma=0){x%*%y}, cost_der=function(phi,sigma_cuad,mu){return((phi-mu)/sigma_cuad)},cost_name="example", sigma_cuad=1,mu=0)