openGPMP
Open Source Mathematics Package
gpmp::ml::SVC Class Reference

Support Vector Classifier (SVC) for binary classification using Stochastic Gradient Descent.

#include <svc.hpp>

Public Member Functions

 SVC (double C_=1.0, double l_rate=0.01, int max_iters=1000, double tol=1e-4)
 Constructor for SVC class.

void fit (const std::vector< std::vector< double >> &X_train, const std::vector< int > &y_train)
 Fit the SVC model to the training data.

std::vector< int > predict (const std::vector< std::vector< double >> &X_test)
 Predict labels for given test data.

std::vector< double > predict_proba (const std::vector< std::vector< double >> &X_test)
 Predict class probabilities for given test data.

double score (const std::vector< std::vector< double >> &X_test, const std::vector< int > &y_test)
 Calculate the accuracy of the model on given test data.

void set_kernel (const std::string &k_type)
 Set the kernel type for the SVC.

void set_kernel_parameters (double k_param)
 Set the kernel parameters for the SVC.

void set_random_state (int seed)
 Set the random seed for reproducibility.

void set_verbose (bool vbose)
 Enable or disable verbose output during training.

void set_penalty (const std::string &p_type)
 Set the penalty type for regularization.

double cross_val_score (const std::vector< std::vector< double >> &X, const std::vector< int > &y, int cv=5)
 Perform k-fold cross-validation on the model.

std::vector< double > grid_search (const std::vector< std::vector< double >> &X, const std::vector< int > &y, const std::vector< double > &C_values, const std::vector< double > &kernel_params, int cv=5)
 Perform grid search for hyperparameter tuning.

double hinge_loss (double prediction, int label)
 Compute the hinge loss for a given prediction and true label.

double compute_loss (const std::vector< std::vector< double >> &X, const std::vector< int > &y)
 Compute the total loss (including regularization) for the model.

void update_weights (const std::vector< std::vector< double >> &X, const std::vector< int > &y)
 Update weights and bias using stochastic gradient descent.

double kernel (const std::vector< double > &x1, const std::vector< double > &x2)
 Compute the kernel function between two vectors.

double dot_product (const std::vector< double > &x1, const std::vector< double > &x2)
 Compute the dot product between two vectors.

double sigmoid (double z)
 Sigmoid activation function.

std::vector< int > k_fold_indices (int num_instances, int k)
 Generate k-fold indices for cross-validation.

double accuracy (const std::vector< int > &predictions, const std::vector< int > &labels)
 Compute the accuracy of predictions.
 

Public Attributes

double C
 
double learning_rate
 
int max_iterations
 
double tolerance
 
std::string kernel_type
 
double kernel_param
 
int random_state
 
bool verbose
 
std::string penalty_type
 
std::vector< double > weights
 
double bias
 

Detailed Description

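Support Vector Classifier (SVC) for binary classification, trained by stochastic gradient descent on the hinge loss with an L2 penalty. Labels are expected to be -1 or +1, matching the values returned by predict(). Only a linear decision function is implemented.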

Definition at line 50 of file svc.hpp.

Constructor & Destructor Documentation

◆ SVC()

gpmp::ml::SVC::SVC ( double  C_ = 1.0,
double  l_rate = 0.01,
int  max_iters = 1000,
double  tol = 1e-4 
)

Constructor for SVC class.

Parameters
    C_         Regularization parameter (default: 1.0)
    l_rate     Learning rate for stochastic gradient descent (default: 0.01)
    max_iters  Maximum number of iterations for training (default: 1000)
    tol        Tolerance for convergence (default: 1e-4)

Definition at line 38 of file svc.cpp.

    : C(C_), learning_rate(l_rate), max_iterations(max_iters), tolerance(tol) {
}

Member Function Documentation

◆ accuracy()

double gpmp::ml::SVC::accuracy ( const std::vector< int > &  predictions,
const std::vector< int > &  labels 
)

Compute the accuracy of predictions.

Parameters
    predictions  Predicted labels
    labels       True labels
Returns
    Fraction of predictions that match the corresponding labels

Definition at line 249 of file svc.cpp.

{
    int correct = 0;
    for (size_t i = 0; i < predictions.size(); ++i) {
        if (predictions[i] == labels[i]) {
            correct++;
        }
    }
    return static_cast<double>(correct) / predictions.size();
}
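
As a small worked example of the computation above, the following standalone sketch reimplements the same ratio outside the class. The free function name accuracy_of is hypothetical, and the size check and empty-input guard are additions that the listing above does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free-function version of the accuracy computation.
// Returns the fraction of positions where prediction and label agree.
double accuracy_of(const std::vector<int> &predictions,
                   const std::vector<int> &labels) {
    // Assumption: both vectors describe the same instances.
    assert(predictions.size() == labels.size());
    if (predictions.empty()) {
        return 0.0; // avoids 0/0; this guard is not in the original listing
    }
    std::size_t correct = 0;
    for (std::size_t i = 0; i < predictions.size(); ++i) {
        if (predictions[i] == labels[i]) {
            ++correct;
        }
    }
    return static_cast<double>(correct) / predictions.size();
}
```

For predictions {1, -1, 1, 1} against labels {1, 1, 1, -1}, two of the four positions match, so the result is 0.5.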

◆ compute_loss()

double gpmp::ml::SVC::compute_loss ( const std::vector< std::vector< double >> &  X,
const std::vector< int > &  y 
)

Compute the total loss (including regularization) for the model.

Parameters
    X  Input features
    y  True labels
Returns
    Total loss

Definition at line 79 of file svc.cpp.

{
    double loss = 0.0;
    for (size_t i = 0; i < X.size(); ++i) {
        double prediction = 0.0;
        for (size_t j = 0; j < X[i].size(); ++j) {
            prediction += X[i][j] * weights[j];
        }
        prediction += bias;
        loss += hinge_loss(prediction, y[i]);
    }
    // Add L2 regularization
    for (double weight : weights) {
        loss += 0.5 * C * weight * weight;
    }
    return loss / X.size();
}

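
Read off the loops in the listing, the value returned by compute_loss() can be written as below, with n = X.size(), w = weights, and b = bias. The symbols are notation introduced here for illustration, not taken from the source.

```latex
\[
L(w, b) = \frac{1}{n}\left[\,\sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w^\top x_i + b)\bigr)
          \;+\; \frac{C}{2}\sum_{j} w_j^{2}\right]
\]
```

Two details follow directly from the code: the regularization term is divided by n together with the hinge term, and C multiplies the penalty, so a larger C penalizes large weights more strongly.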

◆ cross_val_score()

double gpmp::ml::SVC::cross_val_score ( const std::vector< std::vector< double >> &  X,
const std::vector< int > &  y,
int  cv = 5 
)

Perform k-fold cross-validation on the model.

Parameters
    X   Input features for cross-validation
    y   True labels for cross-validation
    cv  Number of folds (default: 5)
Returns
    Average accuracy over the folds

Definition at line 158 of file svc.cpp.

{
    std::vector<int> fold_sizes = k_fold_indices(X.size(), cv);
    double avg_score = 0.0;
    for (int i = 0; i < cv; ++i) {
        std::vector<std::vector<double>> X_train, X_valid;
        std::vector<int> y_train, y_valid;
        int start = 0;
        for (int j = 0; j < cv; ++j) {
            if (j != i) {
                int end = start + fold_sizes[j];
                for (int k = start; k < end; ++k) {
                    X_train.push_back(X[k]);
                    y_train.push_back(y[k]);
                }
            } else {
                int end = start + fold_sizes[j];
                for (int k = start; k < end; ++k) {
                    X_valid.push_back(X[k]);
                    y_valid.push_back(y[k]);
                }
            }
            start += fold_sizes[j];
        }
        fit(X_train, y_train);
        double score_val = score(X_valid, y_valid);
        if (verbose) {
            std::cout << "Cross-validation fold " << i + 1
                      << " accuracy: " << score_val << std::endl;
        }
        avg_score += score_val;
    }
    return avg_score / cv;
}
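
The partitioning done inside the loop can be isolated as a small helper. The sketch below works on instance indices rather than on the data itself; the names FoldSplit and split_for_fold are hypothetical and not part of openGPMP.

```cpp
#include <cassert>
#include <vector>

// Index sets for one round of k-fold cross-validation.
struct FoldSplit {
    std::vector<int> train;
    std::vector<int> valid;
};

// Hypothetical helper: fold `held_out` becomes the validation set and all
// other folds form the training set. Folds are contiguous and unshuffled,
// as in the listing above.
FoldSplit split_for_fold(int num_instances, int k, int held_out) {
    assert(k > 0 && held_out >= 0 && held_out < k);
    FoldSplit split;
    int start = 0;
    for (int j = 0; j < k; ++j) {
        // The first (num_instances % k) folds hold one extra instance.
        const int size = num_instances / k + (j < num_instances % k ? 1 : 0);
        for (int idx = start; idx < start + size; ++idx) {
            (j == held_out ? split.valid : split.train).push_back(idx);
        }
        start += size;
    }
    return split;
}
```

With 5 instances and 2 folds, holding out fold 1 gives validation indices {3, 4} and training indices {0, 1, 2}. Because the folds are contiguous, input that is sorted by label can produce folds containing a single class.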

◆ dot_product()

double gpmp::ml::SVC::dot_product ( const std::vector< double > &  x1,
const std::vector< double > &  x2 
)

Compute the dot product between two vectors.

Parameters
    x1  First vector
    x2  Second vector
Returns
    Dot product

Definition at line 227 of file svc.cpp.

{
    double result = 0.0;
    for (size_t i = 0; i < x1.size(); ++i) {
        result += x1[i] * x2[i];
    }
    return result;
}
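
The same sum can be expressed with the standard library. The sketch below uses std::inner_product from <numeric>; the function name dot and the size assertion are illustrative additions, not part of openGPMP.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical equivalent of the loop above using std::inner_product.
double dot(const std::vector<double> &x1, const std::vector<double> &x2) {
    // Assumption: the vectors have equal length. The original loop reads
    // x2[i] for every index of x1 without checking this.
    assert(x1.size() == x2.size());
    return std::inner_product(x1.begin(), x1.end(), x2.begin(), 0.0);
}
```

The initial value is written as 0.0 rather than 0 so that the accumulation is carried out in double.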

◆ fit()

void gpmp::ml::SVC::fit ( const std::vector< std::vector< double >> &  X_train,
const std::vector< int > &  y_train 
)

Fit the SVC model to the training data.

Parameters
    X_train  Input features for training (must contain at least one instance)
    y_train  Labels for training, -1 or +1

Definition at line 42 of file svc.cpp.

{
    // Initialize weights and bias. assign() overwrites every element, so a
    // repeated call (for example from cross_val_score) starts from zero;
    // resize() would keep weights left over from a previous fit.
    weights.assign(X_train[0].size(), 0.0);
    bias = 0.0;

    // Stochastic Gradient Descent
    for (int iter = 0; iter < max_iterations; ++iter) {
        update_weights(X_train, y_train);

        // Check convergence
        double loss = compute_loss(X_train, y_train);
        if (loss < tolerance) {
            break;
        }
    }
}
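
When fit() is called repeatedly on the same object, as cross_val_score() does once per fold, how the weight vector is reset matters. The sketch below contrasts std::vector::resize and std::vector::assign on a vector that already holds values; the two helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reset with resize(): elements that already exist keep their values;
// only newly created elements receive the fill value.
void reset_with_resize(std::vector<double> &w, std::size_t n) {
    w.resize(n, 0.0);
    assert(w.size() == n);
}

// Reset with assign(): every element is overwritten with the fill value.
void reset_with_assign(std::vector<double> &w, std::size_t n) {
    w.assign(n, 0.0);
    assert(w.size() == n);
}
```

Starting from {1.5, -2.0} and asking for two elements, reset_with_resize leaves the vector unchanged, while reset_with_assign produces {0.0, 0.0}.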

◆ grid_search()

std::vector< double > gpmp::ml::SVC::grid_search ( const std::vector< std::vector< double >> &  X,
const std::vector< int > &  y,
const std::vector< double > &  C_values,
const std::vector< double > &  kernel_params,
int  cv = 5 
)

Perform grid search for hyperparameter tuning.

Parameters
    X              Input features for grid search
    y              True labels for grid search
    C_values       Candidate values for the regularization parameter
    kernel_params  Candidate values for the kernel parameter
    cv             Number of folds for cross-validation (default: 5)
Returns
    Best hyperparameters found, as {regularization parameter, kernel parameter}; empty if no candidate scored above 0

Definition at line 195 of file svc.cpp.

{
    std::vector<double> best_params;
    double best_score = 0.0;
    for (double val : C_values) {
        for (double param : kernel_params) {
            C = val; // apply the candidate regularization parameter
            set_kernel_parameters(param);
            set_penalty("l2");  // Default penalty type
            set_verbose(false); // Suppress verbose output
            double score = cross_val_score(X, y, cv);
            if (score > best_score) {
                best_score = score;
                best_params = {val, param};
            }
        }
    }
    return best_params;
}
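
Stripped of the class, the search is an exhaustive loop over all candidate pairs. In the sketch below the scoring step is a caller-supplied function standing in for cross_val_score(); best_pair and evaluate are hypothetical names. Unlike the listing, the sketch seeds the best result with the first candidate, so it returns a valid pair even when every score is 0 or negative.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical generic grid search: tries every (c, param) pair and keeps
// the pair with the highest score reported by the caller-supplied evaluator.
std::pair<double, double>
best_pair(const std::vector<double> &c_values,
          const std::vector<double> &params,
          const std::function<double(double, double)> &evaluate) {
    assert(!c_values.empty() && !params.empty());
    // Seed with the first candidate (it is evaluated again in the loop).
    std::pair<double, double> best{c_values.front(), params.front()};
    double best_score = evaluate(best.first, best.second);
    for (double c : c_values) {
        for (double p : params) {
            const double s = evaluate(c, p);
            if (s > best_score) {
                best_score = s;
                best = {c, p};
            }
        }
    }
    return best;
}
```

The cost is one evaluation per pair, so with cross-validation as the evaluator the total number of model fits is roughly the number of C candidates times the number of kernel parameter candidates times the number of folds.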

◆ hinge_loss()

double gpmp::ml::SVC::hinge_loss ( double  prediction,
int  label 
)

Compute the hinge loss for a given prediction and true label.

Parameters
    prediction  Predicted value (raw decision score)
    label       True label, -1 or +1
Returns
    Hinge loss

Definition at line 75 of file svc.cpp.

{
    return fmax(0, 1 - label * prediction);
}
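
A few concrete values make the behavior of the hinge loss easier to see. The standalone function below mirrors the one-line listing; the name hinge is hypothetical and the label assertion is an added assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical standalone hinge loss for a label in {-1, +1}.
double hinge(double prediction, int label) {
    assert(label == 1 || label == -1); // assumption: labels are -1 or +1
    return std::max(0.0, 1.0 - label * prediction);
}
```

With label +1, a score of 2.0 lies beyond the margin and costs 0, while a score of 0.5 is on the correct side but inside the margin and costs 0.5. With label -1, the same score of 0.5 is on the wrong side and costs 1.5.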

◆ k_fold_indices()

std::vector< int > gpmp::ml::SVC::k_fold_indices ( int  num_instances,
int  k 
)

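Compute fold sizes for k-fold cross-validation. Despite its name, the function returns the size of each fold, not instance indices.

Parameters
    num_instances  Total number of instances
    k              Number of folds
Returns
    Vector of k fold sizes; the first num_instances % k folds hold one extra instance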

Definition at line 240 of file svc.cpp.

{
    std::vector<int> fold_sizes(k, num_instances / k);
    int remainder = num_instances % k;
    for (int i = 0; i < remainder; ++i) {
        fold_sizes[i]++;
    }
    return fold_sizes;
}
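
To show how the returned sizes relate to actual instance positions, the sketch below repeats the size computation and then converts the sizes into half-open index ranges. Both function names are hypothetical, and fold_ranges is an illustrative addition that openGPMP does not provide.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same size computation as the listing above: the first (n % k) folds get
// one extra instance.
std::vector<int> fold_sizes_for(int num_instances, int k) {
    assert(k > 0); // assumption: at least one fold
    std::vector<int> sizes(k, num_instances / k);
    for (int i = 0; i < num_instances % k; ++i) {
        ++sizes[i];
    }
    return sizes;
}

// Hypothetical helper: turn fold sizes into half-open [start, end) ranges.
std::vector<std::pair<int, int>> fold_ranges(const std::vector<int> &sizes) {
    std::vector<std::pair<int, int>> ranges;
    int start = 0;
    for (int size : sizes) {
        ranges.emplace_back(start, start + size);
        start += size;
    }
    return ranges;
}
```

For 10 instances and 3 folds the sizes are {4, 3, 3}, which correspond to the ranges [0, 4), [4, 7), and [7, 10).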

◆ kernel()

double gpmp::ml::SVC::kernel ( const std::vector< double > &  x1,
const std::vector< double > &  x2 
)

Compute the kernel function between two vectors. Only the linear kernel is implemented; any other kernel_type falls back to the dot product.

Parameters
    x1  First vector
    x2  Second vector
Returns
    Kernel value

Definition at line 217 of file svc.cpp.

{
    if (kernel_type == "linear") {
        return dot_product(x1, x2);
    } else {
        // Default to linear kernel if unknown kernel type
        return dot_product(x1, x2);
    }
}


◆ predict()

std::vector< int > gpmp::ml::SVC::predict ( const std::vector< std::vector< double >> &  X_test)

Predict labels for given test data.

Parameters
    X_test  Input features for prediction
Returns
    Predicted labels: +1 if the decision score is non-negative, otherwise -1

Definition at line 61 of file svc.cpp.

{
    std::vector<int> predictions;
    for (const auto &instance : X_test) {
        double score = 0.0;
        for (size_t i = 0; i < instance.size(); ++i) {
            score += instance[i] * weights[i];
        }
        score += bias;
        int prediction = (score >= 0) ? 1 : -1;
        predictions.push_back(prediction);
    }
    return predictions;
}

◆ predict_proba()

std::vector< double > gpmp::ml::SVC::predict_proba ( const std::vector< std::vector< double >> &  X_test)

Predict class probabilities for given test data.

Parameters
    X_test  Input features for prediction
Returns
    One value per instance: the sigmoid of the linear decision score, an uncalibrated estimate of the probability of the +1 class

Definition at line 118 of file svc.cpp.

{
    std::vector<double> probabilities;
    for (const auto &instance : X_test) {
        double score = 0.0;
        for (size_t i = 0; i < instance.size(); ++i) {
            score += instance[i] * weights[i];
        }
        score += bias;
        double prob = sigmoid(score);
        probabilities.push_back(prob);
    }
    return probabilities;
}
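
Since predict() thresholds the raw score at 0 and predict_proba() passes the same score through a sigmoid, the two outputs are linked: a score of 0 maps to 0.5. The sketch below checks that relationship for a single score; all three function names are hypothetical stand-ins for the member functions.

```cpp
#include <cassert>
#include <cmath>

// Logistic sigmoid, same formula as SVC::sigmoid().
double sigmoid_of(double z) {
    return 1.0 / (1.0 + std::exp(-z));
}

// Label rule used by SVC::predict(): non-negative scores map to +1.
int label_of(double score) {
    return (score >= 0) ? 1 : -1;
}

// Hypothetical check: thresholding the sigmoid output at 0.5 gives the same
// label as thresholding the raw score at 0. Floating-point rounding can
// break this for negative scores extremely close to 0.
bool consistent(double score) {
    assert(std::isfinite(score));
    const int from_prob = (sigmoid_of(score) >= 0.5) ? 1 : -1;
    return from_prob == label_of(score);
}
```

The sigmoid output here is a monotone rescaling of the decision score, not a probability fitted to held-out data.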

◆ score()

double gpmp::ml::SVC::score ( const std::vector< std::vector< double >> &  X_test,
const std::vector< int > &  y_test 
)

Calculate the accuracy of the model on given test data.

Parameters
    X_test  Input features for evaluation
    y_test  True labels for evaluation
Returns
    Accuracy of the model

Definition at line 132 of file svc.cpp.

{
    std::vector<int> predictions = predict(X_test);
    return accuracy(predictions, y_test);
}

◆ set_kernel()

void gpmp::ml::SVC::set_kernel ( const std::string &  k_type)

Set the kernel type for the SVC.

Parameters
    k_type  Kernel type (e.g., "linear")

Definition at line 138 of file svc.cpp.

{
    this->kernel_type = k_type;
}

◆ set_kernel_parameters()

void gpmp::ml::SVC::set_kernel_parameters ( double  k_param)

Set the kernel parameters for the SVC.

Parameters
    k_param  Kernel parameter value

Definition at line 142 of file svc.cpp.

{
    this->kernel_param = k_param;
}

◆ set_penalty()

void gpmp::ml::SVC::set_penalty ( const std::string &  p_type)

Set the penalty type for regularization.

Parameters
    p_type  Penalty type (e.g., "l2")

Definition at line 154 of file svc.cpp.

{
    this->penalty_type = p_type;
}

◆ set_random_state()

void gpmp::ml::SVC::set_random_state ( int  seed)

Set the random seed for reproducibility.

Parameters
    seed  Random seed

Definition at line 146 of file svc.cpp.

{
    this->random_state = seed;
}
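
The setter only stores the seed, and none of the listings on this page read random_state. As one hedged illustration of how a caller might put such a seed to use, the sketch below builds a reproducible shuffled index order with std::mt19937 and std::shuffle, which could be applied to the data before it is split into folds. The function name shuffled_indices is hypothetical and not part of openGPMP.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: a permutation of 0..n-1 determined by `seed`.
std::vector<int> shuffled_indices(int n, int seed) {
    assert(n >= 0);
    std::vector<int> idx(n);
    std::iota(idx.begin(), idx.end(), 0); // 0, 1, ..., n-1
    std::mt19937 rng(static_cast<std::mt19937::result_type>(seed));
    // Same seed gives the same order on a given standard library; the exact
    // permutation may differ between standard library implementations.
    std::shuffle(idx.begin(), idx.end(), rng);
    return idx;
}
```

Reordering X and y with the same index vector keeps each instance paired with its label.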

◆ set_verbose()

void gpmp::ml::SVC::set_verbose ( bool  vbose)

Enable or disable verbose output during training.

Parameters
    vbose  Verbose flag

Definition at line 150 of file svc.cpp.

{
    this->verbose = vbose;
}

◆ sigmoid()

double gpmp::ml::SVC::sigmoid ( double  z)

Sigmoid activation function.

Parameters
    z  Input value
Returns
    Sigmoid value, 1 / (1 + exp(-z)), in the range [0, 1]

Definition at line 236 of file svc.cpp.

{
    return 1.0 / (1.0 + exp(-z));
}

◆ update_weights()

void gpmp::ml::SVC::update_weights ( const std::vector< std::vector< double >> &  X,
const std::vector< int > &  y 
)

Update weights and bias using stochastic gradient descent.

Parameters
    X  Input features
    y  True labels, -1 or +1

Definition at line 97 of file svc.cpp.

{
    for (size_t i = 0; i < X.size(); ++i) {
        double prediction = 0.0;
        for (size_t j = 0; j < X[i].size(); ++j) {
            prediction += X[i][j] * weights[j];
        }
        prediction += bias;
        // The hinge loss is active only when the margin y * f(x) is below 1
        if (y[i] * prediction < 1) {
            // Update weights: the subgradient is C * w - y * x
            for (size_t j = 0; j < X[i].size(); ++j) {
                weights[j] -= learning_rate * (C * weights[j] - y[i] * X[i][j]);
            }
            // Update bias: the subgradient is -y, so the step adds y
            bias += learning_rate * y[i];
        }
    }
}

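
For reference, the textbook subgradient step for the per-sample objective max(0, 1 - y (w.x + b)) + 0.5 C ||w||^2 can be sketched as a standalone function. This is an illustration under stated assumptions (labels in {-1, +1}, the regularizer used by compute_loss()), not a copy of the openGPMP implementation; the name sgd_step and its return value are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-sample SGD step for the objective
//   max(0, 1 - y * (w.x + b)) + 0.5 * C * ||w||^2,  with y in {-1, +1}.
// Returns true if the sample violated the margin and an update was applied.
bool sgd_step(std::vector<double> &w, double &b,
              const std::vector<double> &x, int y,
              double learning_rate, double C) {
    assert(w.size() == x.size());
    double prediction = b;
    for (std::size_t j = 0; j < x.size(); ++j) {
        prediction += w[j] * x[j];
    }
    if (y * prediction >= 1.0) {
        return false; // margin satisfied; this sketch applies no update
    }
    for (std::size_t j = 0; j < x.size(); ++j) {
        // Subgradient with respect to w_j is C * w_j - y * x_j.
        w[j] -= learning_rate * (C * w[j] - y * x[j]);
    }
    // Subgradient with respect to b is -y, so the step adds learning_rate * y.
    b += learning_rate * y;
    return true;
}
```

Starting from w = {0, 0} and b = 0 with x = {1, 2}, y = +1, a learning rate of 0.1, and C = 1, the margin is 0, so the step fires and yields w = {0.1, 0.2} and b = 0.1. A subsequent sample x = {10, 10} with y = +1 scores 3.1 and triggers no update.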

Member Data Documentation

◆ bias

double gpmp::ml::SVC::bias

Model bias

Definition at line 175 of file svc.hpp.

◆ C

double gpmp::ml::SVC::C

Regularization parameter

Definition at line 155 of file svc.hpp.

◆ kernel_param

double gpmp::ml::SVC::kernel_param

Kernel parameter for the selected kernel

Definition at line 165 of file svc.hpp.

◆ kernel_type

std::string gpmp::ml::SVC::kernel_type

Kernel type for SVC

Definition at line 163 of file svc.hpp.

◆ learning_rate

double gpmp::ml::SVC::learning_rate

Learning rate for stochastic gradient descent

Definition at line 157 of file svc.hpp.

◆ max_iterations

int gpmp::ml::SVC::max_iterations

Maximum number of iterations for training

Definition at line 159 of file svc.hpp.

◆ penalty_type

std::string gpmp::ml::SVC::penalty_type

Penalty type for regularization

Definition at line 171 of file svc.hpp.

◆ random_state

int gpmp::ml::SVC::random_state

Random seed for reproducibility

Definition at line 167 of file svc.hpp.

◆ tolerance

double gpmp::ml::SVC::tolerance

Tolerance for convergence

Definition at line 161 of file svc.hpp.

◆ verbose

bool gpmp::ml::SVC::verbose

Verbose flag for training output

Definition at line 169 of file svc.hpp.

◆ weights

std::vector<double> gpmp::ml::SVC::weights

Model weights

Definition at line 173 of file svc.hpp.


The documentation for this class was generated from the following files: