openGPMP
Open Source Mathematics Package
gpmp::ml::BayesBernoulli Class Reference

#include <bayes_clf.hpp>

Public Member Functions

 BayesBernoulli (double alpha_param=1.0)
 Constructor for BayesBernoulli class.
 
 ~BayesBernoulli ()
 Destructor for BayesBernoulli class.
 
void train (const std::vector< std::vector< size_t >> &data, const std::vector< std::string > &labels)
 Train the classifier with a set of labeled data.
 
std::string predict (const std::vector< size_t > &newData) const
 Predict the class of a new data point.
 
void display () const
 Display the learned probabilities.
 

Public Attributes

std::unordered_map< std::string, double > class_probs
 
std::unordered_map< std::string, std::unordered_map< size_t, double > > feat_probs
 
double alpha
 

Detailed Description

Bernoulli Naive Bayes is a member of the Naive Bayes family of classifiers. It models each feature with a Bernoulli distribution, so it accepts only binary feature values, i.e., 0 or 1. It is a natural choice when every feature in the dataset is binary.
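
For reference, the textbook Bernoulli Naive Bayes decision rule can be written as below. The symbols are introduced here for illustration and do not appear in the library source: $x_j \in \{0, 1\}$ is the value of feature $j$, $p_{cj}$ is the estimated probability that feature $j$ equals 1 in class $c$, and $P(c)$ is the class prior.

```latex
\hat{c} = \arg\max_{c} \left[ \log P(c)
        + \sum_{j} \bigl( x_j \log p_{cj} + (1 - x_j) \log (1 - p_{cj}) \bigr) \right]
```

The `predict()` listing on this page accumulates only the $x_j \log p_{cj}$ term of this sum.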

Definition at line 123 of file bayes_clf.hpp.

Constructor & Destructor Documentation

◆ BayesBernoulli()

gpmp::ml::BayesBernoulli::BayesBernoulli ( double  alpha_param = 1.0)
inline

Constructor for BayesBernoulli class.

Parameters
alpha_param	Additive (Laplace/Lidstone) smoothing parameter

Definition at line 134 of file bayes_clf.hpp.

134  : alpha(alpha_param) {
135  }

◆ ~BayesBernoulli()

gpmp::ml::BayesBernoulli::~BayesBernoulli ( )
inline

Destructor for BayesBernoulli class.

Definition at line 140 of file bayes_clf.hpp.

140  {
141  }

Member Function Documentation

◆ display()

void gpmp::ml::BayesBernoulli::display ( ) const

Display the learned probabilities.

Note
This method is intended for debugging purposes.

Definition at line 212 of file bayes_clf.cpp.

212  {
213      std::cout << "Class Probabilities:\n";
214      for (const auto &entry : class_probs) {
215          std::cout << entry.first << ": " << entry.second << "\n";
216      }
217 
218      std::cout << "\nFeature Probabilities:\n";
219      for (const auto &class_entry : feat_probs) {
220          std::cout << class_entry.first << ":\n";
221          for (const auto &feat_entry : class_entry.second) {
222              std::cout << " Feature " << feat_entry.first << ": "
223                        << feat_entry.second << "\n";
224          }
225      }
226  }

Referenced by main().
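
Because `class_probs` and `feat_probs` are `std::unordered_map` containers, their iteration order is unspecified, so the printed order of classes and features may differ between runs or standard library implementations. If stable debug output is wanted, one possible approach is to sort the keys before printing. The helper `sorted_labels` below is hypothetical and not part of openGPMP.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of openGPMP): collect the class labels of a
// probability map and return them in ascending lexicographic order.
std::vector<std::string>
sorted_labels(const std::unordered_map<std::string, double> &probs) {
    std::vector<std::string> labels;
    labels.reserve(probs.size());
    for (const auto &entry : probs) {
        labels.push_back(entry.first);
    }
    std::sort(labels.begin(), labels.end());
    return labels;
}
```

Iterating over the returned vector and looking each label up with `at()` prints the classes in the same order on every run.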

◆ predict()

std::string gpmp::ml::BayesBernoulli::predict ( const std::vector< size_t > &  newData) const

Predict the class of a new data point.

Parameters
newData	A vector of size_t representing the features of the new data point
Returns
The predicted class label as a string

Definition at line 191 of file bayes_clf.cpp.

191  {
192      double max_prob = -std::numeric_limits<double>::infinity();
193      std::string predicted_class;
194 
195      for (const auto &class_entry : class_probs) {
196          double probability = log(class_entry.second);
197 
198          for (size_t i = 0; i < new_data.size(); ++i) {
199              probability +=
200                  new_data[i] * log(feat_probs.at(class_entry.first).at(i));
201          }
202 
203          if (probability > max_prob) {
204              max_prob = probability;
205              predicted_class = class_entry.first;
206          }
207      }
208 
209      return predicted_class;
210  }

Referenced by main().
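
The listing above scores a class by summing `log` feature probabilities only for features that are present (value 1). For comparison, the sketch below scores both present and absent features, as the textbook Bernoulli model does. It is a standalone illustration under stated assumptions, not the library's implementation: the function `predict_bernoulli` and its parameters `priors` and `feat_p` are hypothetical, and every probability is assumed to lie strictly between 0 and 1.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical free function (not part of openGPMP).
// `priors` maps class label -> P(class).
// `feat_p` maps class label -> per-feature P(x_j = 1 | class).
// Assumes all probabilities are strictly inside (0, 1).
std::string predict_bernoulli(
    const std::unordered_map<std::string, double> &priors,
    const std::unordered_map<std::string, std::vector<double>> &feat_p,
    const std::vector<std::size_t> &x) {
    double best = -std::numeric_limits<double>::infinity();
    std::string best_label;

    for (const auto &entry : priors) {
        const std::vector<double> &p = feat_p.at(entry.first);
        double score = std::log(entry.second);
        for (std::size_t j = 0; j < x.size(); ++j) {
            // Present features contribute log p, absent ones log(1 - p).
            score += (x[j] != 0) ? std::log(p.at(j)) : std::log(1.0 - p.at(j));
        }
        if (score > best) {
            best = score;
            best_label = entry.first;
        }
    }
    return best_label;
}
```

Working in log space avoids underflow when many small probabilities are multiplied. Scoring absent features matters for inputs such as an all-zero vector, where a presence-only sum would fall back to the class prior alone.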

◆ train()

void gpmp::ml::BayesBernoulli::train ( const std::vector< std::vector< size_t >> &  data,
const std::vector< std::string > &  labels 
)

Train the classifier with a set of labeled data.

Parameters
data	A vector of vectors representing the training instances
labels	A vector of strings representing the corresponding class labels

Definition at line 155 of file bayes_clf.cpp.

157  {
158      size_t numInstances = data.size();
159      size_t num_feats = data[0].size();
160 
161      for (size_t i = 0; i < numInstances; ++i) {
162          std::string classLabel = labels[i];
163 
164          // update class probabilities
165          class_probs[classLabel] += 1.0;
166 
167          // update feature probabilities
168          for (size_t j = 0; j < num_feats; ++j) {
169              feat_probs[classLabel][j] += data[i][j];
170          }
171      }
172 
173      // laplace smoothing
174      double smoothing_factor = alpha * 2.0;
175      for (auto &entry : class_probs) {
176          entry.second =
177              (entry.second + alpha) / (numInstances + smoothing_factor);
178      }
179 
180      for (auto &class_entry : feat_probs) {
181          for (auto &feat_entry : class_entry.second) {
182              feat_entry.second =
183                  (feat_entry.second + alpha) /
184                  (class_probs[class_entry.first] + smoothing_factor);
185          }
186      }
187  }

Referenced by main().

Member Data Documentation

◆ alpha

double gpmp::ml::BayesBernoulli::alpha

Definition at line 128 of file bayes_clf.hpp.

◆ class_probs

std::unordered_map<std::string, double> gpmp::ml::BayesBernoulli::class_probs

Definition at line 125 of file bayes_clf.hpp.

◆ feat_probs

std::unordered_map<std::string, std::unordered_map<size_t, double> > gpmp::ml::BayesBernoulli::feat_probs

Definition at line 127 of file bayes_clf.hpp.


The documentation for this class was generated from the following files: