openGPMP
Open Source Mathematics Package
gpmp::ml::BayesClf Class Reference

Naive Bayes classifier: features are assumed to be conditionally independent given the class.

#include <bayes_clf.hpp>

Public Member Functions

 BayesClf (double alpha_param=1.0, bool fit_prior_param=true, const std::vector< double > &class_prior={})
 Constructor for BayesClf class.
 
 ~BayesClf ()
 Destructor for BayesClf class.
 
void train (const std::vector< std::vector< double >> &data, const std::vector< std::string > &labels)
 Train the classifier with a set of labeled data.
 
std::string predict (const std::vector< double > &newData) const
 Predict the class of a new data point.
 
void display () const
 Display the learned probabilities.
 

Public Attributes

double alpha
 Additive smoothing parameter.
 
bool fit_prior
 Whether to learn class prior probabilities or not.
 
std::unordered_map< std::string, double > class_probs
 Map of class labels to their probabilities.
 
std::unordered_map< std::string, std::vector< double > > feature_probs
 Map of class labels to their feature probabilities.
 
std::vector< double > class_log_prior
 Vector of class log priors.
 

Detailed Description

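Naive Bayes classifier: features are assumed to be conditionally independent given the class label. train() estimates smoothed class and per-feature probabilities from labeled data, and predict() returns the label with the highest log-probability score.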

Definition at line 53 of file bayes_clf.hpp.

Constructor & Destructor Documentation

◆ BayesClf()

gpmp::ml::BayesClf::BayesClf ( double  alpha_param = 1.0,
bool  fit_prior_param = true,
const std::vector< double > &  class_prior = {} 
)

Constructor for BayesClf class.

Parameters
    alpha_param      Additive (Laplace/Lidstone) smoothing parameter; stored in alpha
    fit_prior_param  Whether to learn class prior probabilities or not; stored in fit_prior
    class_prior      Prior probabilities of the classes; copied into class_log_prior

Definition at line 42 of file bayes_clf.cpp.

    // class_prior is copied element by element; no logarithm is applied here
    : alpha(alpha_param), fit_prior(fit_prior_param),
      class_log_prior(class_prior.begin(), class_prior.end()) {
}
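Because the initializer list above stores class_prior as given, a caller who passes plain probabilities ends up with probabilities, not logarithms, in class_log_prior. Also note that, in the listings on this page, predict() reads class_probs rather than class_log_prior. The following is a minimal sketch of converting priors to log priors before construction; to_log_prior is a hypothetical helper and not part of openGPMP.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of openGPMP): converts prior probabilities
// into log priors so that the values stored in class_log_prior match the
// member's name.
std::vector<double> to_log_prior(const std::vector<double> &priors) {
    std::vector<double> out;
    out.reserve(priors.size());
    for (double p : priors) {
        assert(p > 0.0 && p <= 1.0); // log is undefined for p <= 0
        out.push_back(std::log(p));
    }
    return out;
}
```

Under these assumptions, the result could be passed as the third constructor argument, for example BayesClf(1.0, false, to_log_prior({0.25, 0.75})).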

◆ ~BayesClf()

gpmp::ml::BayesClf::~BayesClf ( )

Destructor for BayesClf class.

Definition at line 49 of file bayes_clf.cpp.

{
}

Member Function Documentation

◆ display()

void gpmp::ml::BayesClf::display ( ) const

Display the learned probabilities.

Note
This method is intended for debugging: it prints class_probs, feature_probs, and class_log_prior to std::cout.

Definition at line 135 of file bayes_clf.cpp.

{
    std::cout << "Class Probabilities:\n";
    for (const auto &entry : class_probs) {
        std::cout << entry.first << ": " << entry.second << "\n";
    }

    std::cout << "\nFeature Probabilities:\n";
    for (const auto &class_entry : feature_probs) {
        std::cout << class_entry.first << ":\n";
        for (size_t j = 0; j < class_entry.second.size(); ++j) {
            std::cout << " Feature " << j << ": " << class_entry.second[j]
                      << "\n";
        }
    }

    std::cout << "\nClass Log Priors:\n";
    for (const auto &logPrior : class_log_prior) {
        std::cout << logPrior << "\n";
    }
}

◆ predict()

std::string gpmp::ml::BayesClf::predict ( const std::vector< double > &  newData) const

Predict the class of a new data point.

Parameters
    newData  A vector of doubles representing the features of the new data point
Returns
    The predicted class label as a string

Definition at line 114 of file bayes_clf.cpp.

{
    // the definition names its parameter new_data (documented above as newData)
    double max_prob = -std::numeric_limits<double>::infinity();
    std::string predicted_class;

    for (const auto &entry : class_probs) {
        const std::string &label = entry.first;
        // start from the log of the class probability
        double probability = log(entry.second);

        // add each feature value times the log of its per-class probability;
        // at() throws std::out_of_range if new_data has more features than
        // were seen in training
        for (size_t j = 0; j < new_data.size(); ++j) {
            probability += new_data[j] * log(feature_probs.at(label).at(j));
        }

        if (probability > max_prob) {
            max_prob = probability;
            predicted_class = label;
        }
    }

    return predicted_class;
}
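The listing above scores each class c as log P(c) plus the sum over features j of x_j * log(theta_cj), where P(c) comes from class_probs and theta_cj from feature_probs, and returns the label with the largest score. The sketch below restates that rule as a stand-alone function so it can be tried without the library; the function name predict_sketch and its signature are illustrative assumptions, not part of openGPMP.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative free function (not part of openGPMP) that applies the same
// log-score rule as the predict() listing to explicitly supplied tables.
std::string predict_sketch(
    const std::unordered_map<std::string, double> &class_probs,
    const std::unordered_map<std::string, std::vector<double>> &feature_probs,
    const std::vector<double> &x) {
    double best = -std::numeric_limits<double>::infinity();
    std::string best_label;
    for (const auto &entry : class_probs) {
        const std::vector<double> &theta = feature_probs.at(entry.first);
        assert(theta.size() == x.size()); // one probability per feature
        double score = std::log(entry.second);
        for (std::size_t j = 0; j < x.size(); ++j) {
            score += x[j] * std::log(theta[j]);
        }
        if (score > best) {
            best = score;
            best_label = entry.first;
        }
    }
    return best_label;
}
```

Working in log space turns the product of per-feature probabilities into a sum, which avoids floating-point underflow when there are many features.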

◆ train()

void gpmp::ml::BayesClf::train ( const std::vector< std::vector< double >> &  data,
const std::vector< std::string > &  labels 
)

Train the classifier with a set of labeled data.

Parameters
    data    A vector of vectors representing the training instances
    labels  A vector of strings representing the corresponding class labels

Definition at line 52 of file bayes_clf.cpp.

{
    // count class occurrences
    for (const auto &label : labels) {
        class_probs[label] += 1.0;
    }

    // count feature occurrences for each class
    for (size_t i = 0; i < data.size(); ++i) {
        const std::string &label = labels[i];
        const std::vector<double> &features = data[i];

        // NOTE: every label was already counted in the loop above, so this
        // second increment doubles each class count
        class_probs[label] += 1.0;

        // initialize feature_probs[label] if not present
        if (feature_probs.find(label) == feature_probs.end()) {
            feature_probs[label] = std::vector<double>(features.size(), 0.0);
        }

        for (size_t j = 0; j < features.size(); ++j) {
            feature_probs[label][j] += features[j];
        }
    }

    // calculate class probabilities and feature probabilities
    double smoothing_factor = alpha * 2.0;
    for (const auto &entry : class_probs) {
        const std::string &label = entry.first;
        double class_count = entry.second;

        // calculate class probability (renormalized below only if fit_prior)
        class_probs[label] =
            (class_count + alpha) / (data.size() + smoothing_factor);

        // calculate feature probabilities
        for (size_t j = 0; j < feature_probs[label].size(); ++j) {
            feature_probs[label][j] = (feature_probs[label][j] + alpha) /
                                      (class_count + smoothing_factor);
        }
    }

    // calculate class log priors
    if (fit_prior) {
        double total = std::accumulate(
            class_probs.begin(),
            class_probs.end(),
            0.0,
            [](double sum, const auto &entry) { return sum + entry.second; });

        for (auto &entry : class_probs) {
            entry.second /= total;
        }

        // NOTE: writes through class_log_prior.begin() without resizing, so
        // class_log_prior must already hold at least class_probs.size()
        // elements (it is empty when class_prior is left at its default)
        std::transform(
            class_probs.begin(),
            class_probs.end(),
            class_log_prior.begin(),
            [total](const auto &entry) { return log(entry.second); });
    }
}

Referenced by main().
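For comparison with the train() listing, here is a minimal stand-alone sketch of additive smoothing that counts each training instance once and stores log priors in a container sized to the number of classes. It is not the library's implementation: FitSketch and fit_sketch are hypothetical names, the class prior is taken as the plain relative frequency, and the denominator n_c + 2 * alpha assumes binary (0/1) feature values, mirroring the alpha * 2.0 factor in the listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative result type; the names are hypothetical, not openGPMP API.
struct FitSketch {
    std::unordered_map<std::string, double> class_probs;
    std::unordered_map<std::string, std::vector<double>> feature_probs;
    std::unordered_map<std::string, double> class_log_prior;
};

FitSketch fit_sketch(const std::vector<std::vector<double>> &data,
                     const std::vector<std::string> &labels,
                     double alpha = 1.0) {
    assert(!data.empty() && data.size() == labels.size());
    FitSketch out;
    std::unordered_map<std::string, double> counts;

    // Single pass: count each instance once and sum feature values per class.
    for (std::size_t i = 0; i < data.size(); ++i) {
        counts[labels[i]] += 1.0;
        std::vector<double> &sums = out.feature_probs[labels[i]];
        if (sums.empty()) {
            sums.assign(data[i].size(), 0.0);
        }
        assert(sums.size() == data[i].size()); // all rows share one width
        for (std::size_t j = 0; j < data[i].size(); ++j) {
            sums[j] += data[i][j];
        }
    }

    const double n = static_cast<double>(data.size());
    for (const auto &entry : counts) {
        const double n_c = entry.second;
        out.class_probs[entry.first] = n_c / n;
        out.class_log_prior[entry.first] = std::log(n_c / n);
        // Smoothed estimate, assuming each feature takes the values 0 or 1.
        for (double &v : out.feature_probs[entry.first]) {
            v = (v + alpha) / (n_c + 2.0 * alpha);
        }
    }
    return out;
}
```

Keying the log priors by label, rather than by position in a vector, keeps them aligned with class_probs regardless of the iteration order of the unordered map.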

Member Data Documentation

◆ alpha

double gpmp::ml::BayesClf::alpha

Additive smoothing parameter.

Definition at line 58 of file bayes_clf.hpp.

◆ class_log_prior

std::vector<double> gpmp::ml::BayesClf::class_log_prior

Vector of class log priors.

Definition at line 75 of file bayes_clf.hpp.

◆ class_probs

std::unordered_map<std::string, double> gpmp::ml::BayesClf::class_probs

Map of class labels to their probabilities.

Definition at line 67 of file bayes_clf.hpp.

◆ feature_probs

std::unordered_map<std::string, std::vector<double> > gpmp::ml::BayesClf::feature_probs

Map of class labels to their feature probabilities.

Definition at line 71 of file bayes_clf.hpp.

◆ fit_prior

bool gpmp::ml::BayesClf::fit_prior

Whether to learn class prior probabilities or not.

Definition at line 63 of file bayes_clf.hpp.


The documentation for this class was generated from the following files: