openGPMP
Open Source Mathematics Package
gpmp::ml::KNN Class Reference

Represents a K Nearest Neighbors (KNN) classifier.

#include <knn.hpp>

Public Member Functions

 KNN ()
 Constructor for the KNN class.
 
 ~KNN ()
 Destructor for the KNN class.
 
void train (const std::vector< std::vector< double >> &training_data, const std::vector< int > &labels)
 Trains the KNN model with the given training data and labels.
 
int predict (const std::vector< double > &input_vector, int k)
 Predicts the label of a given input vector using the KNN algorithm.
 

Private Member Functions

double calculateEuclideanDistance (const std::vector< double > &vec1, const std::vector< double > &vec2)
 Calculates the Euclidean distance between two vectors.
 

Private Attributes

std::vector< std::vector< double > > training_data
 
std::vector< int > labels
 

Detailed Description

Represents a K Nearest Neighbors (KNN) classifier.

Definition at line 52 of file knn.hpp.
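A typical call sequence is train() followed by predict(). The sketch below illustrates that sequence under stated assumptions: KnnSketch is a hypothetical stand-in that mirrors the documented interface so the example compiles without openGPMP installed, and classify_example() and the sample data are invented for illustration. With the library available, gpmp::ml::KNN from <knn.hpp> would presumably be used in the same way.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in mirroring the documented train()/predict() interface.
// It is not the openGPMP class; it assumes every training row has the same length.
class KnnSketch {
  public:
    void train(const std::vector<std::vector<double>> &data,
               const std::vector<int> &labels) {
        if (data.size() != labels.size() || data.empty()) {
            throw std::invalid_argument("Invalid training data or labels.");
        }
        data_ = data;
        labels_ = labels;
    }

    int predict(const std::vector<double> &input, int k) const {
        if (data_.empty()) {
            throw std::logic_error("Model not trained.");
        }
        if (input.size() != data_[0].size()) {
            throw std::invalid_argument("Invalid input vector size.");
        }
        if (k <= 0 || static_cast<std::size_t>(k) > data_.size()) {
            throw std::invalid_argument("Invalid value of k.");
        }
        // Pair each training row's distance to the input with its label.
        std::vector<std::pair<double, int>> dist;
        for (std::size_t i = 0; i < data_.size(); ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < input.size(); ++j) {
                const double d = input[j] - data_[i][j];
                sum += d * d;
            }
            dist.emplace_back(std::sqrt(sum), labels_[i]);
        }
        std::sort(dist.begin(), dist.end());
        // Majority vote among the k nearest rows; the first label to reach
        // the highest count wins.
        std::unordered_map<int, int> votes;
        int best = -1;
        int best_votes = -1;
        for (int i = 0; i < k; ++i) {
            const int count = ++votes[dist[i].second];
            if (count > best_votes) {
                best_votes = count;
                best = dist[i].second;
            }
        }
        return best;
    }

  private:
    std::vector<std::vector<double>> data_;
    std::vector<int> labels_;
};

// Trains on two small clusters and classifies a point near the first one.
int classify_example() {
    KnnSketch model;
    model.train({{1.0, 1.0}, {1.2, 0.8}, {5.0, 5.0}, {5.5, 4.5}},
                {0, 0, 1, 1});
    return model.predict({1.1, 0.9}, 3);
}
```

Here the two nearest rows carry label 0 and the third carries label 1, so the vote with k = 3 selects label 0. Calling predict() before train() throws std::logic_error, matching the documented behavior.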

Constructor & Destructor Documentation

◆ KNN()

gpmp::ml::KNN::KNN ( )

Constructor for the KNN class.

Definition at line 39 of file knn.cpp.

{
}

◆ ~KNN()

gpmp::ml::KNN::~KNN ( )

Destructor for the KNN class.

Definition at line 42 of file knn.cpp.

{
}

Member Function Documentation

◆ calculateEuclideanDistance()

double gpmp::ml::KNN::calculateEuclideanDistance ( const std::vector< double > &  vec1,
const std::vector< double > &  vec2 
)
private

Calculates the Euclidean distance between two vectors.

Parameters
vec1    The first vector
vec2    The second vector
Returns
The Euclidean distance between the two vectors

Definition at line 107 of file knn.cpp.

{
    double distance = 0.0;
    for (size_t i = 0; i < vec1.size(); ++i) {
        double diff = vec1[i] - vec2[i];
        distance += diff * diff;
    }
    return std::sqrt(distance);
}
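The listing above indexes vec2 for every element of vec1 without comparing the two sizes, and predict() only compares the input against training_data[0]. A defensive variant might look like the following sketch; checkedEuclideanDistance is a hypothetical helper and not part of openGPMP.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: same computation as the listing, but it rejects
// vectors of different lengths instead of reading out of bounds.
double checkedEuclideanDistance(const std::vector<double> &a,
                                const std::vector<double> &b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("Vectors must have the same length.");
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff; // accumulate squared differences
    }
    return std::sqrt(sum);
}
```

For example, the distance between (0, 0) and (3, 4) is 5.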

◆ predict()

int gpmp::ml::KNN::predict ( const std::vector< double > &  input_vector,
int  k 
)

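Predicts the label of a given input vector using the KNN algorithm. Throws std::logic_error if the model has not been trained, and std::invalid_argument if the input vector size does not match the training data or if k is not in the range 1 to the number of training samples.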

Parameters
input_vector    The input vector for which prediction is to be made
k               The number of nearest neighbors to consider
Returns
The predicted label

Definition at line 56 of file knn.cpp.

{
    if (training_data.empty() || labels.empty()) {
        throw std::logic_error(
            "Model not trained. Call train() before predict.");
    }

    if (input_vector.size() != training_data[0].size()) {
        throw std::invalid_argument("Invalid input vector size.");
    }

    // if (k <= 0 || k > training_data.size()) {
    if (k <= 0 || static_cast<size_t>(k) > training_data.size()) {
        throw std::invalid_argument("Invalid value of k.");
    }

    // Calculate distances and store distance-label pairs
    std::vector<std::pair<double, int>> distances;
    for (size_t i = 0; i < training_data.size(); ++i) {
        double distance =
            calculateEuclideanDistance(input_vector, training_data[i]);
        distances.emplace_back(distance, labels[i]);
    }

    // Sort distances in ascending order
    std::sort(
        distances.begin(),
        distances.end(),
        [](const std::pair<double, int> &a, const std::pair<double, int> &b) {
            return a.first < b.first;
        });

    // Count votes for each label among the k nearest neighbors
    std::unordered_map<int, int> label_counts;
    for (int i = 0; i < k; ++i) {
        label_counts[distances[i].second]++;
    }

    // Find the label with the maximum votes
    int max_votes = -1;
    int predicted_label = -1;
    for (const auto &entry : label_counts) {
        if (entry.second > max_votes) {
            max_votes = entry.second;
            predicted_label = entry.first;
        }
    }

    return predicted_label;
}
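The final loop in the listing iterates over a std::unordered_map, whose iteration order is unspecified, so when two labels receive the same number of votes the winner is not defined by the code. One way to make ties deterministic is sketched below; majorityLabel is a hypothetical helper, not part of openGPMP, and its tie rule (the tied label whose neighbor is closest wins) is an assumption chosen for illustration.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical helper. nearest_labels holds the labels of the k nearest
// neighbors, ordered from closest to farthest. Returns -1 for an empty list.
int majorityLabel(const std::vector<int> &nearest_labels) {
    std::unordered_map<int, int> counts;
    for (int label : nearest_labels) {
        ++counts[label];
    }
    int best_label = -1;
    int best_votes = 0;
    // Walk in distance order rather than map order: with a strict '>'
    // comparison, the first (closest) label among tied labels is kept.
    for (int label : nearest_labels) {
        if (counts[label] > best_votes) {
            best_votes = counts[label];
            best_label = label;
        }
    }
    return best_label;
}
```

With neighbor labels 2, 1, 1, 2 (closest first) both labels have two votes, and this sketch returns 2 because the closest neighbor carries that label.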

◆ train()

void gpmp::ml::KNN::train ( const std::vector< std::vector< double >> &  training_data,
const std::vector< int > &  labels 
)

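Trains the KNN model with the given training data and labels. Throws std::invalid_argument if the training data is empty or its size differs from the number of labels.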

Parameters
training_data    The training data, a vector of feature vectors
labels           The corresponding labels for the training data

Definition at line 45 of file knn.cpp.

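{
    // _training_data and _labels are the parameter names used in knn.cpp;
    // they correspond to training_data and labels in the signature above.
    if (_training_data.size() != _labels.size() || _training_data.empty()) {
        throw std::invalid_argument("Invalid training data or labels.");
    }

    this->training_data = _training_data;
    this->labels = _labels;
}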
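The check above compares only the number of rows with the number of labels; it does not verify that every feature vector has the same length, which the distance computation relies on. A caller could validate the data before calling train() with something like the following sketch; isRectangular is a hypothetical helper and not part of openGPMP.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pre-check: true only if there is at least one row, the rows
// are non-empty, and all rows have the same number of features.
bool isRectangular(const std::vector<std::vector<double>> &rows) {
    if (rows.empty()) {
        return false;
    }
    const std::size_t width = rows.front().size();
    if (width == 0) {
        return false;
    }
    for (const auto &row : rows) {
        if (row.size() != width) {
            return false;
        }
    }
    return true;
}
```

Treating an empty data set or zero-width rows as invalid is a design choice of this sketch; train() itself rejects only the empty data set.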

Member Data Documentation

◆ labels

std::vector<int> gpmp::ml::KNN::labels
private

Definition at line 84 of file knn.hpp.

◆ training_data

std::vector<std::vector<double> > gpmp::ml::KNN::training_data
private

The training data, stored as feature vectors. The corresponding labels for the training data are stored in labels.

Definition at line 82 of file knn.hpp.


The documentation for this class was generated from the following files: