Naive-Bayes Spam Classificator

Module is created to build a spam filter using Python and the multinomial Naive Bayes algorithm. Main goal is to code a spam filter from scratch that classifies messages with an accuracy greater than 90%.

Main files

This project contains 2 modules:

  • bayesian_classifier.py — BayesianClassifier class which is used to classify
    whether the message is spam or ham. It also calculates model score for accuracy.
  • main.py — main file for running the programm.

How does it work?

After running a program, Spam Classificator provides the user with 2 options:

  • to test a model on data base and get model score;
  • to test a model on statement the user inputs;

If there are many words in the statement that are not present in the database, then the
program will most likely output ‘needs human classification, probably spam’.

GitHub

View Github