Top 27+ Free Software for Text Analysis, Text Mining, Text Analytics: Review of Top 27 Free Software for Text Analysis, Text Mining, Text Analytics including General Architecture for Text Engineering (GATE), RapidMiner Text Mining Extension, KH Coder, VisualText, Datumbox, TAMS, QDA Miner Lite, Carrot2, CAT, GATE, tm, Gensim, Natural Language Toolkit, Unstructured Information Management Architecture, OpenNLP, KNIME, Orange-Textable, LPU, Apache Mahout, Pattern, LingPipe, S-EM, LibShortText, Twinword, Apache Stanbol, Aika, Distributed Machine Learning Toolkit and Coh-Metrix.



Word embeddings are a modern approach for representing text in natural language processing. Embedding algorithms like word2vec and GloVe are key to the state-of-the-art results achieved by neural network models on natural language processing problems like machine translation. In this tutorial, you will discover how to train and load word embedding models for natural language …


The Naive Bayes algorithm is simple and effective and should be one of the first methods you try on a classification problem. In this tutorial you are going to learn about the Naive Bayes algorithm including how it works and how to implement it from scratch in Python. Update: Check out the follow-up on tips for …
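To make the idea concrete, here is a minimal from-scratch sketch in Python. It is not the implementation from the tutorial referenced above: it assumes a multinomial (word-count) variant with add-one (Laplace) smoothing, and the function names and the tiny spam/ham dataset are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the class with the highest log prior plus log likelihood."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids log(0) for words unseen in this class.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, for illustration only.
docs = [["free", "prize", "now"], ["meeting", "at", "noon"],
        ["free", "offer"], ["lunch", "meeting"]]
labels = ["spam", "ham", "spam", "ham"]

model = train_naive_bayes(docs, labels)
print(predict(model, ["free", "prize"]))  # spam
print(predict(model, ["meeting"]))        # ham
```

Working in log space keeps the product of many small probabilities from underflowing, which is why the scores are summed rather than multiplied.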


Probability is the measure of how likely an event is to occur out of the number of possible outcomes. This wikiHow will show you how to calculate different types of probabilities. Define your events and outcomes. Probability is the…
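As a small worked example of the definition above, the sketch below computes the probability of a single event as favorable outcomes divided by total outcomes. It assumes all outcomes are equally likely; the helper name `probability` and the die example are illustrative and not taken from the original article.

```python
from fractions import Fraction

def probability(favorable, total):
    """Probability of an event, assuming equally likely outcomes."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need 0 <= favorable <= total and total > 0")
    return Fraction(favorable, total)

# Event: rolling an even number on a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
even = [o for o in outcomes if o % 2 == 0]

p_even = probability(len(even), len(outcomes))
print(p_even)         # 1/2
print(float(p_even))  # 0.5
```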

Training Set is a subset of the dataset used to build predictive models.
Validation Set is a subset of the dataset used to assess the performance of the model built in the training phase.
– It provides a test platform for fine-tuning the model’s parameters and selecting the best-performing model
– Not all modeling algorithms need a validation set
Test Set (unseen examples) is a subset of the dataset used to assess the likely future performance of a model.
– If a model fits the training set much better than it fits the test set, overfitting is probably the cause
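The three subsets described above could be produced with a shuffle-and-slice split, sketched below. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration; the notes do not prescribe any particular ratio.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and slice them into train, validation, and test subsets."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(rows)                  # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out
    return train, val, test

rows = list(range(100))  # stand-in for 100 dataset records
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 60 20 20
```

Shuffling before slicing matters when the records are stored in some order (by date or by class, for instance); otherwise the subsets may not be representative of each other.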


Binary classification has two possible class labels, for example: true|false, 1|0, -1|+1, male|female

Multi-class classification problems can be decomposed into several binary classification problems, for example by training one binary classifier per class (one-vs-rest).
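One common way to view a multi-class problem as binary problems is the one-vs-rest scheme, sketched here as an assumption since the notes do not name a specific scheme. Each class gets its own 1/0 label vector, and at prediction time the class whose binary classifier gives the highest score wins. The function names and the example scores are hypothetical.

```python
def one_vs_rest_labels(labels):
    """Build one binary (1/0) label vector per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def pick_class(scores):
    """Choose the class whose binary classifier reports the highest score."""
    return max(scores, key=scores.get)

labels = ["cat", "dog", "bird", "cat"]
binary_tasks = one_vs_rest_labels(labels)
print(binary_tasks["cat"])   # [1, 0, 0, 1]
print(binary_tasks["bird"])  # [0, 0, 1, 0]

# Hypothetical scores from three separately trained binary classifiers.
scores = {"bird": 0.1, "cat": 0.7, "dog": 0.4}
print(pick_class(scores))    # cat
```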

