What is statistics?

Statistics is the art of dealing with data that:
a. come in large amounts
b. are subject to error
Statistics is divided into two parts:
1. Descriptive statistics:
• Calculating characteristic values: mean, median, standard deviation
• Graphical representation
2. Predictive statistics: tries to infer facts about large populations or about manufacturing
processes from facts observed or measured in a small (produced) sample.
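
The two parts can be illustrated with a minimal Python sketch using only the standard library. The sample values below are hypothetical measurements invented for illustration, and the standard error is shown as just one simple way to move from a small sample toward a statement about the whole process.

```python
import math
import statistics

# Hypothetical sample: measured lengths (mm) of ten manufactured parts.
sample = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2, 9.7, 10.0, 10.1, 9.9]

# Descriptive statistics: characteristic values of the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Predictive step: the standard error estimates how far the sample mean
# may lie from the mean of the whole production process.
std_error = stdev / math.sqrt(len(sample))

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")
print(f"standard error of the mean={std_error:.3f}")
```

A small standard error relative to the mean suggests that the sample mean is a reasonably stable estimate, provided the sample was drawn at random from the process.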

A review of the top 27 free software tools for text analysis, text mining, and text analytics, including General Architecture for Text Engineering (GATE), RapidMiner Text Mining Extension, KH Coder, VisualText, Datumbox, TAMS, QDA Miner Lite, Carrot2, CAT, tm, Gensim, Natural Language Toolkit, Unstructured Information Management Architecture, OpenNLP, KNIME, Orange-Textable, LPU, Apache Mahout, Pattern, LingPipe, S-EM, LibShortText, Twinword, Apache Stanbol, Aika, Distributed Machine Learning Toolkit, and Coh-Metrix.



Word embeddings are a modern approach for representing text in natural language processing. Embedding algorithms like word2vec and GloVe are key to the state-of-the-art results achieved by neural network models on natural language processing problems like machine translation.
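
As a rough illustration of what "representing text" means here, the sketch below compares words by the cosine similarity of their vectors. The three-dimensional vectors are made up for this example and are not output from word2vec or GloVe; real embeddings typically have tens to hundreds of dimensions and are learned from a corpus.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings (assumed values, not from a trained model).
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

sim_royal = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_fruit = cosine_similarity(embeddings["king"], embeddings["apple"])

print(f"king vs queen: {sim_royal:.3f}")
print(f"king vs apple: {sim_fruit:.3f}")
```

With these toy values, "king" scores closer to "queen" than to "apple", which mirrors the idea that words used in similar contexts end up with similar vectors.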


