Solving a K-means clustering problem

K = {1, 1.1, 5, 5.1, 1.5, 5.2, 7.9, 1.2, 8.1, 9}
Total items = 10

Iteration 1:
m1 = 5, m2 = 9
K1 = {1, 1.1, 5, 5.1, 1.5, 5.2, 1.2}, K2 = {7.9, 8.1, 9}
m1 = 20.1/7 = 2.87 ≈ 3, m2 = 25/3 = 8.33 ≈ 8

K-means clustering algorithm (manual calculation by observation):

Step 1: Take (or update) the mean value of each cluster.

Step 2: Assign each number to the cluster with the nearest mean.

Step 3: Repeat steps 1 and 2 until the means stay the same.

K = {2, 3, 4, 10, 11, 12, 20, 25, 30}

k = 2

Initial means: m1 = 4, m2 = 12

Iteration 1:

k1 = {2, 3, 4}

m1 = 9/3 = 3

k2 = {10, 11, 12, 20, 25, 30}

m2 = 108/6 = 18

Iteration 2:

k1 = {2, 3, 4, 10}

m1 = 19/4 = 4.75 ≈ 5

k2 = {11, 12, 20, 25, 30}

m2 = 98/5 = 19.6 ≈ 20

Iteration 3:

k1 = {2, 3, 4, 10, 11, 12}, k2 = {20, 25, 30}

m1 = 42/6 = 7, m2 = 75/3 = 25

Iteration 4:

k1 = {2, 3, 4, 10, 11, 12}, k2 = {20, 25, 30}

m1 = 7, m2 = 25

We got the same means twice in a row, so we stop.
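The manual steps above can be sketched as a small Python function (a minimal 1-D version; the function and variable names are my own):

```python
def kmeans_1d(data, means, max_iter=100):
    """Simple 1-D k-means: assign each point to the nearest mean,
    recompute the means, and repeat until the means stop changing."""
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in means]
        for x in data:
            # index of the nearest current mean
            i = min(range(len(means)), key=lambda j: abs(x - means[j]))
            clusters[i].append(x)
        new_means = [sum(c) / len(c) for c in clusters]
        if new_means == means:  # converged: same means twice in a row
            break
        means = new_means
    return clusters, means

clusters, means = kmeans_1d([2, 3, 4, 10, 11, 12, 20, 25, 30], [4, 12])
# clusters -> [[2, 3, 4, 10, 11, 12], [20, 25, 30]], means -> [7.0, 25.0]
```

Unlike the manual calculation, the code keeps the exact means (4.75, 19.6, ...) instead of rounding them, but it converges to the same final clusters.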

Data Engineer Track

https://towardsdatascience.com/who-is-a-data-engineer-how-to-become-a-data-engineer-1167ddc12811

https://medium.com/datadriveninvestor/python-vs-r-choosing-the-best-tool-for-ai-ml-data-science-7e0c2295e243

 

ArrayList in Java


Generics:

 

LinkedList is faster for manipulation (insertions and deletions) but slower for retrieval; ArrayList is the opposite: slower for manipulation but faster for indexed retrieval.
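An analogous trade-off exists in Python between the array-backed `list` and `collections.deque` (a linked structure), which can stand in for ArrayList and LinkedList in a rough demonstration:

```python
from collections import deque

arr = list(range(5))   # array-backed, like Java's ArrayList
lnk = deque(range(5))  # linked blocks, like Java's LinkedList

# Indexed retrieval: O(1) for the array, O(n) for the linked structure
assert arr[3] == 3
assert lnk[3] == 3     # works, but walks the links internally

# Front insertion: O(n) for the array (shifts every element), O(1) for deque
arr.insert(0, -1)
lnk.appendleft(-1)
assert list(arr) == list(lnk) == [-1, 0, 1, 2, 3, 4]
```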

 

Data Mining: Unit 9

Ensemble Models:

Basics
Boosting
Random Forests

Support Vector Machines
Basics
Linear Classification
Nonlinear Classification
Properties of SVMs

Discriminant Analysis
Basics

 

Exercise:

We are going to create several data mining models for classification and compare their performance. The goal of our models remains the same.

 

Data Mining: Exercise 8

Design of network topology

Determine:

Number of input nodes
Too few nodes => misclassification
Too many nodes => overfitting

 

Problems with the dollar sign ($) in R:

https://stackoverflow.com/questions/42560090/what-is-the-meaning-of-the-dollar-sign-in-r-function

Problem with the tilde sign (~) in R:

https://stackoverflow.com/questions/14976331/use-of-tilde-in-r-programming-language?noredirect=1&lq=1

 

 

Java Serious OOP Practice

Problem statement: Write a program to simulate a car dealership's sales process. We will have employees selling vehicles to customers.

You have to think about nouns (a person, place, or thing) while doing OOP; we have four nouns in the problem statement:

Dealership.java

Customer.java

Vehicle.java

Employee.java

Output:
If purchaseCar() returns false:

If purchaseCar() returns true:
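The four .java files aren't included in these notes, so here is a minimal sketch of the same four-class design in Python (all attribute names and the purchase logic are my own assumptions; `purchase_car` stands in for purchaseCar()):

```python
class Vehicle:
    def __init__(self, model, price):
        self.model = model
        self.price = price

class Customer:
    def __init__(self, name, budget):
        self.name = name
        self.budget = budget

class Employee:
    def __init__(self, name):
        self.name = name
        self.cars_sold = 0

class Dealership:
    def __init__(self):
        self.inventory = []

    def purchase_car(self, customer, employee, vehicle):
        """Sale succeeds only if the car is in stock and affordable."""
        if vehicle not in self.inventory or customer.budget < vehicle.price:
            return False
        self.inventory.remove(vehicle)
        customer.budget -= vehicle.price
        employee.cars_sold += 1
        return True

dealer = Dealership()
car = Vehicle("Sedan", 20000)
dealer.inventory.append(car)
alice = Customer("Alice", 25000)
bob = Employee("Bob")
assert dealer.purchase_car(alice, bob, car) is True
assert dealer.purchase_car(alice, bob, car) is False  # already sold
```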

 

 

 

 

Interface, Abstract Class and Polymorphism Example

Animal.java class

Fish.java

Zoo.java

Sparrow.java

Chicken.java

Bird.java

Flyable.java
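The listed .java files aren't included here, so here is a minimal Python sketch of the same hierarchy (class bodies and return values are my own assumptions); `Flyable` plays the role of a Java interface, `Animal` of the abstract class:

```python
from abc import ABC, abstractmethod

class Animal(ABC):            # abstract class: cannot be instantiated
    @abstractmethod
    def make_sound(self): ...

class Flyable(ABC):           # interface-like: only declares behaviour
    @abstractmethod
    def fly(self): ...

class Fish(Animal):
    def make_sound(self):
        return "blub"

class Bird(Animal):
    def make_sound(self):
        return "tweet"

class Sparrow(Bird, Flyable):  # a bird that also implements Flyable
    def fly(self):
        return "sparrow flying"

class Chicken(Bird):           # a bird that does not fly
    pass

# Polymorphism: one loop, different concrete behaviours per subclass
zoo = [Fish(), Sparrow(), Chicken()]
sounds = [animal.make_sound() for animal in zoo]
```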

 

 

 

 

My Java OOP example from Lessons

Animal.java:

Zoo.java

Fish.java

Bird.java

 

My Beautiful Crafted Java OOP

Human.java

Earth.java

 

mach = imperative form of machen (to make, to do)
anmachen = to put on, turn on
aufmachen = to open, undo, open up
ausmachen = to put out, turn off, make out (discern)
mitmachen = to join in, take part in
zumachen = to close, shut down
klarmachen = to explain, make clear

 

Anfang = beginning
am Anfang = at the beginning
Anfang Mai = early in May

 

vernahm = heard, perceived
Geräusch = noise
verdächtige = suspicious; der Verdächtige = the suspect
verlassen = to leave, exit, quit
gestohlener = stolen
entdeckte = discovered
kippte = tipped (over)
entschlossen = determined
Kübel = bucket
Dieb = thief
fassen = to grasp; (einen Dieb) fassen = to catch (a thief)
aufregend = exciting
ausnahmsweise = exceptionally, as an exception

 

hinlegen = to lay down; hingelegt = laid down, settled down
eingebildet = conceited; imagined
herausstellte = (sich herausstellen) turned out; (herausstellen) to emphasize
ungewöhnliche = unusual, uncommon

Na warte mal = wait a minute, just you wait
sprang = jumped
gefreut = (sich gefreut) was happy, pleased
gestört = disturbed
geil = awesome, cool (slang)

 

tatsächlich = indeed, in fact
entdeckte = discovered
beschloss = decided, resolved
einbrach = broke in (from einbrechen)
gerade = just

 

Empfindung = sensation, perception, feeling, emotion

bezeichnen = to denote, designate, name, signify
wörtlicher = literal, verbatim

Darstellung = presentation, exposition
inneren = inner, interior
Monolog = monologue (innerer Monolog = interior monologue)
Trauer = grief, sadness
pochen = to knock, pound; pochte = pounded
wieder = again
einatmen = to breathe in

 

Sachtexte verstehen = understanding factual texts
etwas definieren = to define something
Meinungen äußern = to express opinions

German Word Learning

Sternlein = little star
Schäflein = little sheep
schenk = give (imperative of schenken)
Schelle = little bell
Spielgeselle = playmate
Lämmerlein = little lamb
Umfrage = survey, poll
verträumt = dreamy
Vorgänge = processes, events
häufigsten = most frequent
verstorben = deceased
Alltagsstress = everyday stress
Abstürzen = crashing, falling
Gedanken = thoughts
grummelt zwar der Volksmund = "so popular wisdom grumbles"
dennoch = nevertheless
Traumforscher = dream researcher
errechnet = calculated
befragt = asked, surveyed
ebenfalls = likewise
wurdest = (you) were, became
zwar = admittedly, indeed
anschauen = to look at
Schaum = foam
entfernt = away, removed
Wie heißt das so schön? = "How does the saying go?"
niemand = nobody
daraus = from that, out of it
jeweils = in each case, respectively
dass = that (conjunction)
das = the, that (article/pronoun)
etwa = about, roughly
bemerkt = noticed
fähig = capable
echt = real, genuine

Unit 5- Multiple Linear Regression

It occurs when there is more than one possible predictor variable.

Including more than one independent variable in the regression model extends the simple linear regression model to a multiple linear regression model.

Advantages:
Models the relationship between the response variable and several predictors simultaneously.

Disadvantages:
Model building and interpretation difficulties due to complexity.

Multiple linear regression with two predictors:

Y = beta0 + beta1*X1 + beta2*X2 + epsilon

where Y is the dependent variable,
X1, X2, ..., Xk are the predictors (independent variables),
epsilon is the random error, and
beta0, beta1, beta2 are the unknown regression coefficients.

Example=> oil consumption:

Y=oil consumption(per month)
X1=outdoor temperature

X2=size of house(in meter square)

Model:

Y = beta0 + beta1*X1 + beta2*X2 + epsilon

Now beta1 is the expected change in Y (oil consumption) for a one-unit increase in X1 (outdoor temperature), when all other predictors are kept constant, i.e. in this case the size of the house is not changed.

beta1 is estimated as beta1-hat = -27.2 (per degree C).
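A model like this can be fit by ordinary least squares; a minimal NumPy sketch on made-up data (the numbers below are illustrative, not the course's):

```python
import numpy as np

# Synthetic example data (made-up numbers):
# X1 = outdoor temperature, X2 = house size, Y = oil consumption
X1 = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])
X2 = np.array([120, 150, 100, 180, 140, 160], dtype=float)
Y = np.array([300, 250, 180, 260, 170, 160], dtype=float)

# Design matrix with an intercept column: Y = b0 + b1*X1 + b2*X2 + error
X = np.column_stack([np.ones_like(X1), X1, X2])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
b0, b1, b2 = beta
# b1 < 0 here: consumption drops as outdoor temperature rises,
# holding house size constant; b2 > 0: bigger houses use more oil
```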

 

Assumptions:

The random error term epsilon is normally distributed and has mean zero, i.e. E(epsilon) = 0.

Epsilon has (unknown) variance sigma^2, i.e. all random errors have the same variance.

Adjusted R^2:
R^2_adj = 1 - [SSE/(n-k-1)] / [SST/(n-1)]
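As a quick sanity check, the adjusted R^2 formula in code (the SSE/SST values are hypothetical):

```python
def adjusted_r2(sse, sst, n, k):
    """Adjusted R^2 = 1 - [SSE/(n-k-1)] / [SST/(n-1)],
    where n = number of observations, k = number of predictors."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

# Hypothetical values: SSE = 20, SST = 100, n = 30 observations, k = 2 predictors
r2 = 1 - 20 / 100                     # plain R^2 = 0.8
r2_adj = adjusted_r2(20, 100, 30, 2)  # slightly smaller than R^2
```

Unlike plain R^2, the adjusted version penalizes adding predictors, so it only rises when a new predictor improves the fit more than chance would.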

 

 

As for simple linear regression:

plots of residuals against fitted values y-hat
plots of residuals against each predictor xi
normal probability plot of residuals
plots of residuals in observation order
Cook's distance
studentized residuals
standardized residuals
DFFITS

Collinearity:
Can only occur in multiple regression.
Predictors explaining the same variation of the response variable.

Oil consumption continued:
One predictor measuring house size in cm^2 and another predictor in m^2.

Variance inflation factor:
VIF_i = 1/(1 - R_i^2)

where R_i^2 is the R^2 from regressing predictor X_i on the remaining predictors.

Condition index for collinearity:
between 10 and 30 => weak collinearity
between 30 and 100 => moderate collinearity
above 100 => strong collinearity
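A sketch of computing VIF directly from its definition (the helper name and the synthetic data are my own):

```python
import numpy as np

def vif(X, i):
    """Variance inflation factor for column i of predictor matrix X:
    VIF_i = 1/(1 - R_i^2), where R_i^2 comes from regressing X[:, i]
    on the remaining predictors (plus an intercept)."""
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 / (1 - r2)

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)  # nearly a copy of x1 -> collinear
x3 = rng.normal(size=50)              # independent predictor
X = np.column_stack([x1, x2, x3])
# vif(X, 0) is huge because x1 ~ x2; vif(X, 2) stays close to 1
```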

Example of oil consumption continued:
Assume that we would like to use outdoor temperature X1 and house size X2 as predictors. Additionally, we want to use a third (dummy) predictor:

X3 = 1 if extra-thick walls, 0 otherwise

Model:
Y = beta0 + beta1*X1 + beta2*X2 + beta3*X3 + epsilon

Model Selection Strategies:
Models ranked using R^2, adjusted R^2 or Mallows' Cp
Stepwise selection methods:
backward, forward, stepwise selection

R^2 selection:
In a data set with 7 possible predictors, there would be 2^7 - 1 = 127 possible regression models.
For every model size (k = 1, 2, ..., p), look at, let's say, the m best models chosen by R^2.

Mallows' Cp:
Large Cp => biased model

Cp = SSE_p/MSE_full - (n - 2p)

where SSE_p = error sum of squares for a model with p parameters,
MSE_full = mean squared error for the full model,
n = number of observations.
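Plugging hypothetical numbers into the Cp formula (a model with little bias has Cp close to p):

```python
def mallows_cp(sse_p, mse_full, n, p):
    """Mallows' Cp = SSE_p / MSE_full - (n - 2p)."""
    return sse_p / mse_full - (n - 2 * p)

# Hypothetical values: n = 30 observations, candidate model with p = 3
# parameters, SSE_p = 54, full-model MSE = 2.0
cp = mallows_cp(54, 2.0, 30, 3)
# 54/2.0 - (30 - 6) = 27 - 24 = 3, so Cp ~ p: little bias
```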

Design Patterns and Object-Oriented Data Analysis

Writing here to clarify software engineering concepts for future reference.

Example for Class, Constructors and Methods:

Main.java

User.java

Interface Example

 

 

 

Programming problem solving with Python:

Problem: Only Positive Numbers

positiveFunc([-5,3,-1,101])
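The expected behaviour isn't stated beyond the name, so assuming positiveFunc should return only the positive numbers from its input list, one solution:

```python
def positiveFunc(numbers):
    """Return a list containing only the positive numbers."""
    return [n for n in numbers if n > 0]

print(positiveFunc([-5, 3, -1, 101]))  # [3, 101]
```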