Large Scale Machine Learning
- e.g. Census data, Website traffic data
- Can we train on 1000 examples instead of 100 000 000? Plot learning curves on the small subset to decide (see sketch below)
- If high variance, adding more examples helps
- If high bias, add extra features instead (more data alone won't help)
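A minimal sketch (not from the notes) of this learning-curve check, assuming plain linear regression on synthetic data:

```python
# Train on increasing subset sizes and compare train vs. validation error
# to judge whether collecting more data is likely to help.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n, d = 5000, 10
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(scale=0.5, size=n)   # synthetic data

X_train, y_train = X[:4000], y[:4000]
X_val, y_val = X[4000:], y[4000:]

sizes = [50, 100, 200, 500, 1000, 2000, 4000]
train_err, val_err = [], []
for m in sizes:
    Xm, ym = X_train[:m], y_train[:m]
    theta, *_ = np.linalg.lstsq(Xm, ym, rcond=None)              # fit linear regression
    train_err.append(np.mean((Xm @ theta - ym) ** 2) / 2)        # training error
    val_err.append(np.mean((X_val @ theta - y_val) ** 2) / 2)    # validation error

plt.plot(sizes, train_err, label="train")
plt.plot(sizes, val_err, label="validation")
plt.xlabel("training set size m"); plt.ylabel("error"); plt.legend()
plt.show()
# Large gap that keeps shrinking -> high variance: more data should help.
# Both curves flat, close, and high -> high bias: add features instead.
```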
Gradient Descent with Large Datasets
- G.D. = batch gradient descent
- Stochastic Gradient Descent
- cost(theta, (x^(i), y^(i))) = 1/2 * (h_theta(x^(i)) - y^(i))^2 = cost of theta w.r.t. a single example; measures how well the hypothesis works on that example (see sketch below)
- May need to loop over the entire dataset 1-10 times
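A minimal SGD sketch for linear regression (names and constants are illustrative, assuming NumPy arrays X, y):

```python
# Stochastic gradient descent: shuffle, then update theta one example at a time.
import numpy as np

def sgd(X, y, alpha=0.01, epochs=10, seed=0):
    rng = np.random.default_rng(seed)
    m, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):                 # 1-10 passes over the data are usually enough
        order = rng.permutation(m)          # randomly shuffle the dataset first
        for i in order:
            err = X[i] @ theta - y[i]       # h_theta(x^(i)) - y^(i)
            theta -= alpha * err * X[i]     # update using this single example only
    return theta

# usage on synthetic data
rng = np.random.default_rng(1)
X = rng.normal(size=(100_000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + rng.normal(scale=0.1, size=100_000)
print(sgd(X, y))
```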
Mini-Batch Gradient Descent
- Batch gradient descent: Use all m examples in each iteration
- Stochastic gradient descent: Use 1 example in each iteration
- Mini-batch gradient descent: Use b examples in each iteration
- typical range for b = 2-100 (b = 10 is a common choice)
- Mini-batch gradient descent allows a vectorized implementation
- can partially parallelize the computation over the b examples (see sketch below)
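A mini-batch sketch under the same assumptions; the gradient over each batch of b examples is one vectorized matrix product:

```python
# Mini-batch gradient descent with b examples per update.
import numpy as np

def minibatch_gd(X, y, alpha=0.01, b=10, epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    m, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(m)
        for start in range(0, m, b):
            idx = order[start:start + b]
            Xb, yb = X[idx], y[idx]                      # b examples at a time
            grad = Xb.T @ (Xb @ theta - yb) / len(idx)   # vectorized gradient over the batch
            theta -= alpha * grad
    return theta
```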
Advanced Topics
Stochastic G.D. convergence
- every 1000 iterations, plot the cost averaged over the last 1000 examples
- Learning Rate: a smaller learning rate means smaller oscillations around the minimum (plot)
- averaging over more examples (e.g. 5000) may give a smoother curve; if the curve is increasing, use a smaller learning rate
- Learning Rate: to make theta converge, can slowly decrease it over time
alpha = const1 / ( iterationNumber + const2 )   (see sketch below)
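A sketch of the convergence check plus the decaying learning rate; const1, const2, and the other numbers are illustrative, not tuned values from the lecture:

```python
# Plot the cost averaged over the last 1000 examples and decay
# alpha = const1 / (iterationNumber + const2) as training proceeds.
import numpy as np
import matplotlib.pyplot as plt

def sgd_monitored(X, y, const1=1000.0, const2=100_000.0, epochs=2, window=1000, seed=0):
    rng = np.random.default_rng(seed)
    m, d = X.shape
    theta = np.zeros(d)
    recent, averaged = [], []
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(m):
            t += 1
            alpha = const1 / (t + const2)          # slowly decreasing learning rate
            err = X[i] @ theta - y[i]
            recent.append(0.5 * err ** 2)          # cost on this example, before updating
            theta -= alpha * err * X[i]
            if t % window == 0:                    # every 1000 iterations...
                averaged.append(np.mean(recent))   # ...record cost averaged over last 1000 examples
                recent = []
    plt.plot(averaged)
    plt.xlabel("iterations (x1000)"); plt.ylabel("avg cost over last 1000 examples")
    plt.show()
    return theta

# usage: theta = sgd_monitored(X, y) with X of shape (m, d) and y of shape (m,)
```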
Online Learning
- continuous stream of data
- e.g. 1. shipping service, from origin and destination, optimize the price we offer
- x = feature vector (price, origin, destination)
- y = whether they chose to use our service or not
- e.g. 2. product search
- input: “Android phone 1080p camera”
- we want to offer 10 phones per query
- learn the predicted click-through rate (CTR) for each phone and show the 10 most likely to be clicked (see sketch below)
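An online-learning sketch, assuming logistic regression over an already-encoded feature vector x; the same single-example update fits both examples (did the user accept the offered shipping price, or predicted CTR for a search result):

```python
# Online learning: update the model once per incoming example, then discard it.
# No fixed training set is stored, so the model adapts as user preferences drift.
import numpy as np

theta = np.zeros(3)   # weights for features such as (price, origin, destination), numerically encoded
alpha = 0.05

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def on_new_example(x, y):
    """x: feature vector for one user/offer, y: 1 if they used the service (or clicked), else 0."""
    global theta
    pred = sigmoid(x @ theta)          # predicted probability (e.g. predicted CTR)
    theta -= alpha * (pred - y) * x    # single gradient step on this example only

# simulated stream of examples
rng = np.random.default_rng(0)
for _ in range(10_000):
    x = rng.normal(size=3)
    y = int(rng.random() < sigmoid(x @ np.array([-1.0, 0.5, 0.2])))
    on_new_example(x, y)
print(theta)
```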
Map Reduce and Data Parallelism
- Hadoop
- split the data across machines; each uses its local CPU to look at its local data (e.g. compute a partial gradient sum) and a central node combines the results (see sketch below)
- Massive data parallelism
- Free text, unstructured data
- sentiment analysis
- NoSQL
- MongoDB
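A map-reduce style sketch, with local processes standing in for the cluster machines (illustrative, not a Hadoop job): each worker returns the partial gradient sum for its shard (map), and the central node adds them (reduce).

```python
# Data-parallel batch gradient descent: the gradient is a sum over examples,
# so each worker sums over its shard and the partial sums are combined centrally.
import numpy as np
from multiprocessing import Pool

def partial_gradient(args):
    """Map step: gradient sum of the linear-regression cost over one data shard."""
    X_shard, y_shard, theta = args
    return X_shard.T @ (X_shard @ theta - y_shard)

def batch_step(X, y, theta, alpha=0.01, n_workers=4):
    shards = list(zip(np.array_split(X, n_workers),
                      np.array_split(y, n_workers),
                      [theta] * n_workers))
    with Pool(n_workers) as pool:
        partials = pool.map(partial_gradient, shards)   # map: one partial sum per worker
    grad = sum(partials) / len(y)                       # reduce: combine on the central node
    return theta - alpha * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100_000, 5))
    y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + rng.normal(scale=0.1, size=100_000)
    theta = np.zeros(5)
    for _ in range(50):
        theta = batch_step(X, y, theta)
    print(theta)
```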
Sources