Subscribe / Unsubscribe Enewsletters | Login | Register

Pencil Banner

Machine learning for Java developers

Gregor Roth | Sept. 18, 2017
Set up a machine learning algorithm and develop your first prediction function in Java.

public static double cost(Function<Double[], Double> targetFunction,
                          List<Double[]> dataset,
                          List<Double> labels) {
   int m = dataset.size();
   double sumSquaredErrors = 0;

   // calculate the squared error ("gap") for each training example and add it to the total sum
   for (int i = 0; i < m; i++) {
      // get the feature vector of the current example
      Double[] featureVector = dataset.get(i);
      // predict the value and compute the error based on the real value (label)
      double predicted = targetFunction.apply(featureVector);
      double label = labels.get(i);
      double gap = predicted - label;
      sumSquaredErrors += Math.pow(gap, 2);

   // calculate and return the mean value of the errors (the smaller the better)
   return (1.0 / (2 * m)) * sumSquaredErrors;

Training the target function

Although the cost function helps to evaluate the quality of the target function and theta parameters, respectively, you still need to compute the best-fitting theta parameters. You can use the gradient descent algorithm for this calculation.

Gradient descent

Gradient descent minimizes the cost function, meaning that it's used to find the theta combinations that produces the lowest cost (J(θ)) based on the training data.

Here is a simplified algorithm to compute new, better fitting thetas:

f4 gradient regression

Within each iteration a new, better value will be computed for each individual θ parameter of the theta vector. The learning rate α controls the size of the computing step within each iteration. This computation will be repeated until you reach a theta values combination that fits well. As an example, the linear regression function below has three theta parameters:

f5 example linear regression

Within each iteration a new value will be computed for each theta parameter: θ0, θ1, and θ2 in parallel. After each iteration, you will be able to create a new, better-fitting instance of the LinearRegressionFunction by using the new theta vector of 0, θ1, θ2}.

Listing 4 shows Java code for the gradient descent algorithm. The thetas of the regression function will be trained using the training data, data labels, and the learning rate (α). The output of the function is an improved target function using the new theta parameters. The train() method will be called again and again, and fed the new target function and the new thetas from the previous calculation. These calls will be repeated until the tuned target function's cost reaches a minimal plateau:


Previous Page  1  2  3  4  5  6  7  8  9  10  11  Next Page 

Sign up for Computerworld eNewsletters.