I am in the process of updating my deep learning course and books to make use of Keras. This post contains some of the basic examples that I put together. It is not meant to be an introduction to neural networks in general. For such an introduction, refer to either my books or this article.
I will expand on these examples greatly for both the book and course. The basic neural network operations that I needed were:
Simple Regression
Regression Early Stopping
Simple Classification
Classification Early Stopping
Deep Neural Networks w/Dropout and Other Regularization
Convolutional Neural Networks
LSTM Neural Networks
Loading/Saving Neural Networks
These are some of the most basic operations that I need to perform when working with a new neural network package. Together they give me a sort of Rosetta Stone for that package. Once I have these operations, I can more easily create additional, more complex examples.
The first thing to check is what versions you have of the required packages:
import keras
import tensorflow as tf
import sys
import sklearn as sk
import pandas as pd
import numpy as np
import pandas as pd
from sklearn import preprocessing

# Encode text values to dummy variables (i.e. [1,0,0],[0,1,0],[0,0,1] for red,green,blue)
def encode_text_dummy(df, name):
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        dummy_name = "{}-{}".format(name, x)
        df[dummy_name] = dummies[x]
    df.drop(name, axis=1, inplace=True)

# Encode text values to indexes (e.g. [0],[1],[2]; LabelEncoder numbers the sorted classes from 0).
def encode_text_index(df, name):
    le = preprocessing.LabelEncoder()
    df[name] = le.fit_transform(df[name])
    return le.classes_

# Convert all missing values in the specified column to the median
def missing_median(df, name):
    med = df[name].median()
    df[name] = df[name].fillna(med)

# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    # Every column except the target is a predictor.
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)

    # find out the type of the target column. Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type

    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df[result].values.astype(np.float32), dummies.values.astype(np.float32)
    else:
        # Regression
        return df[result].values.astype(np.float32), df[[target]].values.astype(np.float32)
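To make the difference between index encoding and dummy (one-hot) encoding concrete, here is a minimal NumPy-only sketch. It does not call the helper functions above, and the `colors` array is made-up sample data rather than anything from the original datasets.

```python
import numpy as np

# Hypothetical sample column of text values (not from the original data).
colors = np.array(["red", "green", "blue", "green"])

# Index encoding: np.unique returns the sorted classes and, with
# return_inverse=True, the position of each value within those classes.
classes, indexes = np.unique(colors, return_inverse=True)

# Dummy (one-hot) encoding: select rows of the identity matrix by index,
# so each row has exactly one 1.0 in the column of its class.
one_hot = np.eye(len(classes), dtype=np.float32)[indexes]

print(classes)   # the sorted class names
print(indexes)   # one integer per input value
print(one_hot)   # one column per class
```

Index encoding keeps a single column but implies an ordering between classes, while one-hot encoding avoids that at the cost of one column per class.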
Simple Regression
Regression is where a neural network accepts several values (predictors) and produces a prediction that is numeric. In this simple example we attempt to predict the miles per gallon (MPG) of several cars based on characteristics of those cars. Several parameters are used below and described here.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
Now that the neural network is trained, we will test how good it is and perform some sample predictions.
pred = model.predict(x)

# Measure RMSE error. RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print("Final score (RMSE): {}".format(score))

# Sample predictions
for i in range(10):
    print("{}. Car name: {}, MPG: {}, predicted MPG: {}".format(i+1,cars[i],y[i],pred[i]))
Final score (RMSE): 3.494112253189087
1. Car name: chevrolet chevelle malibu, MPG: [ 18.], predicted MPG: [ 14.94231319]
2. Car name: buick skylark 320, MPG: [ 15.], predicted MPG: [ 14.08107567]
3. Car name: plymouth satellite, MPG: [ 18.], predicted MPG: [ 15.15124226]
4. Car name: amc rebel sst, MPG: [ 16.], predicted MPG: [ 15.84413433]
5. Car name: ford torino, MPG: [ 17.], predicted MPG: [ 15.11468124]
6. Car name: ford galaxie 500, MPG: [ 15.], predicted MPG: [ 10.48310184]
7. Car name: chevrolet impala, MPG: [ 14.], predicted MPG: [ 10.11642265]
8. Car name: plymouth fury iii, MPG: [ 14.], predicted MPG: [ 10.33946323]
9. Car name: pontiac catalina, MPG: [ 14.], predicted MPG: [ 10.317276]
10. Car name: amc ambassador dpl, MPG: [ 15.], predicted MPG: [ 12.37194347]
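As a quick check on what RMSE measures, the following sketch recomputes it by hand with NumPy for only the ten sample rows printed above. The arrays are copied from that output; the variable names are my own.

```python
import numpy as np

# Actual and predicted MPG for the ten sample cars listed in the output.
actual = np.array([18, 15, 18, 16, 17, 15, 14, 14, 14, 15], dtype=np.float32)
predicted = np.array([14.94231319, 14.08107567, 15.15124226, 15.84413433,
                      15.11468124, 10.48310184, 10.11642265, 10.33946323,
                      10.317276, 12.37194347], dtype=np.float32)

# RMSE: square each error, average the squares, then take the square root.
rmse = np.sqrt(np.mean((predicted - actual) ** 2))
print("RMSE over the 10 sample rows: {:.3f}".format(rmse))
```

The value differs from the final score above because that score is computed over the entire dataset, not just these ten cars. Because the errors are squared before averaging, the larger misses (such as the ford galaxie 500) dominate the result.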
Regression (Early Stop)
Early stopping sets aside a part of the data to be used to validate the neural network. The neural network is trained with the training data and validated with the validation data. Once the error no longer improves on the validation set, the training stops. This prevents the neural network from overfitting.
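The stopping rule itself is simple enough to sketch in plain Python. This is only an illustration of the idea, not the Keras implementation: the function name and the sample losses are hypothetical, and the `patience` and `min_delta` parameters are modeled on the arguments of the same name accepted by the Keras `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")   # best validation loss seen so far
    wait = 0              # epochs since the last real improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Validation loss improved by more than min_delta: reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # never ran out of patience

# Hypothetical validation losses: improving at first, then flat or worse.
losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.66, 0.67, 0.66, 0.68, 0.70]
print(early_stopping_epoch(losses, patience=3))
```

A larger `patience` tolerates more noisy epochs before giving up, at the cost of training longer after the validation error has stopped improving.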
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
Early stopping can also be used with classification. As in the regression example, part of the data is set aside to validate the neural network, and training stops once the error on the validation set no longer improves.
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.callbacks import EarlyStopping
Show the raw predictions (the probability of each class).
# Print out the raw predictions. Because there are 3 species of iris, there are 3 columns. The number in each column is
# the probability that the flower is that type of iris.
np.set_printoptions(suppress=True)
pred = model.predict(x_test)
print(pred[0:10])

# The to_xy function represented the input in the same way. Each row has only one 1.0 value because each row is only one type
# of iris. This is the training data; we KNOW what type of iris it is. This is called one-hot encoding. Only one value
# is 1.0 (hot).

# Using the predictions (pred) and the known one-hot encodings (y_test) we can compute the log-loss error.
# The lower the log loss, the better. The probabilities (pred) from the previous section specify how sure the neural network
# is of its prediction. Log loss punishes the neural network (with a higher loss) for very confident, but wrong,
# classifications.
from sklearn.metrics import log_loss
print(log_loss(y_test,pred))
0.133210815783
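To see why log loss penalizes confident mistakes so heavily, here is a small hand-rolled version in NumPy. It is a sketch for illustration and not the scikit-learn implementation; the function name and the two tiny prediction arrays are hypothetical.

```python
import numpy as np

def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    # Clip the probabilities so that log(0) can never occur.
    p = np.clip(y_prob, eps, 1 - eps)
    # With one-hot y_true, only the probability assigned to the
    # true class survives the multiplication in each row.
    return float(-np.mean(np.sum(y_true * np.log(p), axis=1)))

# Two flowers, three classes, one-hot encoded truth.
y_true = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32)

# Same level of confidence (0.9), once on the right class and once on a wrong one.
confident_right = np.array([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]])
confident_wrong = np.array([[0.05, 0.9, 0.05], [0.9, 0.05, 0.05]])

print(multiclass_log_loss(y_true, confident_right))  # small loss
print(multiclass_log_loss(y_true, confident_wrong))  # much larger loss
```

The loss for each row is the negative log of the probability given to the correct class, so it grows without bound as that probability approaches zero.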
# Usually the column (pred) with the highest prediction is considered to be the prediction of the neural network. It is easy
# to convert the predictions to the expected iris species. The argmax function finds the index of the maximum prediction
# for each row.

# Accuracy might be a more easily understood error metric. It is essentially a test score. For all of the iris predictions,
# what percent were correct? The downside is it does not consider how confident the neural network was in each prediction.
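The two steps described in these comments, argmax followed by accuracy, can be sketched with NumPy alone. The probabilities and labels below are made-up values for four flowers, not output from the trained network.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 flowers and 3 species.
pred = np.array([[0.8, 0.1, 0.1],
                 [0.2, 0.7, 0.1],
                 [0.1, 0.3, 0.6],
                 [0.5, 0.4, 0.1]])

# One-hot encoded true species for the same 4 flowers.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# argmax along axis 1 gives the index of the largest value in each row.
predicted_class = np.argmax(pred, axis=1)
true_class = np.argmax(y_true, axis=1)

# Accuracy is the fraction of rows where the two indexes agree.
accuracy = np.mean(predicted_class == true_class)
print(predicted_class, true_class, accuracy)
```

Note that the last flower counts as fully wrong even though the network gave the correct class a probability of 0.4; this is the loss of confidence information mentioned above.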
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from keras.callbacks import EarlyStopping
from keras.layers import Dense, Dropout
from keras import regularizers
model.fit(x,y,validation_data=(x_test,y_test),callbacks=[monitor],verbose=0,epochs=1000)
pred = model.predict(x_test)

# Measure RMSE error. RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print("Final score (RMSE): {}".format(score))
Epoch 00064: early stopping
Final score (RMSE): 4.421816825866699
The Classic MNIST Dataset
The next examples will use the MNIST digits dataset. The previous examples loaded their training data from CSV files. Most neural network frameworks, including Keras, have common training sets built in. This makes it easy to run the example, but harder to adapt the example to your own data, which is not likely to be built into Keras. However, the MNIST data is complex enough that discussing how to load it is beyond the scope of this article, so we will use the MNIST data built into Keras.
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print("Shape of x_train: {}".format(x_train.shape))
print("Shape of y_train: {}".format(y_train.shape))
print()
print("Shape of x_test: {}".format(x_test.shape))
print("Shape of y_test: {}".format(y_test.shape))
Shape of x_train: (60000, 28, 28)
Shape of y_train: (60000,)

Shape of x_test: (10000, 28, 28)
Shape of y_test: (10000,)
# Display as image
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

digit = 101 # Change to choose new digit

a = x_train[digit]
plt.imshow(a, cmap='gray', interpolation='nearest')
print("Image (#{}): Which is digit '{}'".format(digit,y_train[digit]))
Image (#101): Which is digit '7'
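Arrays of this shape usually need a little preparation before they can be fed to a neural network. The sketch below shows one common approach, under two assumptions of mine: a channels-last image layout, and one-hot encoded labels. It uses a small random array with the same per-image shape as MNIST so that it runs without downloading anything.

```python
import numpy as np

# Stand-in for a small batch of MNIST images: 5 images of 28x28 uint8 pixels.
# (Random data, used only so the sketch runs without downloading MNIST.)
rng = np.random.default_rng(0)
x_batch = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
y_batch = np.array([7, 2, 1, 0, 4])  # hypothetical digit labels

# Add a trailing channel axis (grayscale = 1 channel) and scale pixels to [0, 1].
x_prepared = x_batch.reshape(-1, 28, 28, 1).astype(np.float32) / 255.0

# One-hot encode the 10 digit classes.
y_prepared = np.eye(10, dtype=np.float32)[y_batch]

print(x_prepared.shape, y_prepared.shape)
```

Scaling to the range 0 to 1 keeps the inputs small, which generally helps training; the extra axis is what a 2D convolution layer expects for single-channel images in a channels-last layout.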
Convolutional Neural Networks
Convolutional neural networks are designed primarily for images. They have been applied to other kinds of data, but such uses are less common.
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
Long Short Term Memory (LSTM) is typically used for either time series or natural language processing (which can be thought of as a special case of time series).
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
import numpy as np

max_features = 4 # 0,1,2,3 (total of 4)
x = [
    [[0],[1],[1],[0],[0],[0]],
    [[0],[0],[0],[2],[2],[0]],
    [[0],[0],[0],[0],[3],[3]],
    [[0],[2],[2],[0],[0],[0]],
    [[0],[0],[3],[3],[0],[0]],
    [[0],[0],[0],[0],[1],[1]]
]
x = np.array(x,dtype=np.float32)
y = np.array([1,2,3,2,3,1],dtype=np.int32)
c:\users\jeffh\anaconda3\envs\tf-latest\lib\site-packages\ipykernel\__main__.py:27: UserWarning: The `input_dim` and `input_length` arguments in recurrent layers are deprecated. Use `input_shape` instead.
c:\users\jeffh\anaconda3\envs\tf-latest\lib\site-packages\ipykernel\__main__.py:27: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, input_shape=(None, 1), recurrent_dropout=0.2, dropout=0.2)`
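The `input_shape=(None, 1)` in the warning makes more sense once the shape of the training data is printed. Recurrent layers in Keras take 3-D input of the form (samples, time steps, features per step). The sketch below rebuilds the same `x` and `y` arrays and inspects them; the one-hot encoding of `y` at the end is my assumption about how the labels are prepared for a 4-class output, since that part of the example is not shown here.

```python
import numpy as np

max_features = 4  # 0,1,2,3 (total of 4)
x = [
    [[0], [1], [1], [0], [0], [0]],
    [[0], [0], [0], [2], [2], [0]],
    [[0], [0], [0], [0], [3], [3]],
    [[0], [2], [2], [0], [0], [0]],
    [[0], [0], [3], [3], [0], [0]],
    [[0], [0], [0], [0], [1], [1]],
]
x = np.array(x, dtype=np.float32)
y = np.array([1, 2, 3, 2, 3, 1], dtype=np.int32)

# 6 sequences, 6 time steps each, 1 feature per time step.
print(x.shape)

# Assumed label preparation: one column per class, a single 1.0 per row.
y_one_hot = np.eye(max_features, dtype=np.float32)[y]
print(y_one_hot.shape)
```

The `None` in `input_shape=(None, 1)` leaves the number of time steps unspecified, while the `1` matches the single feature per step in `x`.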
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.models import load_model
# Measure RMSE error. RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print("Before save score (RMSE): {}".format(score))

# save neural network structure to JSON (no weights)
model_json = model.to_json()
with open("network.json", "w") as json_file:
    json_file.write(model_json)

# save neural network structure to YAML (no weights)
model_yaml = model.to_yaml()
with open("network.yaml", "w") as yaml_file:
    yaml_file.write(model_yaml)

# save entire network to HDF5 (save everything, suggested)
model.save("network.h5")
Before save score (RMSE): 3.276093006134033
from keras.models import load_model

model2 = load_model('network.h5')

# Predict with the reloaded network so the score reflects the saved model.
pred = model2.predict(x)

# Measure RMSE error. RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print("After load score (RMSE): {}".format(score))