Deep Learning Demo with Caffe on Amazon EC2 Ubuntu 16.04 LTS

With so many deep learning (DL) libraries available (Caffe, DeepLearning4j, and H2O, to name a few) and so many variations in environment, such as operating system and computing platform (CPU, GPU, or distributed computing), it is very easy to feel lost when you start to get your hands dirty with DL. This tutorial is for you if you wish to get started with practical deep learning using only a decent internet connection and an average laptop or desktop.

We will discuss setting up Caffe on an Ubuntu 16.04 machine in the Amazon EC2 cloud.

Why choose Caffe as a DL starting point?

There are many reasons, but the most important one for me is that Caffe lets you get a feel for large models even when you lack the computational power, specifically the GPU computing, that most DL frameworks expect.

I found that it is possible to download GPU-trained models and do at least preliminary experiments with them on a CPU-only machine with Caffe. The following are the simple steps to get a basic DL setup with just a Windows laptop and an internet connection.

1. Setting up an Amazon EC2 Ubuntu 16.04 instance

I mostly followed the Amazon documentation.

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html

But if you are a total newbie to setting up a free cloud instance, here are the steps I followed.

Sign up for Amazon EC2. You need a debit or credit card. If you wish to experiment for free, make sure that the steps you follow are covered by the Amazon "Free Tier". If you are wary of inadvertently using Amazon's paid services, make sure to set up billing alerts. If you are a research student, it would be better to apply for research credits.
Create an instance. I chose an Ubuntu 16.04 LTS instance.
Launch the instance from the EC2 console web interface.
Once the instance is running, you can connect to it using PuTTY (if you have Windows on your laptop) or SSH (OpenSSH on Linux). For Linux, the private key file (.pem) created when you set up the Amazon instance is required. If you are using PuTTY, you first need to convert the .pem file to a PuTTY-specific .ppk file using PuTTYgen.
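For the OpenSSH route, the connection step might look like the sketch below. The helper name connect_ec2, the key file name, and the host name are placeholders of mine, not values from the Amazon guide. The login user ubuntu is the default on the official Ubuntu images.

```shell
# connect_ec2 KEY_FILE HOST
# Hypothetical helper: tightens the key permissions, then opens an SSH session.
connect_ec2() {
    key_file="$1"
    host="$2"
    # ssh refuses to use a private key that other users can read.
    chmod 400 "$key_file" || return 1
    # "ubuntu" is the default login user on official Ubuntu AMIs.
    ssh -i "$key_file" "ubuntu@$host"
}

# Example call (placeholder values, left commented out):
# connect_ec2 ~/my-ec2-key.pem ec2-XX-XX-XXX-XXX.compute.amazonaws.com
```

The public DNS name to pass as HOST is shown in the EC2 console next to the running instance.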

2. Setting up Caffe

For setting up Caffe, I mostly followed Caffe's official documentation on

https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide

After installing the requisite libraries mentioned in the guide, the next step is to run the following commands (from the ~/deep-learning/caffe directory) to build Caffe. ~/deep-learning/caffe is the directory where you unpacked the archive downloaded from https://github.com/BVLC/caffe.

make all

make test

make runtest

make pycaffe      

make distribute

I found that all of the above build steps work fine on the standard free tier Amazon EC2 instance except the step

$ make pycaffe
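The build sequence above can be wrapped so that it stops at the first failing target, which makes it obvious which step broke. This is only a sketch: build_caffe is a name I made up, and the directory in the example call is the one assumed in this post.

```shell
# build_caffe CAFFE_DIR
# Hypothetical wrapper: runs the make targets in order and stops at the
# first one that fails. The body runs in a subshell, so the caller's
# working directory is left unchanged.
build_caffe() (
    cd "$1" || exit 1
    for target in all test runtest pycaffe distribute; do
        echo "== make $target =="
        if ! make "$target"; then
            echo "make $target failed" >&2
            exit 1
        fi
    done
)

# Example call, left commented out because a full build takes a long time:
# build_caffe ~/deep-learning/caffe
```

The last "== make ... ==" line printed before the failure message names the target that needs attention.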

3. Issues while making pycaffe

Caffe is a C++ library that provides three interfaces for development: cmdcaffe (command line), pycaffe (Python), and matcaffe (MATLAB).

If the step $ make pycaffe takes a long time and eventually fails, you will see the following error, and as a result caffe.so will not be created.

g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report

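If you get this error, it is likely due to low memory (the Amazon EC2 free tier instance has 1 GB by default): the compiler process is killed when the machine runs out of memory. The issue can be fixed as per this post by adding a swap file of about 1 GB.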

free
sudo dd if=/dev/zero of=/var/swap.img bs=1024k count=1000
sudo chmod 600 /var/swap.img
sudo mkswap /var/swap.img
sudo swapon /var/swap.img
free

Another way to resolve this issue is to start an Amazon instance with more memory.
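Before starting the build, it can help to check how much memory the instance really has. The sketch below adds up RAM and swap from /proc/meminfo. The helper name mem_plus_swap_mb is hypothetical, and the 1800 MB threshold is my own rule of thumb, set slightly below the roughly 1 GB of RAM plus 1000 MB of swap used in this post. It is not a documented Caffe requirement.

```shell
# mem_plus_swap_mb [MEMINFO_FILE]
# Hypothetical helper: prints RAM + swap in MB, read from a file in
# /proc/meminfo format (the values there are in kB).
mem_plus_swap_mb() {
    awk '/^MemTotal:|^SwapTotal:/ { kb += $2 } END { print int(kb / 1024) }' \
        "${1:-/proc/meminfo}"
}

# Assumed threshold (see the text above); adjust it to your own experience.
threshold_mb=1800

if [ -r /proc/meminfo ]; then
    total_mb=$(mem_plus_swap_mb)
    if [ "$total_mb" -lt "$threshold_mb" ]; then
        echo "RAM + swap = ${total_mb} MB: consider adding swap before 'make pycaffe'"
    else
        echo "RAM + swap = ${total_mb} MB: likely enough for 'make pycaffe'"
    fi
fi
```

Running it once before and once after the swapon step shows whether the swap file was actually picked up.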

4. Testing the Caffe setup

For testing the Caffe setup, you can try different benchmark tasks. I followed the tutorial that explains how to run Google Inceptionism code on Caffe.

http://hanzratech.in/2015/07/27/installing-caffe-on-ubuntu.html

The Google Inceptionism code uses Caffe to generate images that help us visualize how the deep network learns (or what it dreams) with each iteration. The code for Inceptionism is discussed at

https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

You can clone the Inceptionism code on the Amazon EC2 instance by following the tutorial.

To run the Inceptionism code locally, the following commands can be used:

cd ~/deep-learning/deepdream
ipython notebook

The interactive notebook (by default) runs on a local notebook server and can be accessed at http://localhost:8888. But since we are setting up the notebook on the Ubuntu 16.04 box on Amazon EC2, it is not possible to open http://localhost:8888 in a browser there, as Amazon EC2 gives us only command-line access to the Ubuntu box and no X Window access.

If you wish to run interactive Python notebooks on a remote server (such as your Ubuntu instance on Amazon EC2), you need to set up an IPython/Jupyter notebook server.

5. Setting up the Jupyter notebook server

To start IPython (aka Jupyter) as a server, I followed this tutorial for starting Jupyter on the Amazon EC2 free tier.

https://gist.github.com/iamatypeofwalrus/5183133

The tutorial sets up the Amazon EC2 instance by configuring ports and security certificates.

To run an IPython notebook server on this instance, we need to open access to the standard port 8888. On the Amazon EC2 console, you will need to create a new security group that allows ports 22 (SSH), 443 (HTTPS), and 8888 (Jupyter notebook server) to be accessed from any IP address.
The next step is to configure the IPython notebook server.
A self-signed SSL certificate is first created so that the notebook server can be securely accessed over HTTPS.
The configuration for the notebook server, such as the location of the SSL certificate, the port, and the allowed hosts, is entered in ipython_config.py (please refer to the tutorial).
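The certificate and configuration steps might look roughly like the sketch below. It assumes openssl is installed and writes into a scratch directory. The file names, the certificate subject, and the 2048-bit key size are my assumptions, not values from the tutorial. The c.NotebookApp option names are those of the classic notebook server and may differ in newer Jupyter releases. Password setup is left out here.

```shell
# Scratch directory for this sketch; on a real instance you would pick a
# permanent location such as ~/certificates (an assumed path).
work_dir=$(mktemp -d)
key_file="$work_dir/mykey.key"
cert_file="$work_dir/mycert.pem"
config_file="$work_dir/ipython_config.py"

# Create a self-signed certificate that is valid for 365 days.
# -nodes leaves the private key unencrypted so the server can start
# without a passphrase prompt; -subj avoids the interactive questions.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -subj "/CN=notebook-server" \
    -keyout "$key_file" -out "$cert_file"

# Write a minimal notebook server configuration (Python syntax).
cat > "$config_file" <<EOF
c = get_config()
c.NotebookApp.certfile = u'$cert_file'  # certificate used for HTTPS
c.NotebookApp.keyfile = u'$key_file'    # matching private key
c.NotebookApp.ip = '*'                  # listen on all network interfaces
c.NotebookApp.open_browser = False      # the instance has no desktop browser
c.NotebookApp.port = 8888               # must match the port opened in the security group
EOF

echo "certificate: $cert_file"
echo "config:      $config_file"
```

Because the certificate is self-signed, the browser will show a security warning the first time you open the notebook URL.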

Now we can start the interactive Python server with the following command:

jupyter notebook ./ inline --config="~/.ipython/profile_nbserver/ipython_config.py"

If you wish your notebook server to stay on all the time, run the above command in background mode by appending &.
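A plain & may not be enough if the server should survive closing the SSH session, since the shell can send a hang-up signal to its background jobs when you log out. A common approach is to use nohup and redirect the output to a log file. The helper below is a sketch: start_detached and the log file name are hypothetical.

```shell
# start_detached LOG_FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background, immune to hang-up
# signals, with stdout and stderr collected in LOG_FILE.
# The process ID is stored in DETACHED_PID so the job can be stopped later.
start_detached() {
    log_file="$1"
    shift
    nohup "$@" > "$log_file" 2>&1 &
    DETACHED_PID=$!
}

# Example call, left commented out (add the options you normally pass):
# start_detached "$HOME/notebook.log" jupyter notebook
# kill "$DETACHED_PID"   # stops the server again
```

The log file then holds the server's start-up messages, which is handy when the notebook page does not load.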

6. Testing Google Inceptionism code

For testing, you can access your Amazon EC2 instance from a browser either by its public IP address or by its public DNS name. You can find the public IP of your instance in the EC2 console.

If XX.XX.XXX.XXX is your public IP address, you can use either of the following URLs to see the listing of all IPython notebooks:

https://ec2-XX-XX-XXX--XXX-south-1.compute.amazonaws.com:8888

https://XX.XX.XXX.XXX:8888

If you have configured the default IPython notebook directory for the Caffe demo, you will see the directory tree showing all notebooks. You can choose dream.ipynb to see the Inceptionism code in action.

Please drop me an email if you wish to play around with the Inceptionism code on my Amazon EC2 instance.

7. References

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html

http://hanzratech.in/2015/07/27/installing-caffe-on-ubuntu.html

https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide

https://gist.github.com/iamatypeofwalrus/5183133

https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
