## 20150725

### Caffe Install Notes

I have spent all too much time trying to get Caffe running on a CentOS-based cluster that I use. I was hoping this would be a straightforward process. Suffice it to say, it has not been. Not even close. None of the problems I encountered were particularly challenging to solve; the complication came from the fact that I ran into one hurdle after another. I should note, however, that installing Caffe on my personal machine, which runs Linux Mint 17.1, went smoothly.

I'm writing this post as a record of the problems I encountered and the solutions I used. Unfortunately, because I wasn't expecting to need to write this, I may be missing details. If you notice something missing please feel free to leave a comment and I will update this document. Similarly, if you know of better ways to solve any of these problems please feel free to share. I will not likely test the solutions myself unless I need to reinstall for some reason, so your comments would be largely for posterity. One thing to note is that some, if not many, of the problems I encountered may be rather specific to the cluster I'm using. My apologies if the following doesn't address the problem you are experiencing.

The first problem I ran into was with protobuf. The problem was related to the members of a union being declared constant in src/google/protobuf/util/internal/datapiece.h. Specifically, the union defining the members i32_, i64_, u32_, u64_, double_, float_, bool_, and str_. This problem appears to be fairly common according to a quick Google search, and the fix is as simple as removing the const keyword. However the error itself can be a bit misleading, as it doesn't lead one specifically to the offending lines.

The next problem I encountered was related to glog. Specifically, it requires a newer version of autotools than was installed on the system I'm using. To solve this I performed a user space install of autoconf and automake. It proved tricky to install autoconf for reasons I still don't understand. Thanks to the cluster admins I was finally able to do so using these instructions. Automake simply required downloading the tarball, configuring for my home directory, and then performing make followed by make install. Unfortunately glog was still not happy! The included makefile hardcoded aclocal-1.14 while the upgraded automake provided aclocal-1.15. Bah! Executing autoreconf -ivf fixed that issue though. I am not completely certain this is a good solution, however, as I have not used autoreconf before.

The next hurdle was with gflags. Evidently it requires cmake, which was not already installed. Downloading and installing cmake resolved this problem. One thing to note when performing a user space install of cmake is that you need to use the --prefix flag when calling the bootstrap script to indicate that you want cmake installed in your home directory. I also found that I had to compile with -fPIC, so the complete cmake command I ended up using was CXXFLAGS="-fPIC" cmake -DCMAKE_INSTALL_PREFIX=~ .. in a build subdirectory of the repository.

I also had to install OpenCV. I didn't run into any trouble here. But a word of warning for those that have never built it before -- it takes a very long time!

Next up was leveldb. In this case I just cloned the github repository and ran make in it. From there Caffe needs to be told where to find the header files and shared objects that were built. I did so by appending to the INCLUDE_DIRS and LIBRARY_DIRS lists, respectively, in caffe/Makefile.config. The headers are in the include subdirectory of the repository while the libraries are placed at the root of the repository.

From there I found I needed lmdb. This library is developed under OpenLDAP. At the time of this writing they offer a github repository with just the lmdb code so I cloned it and built the library. From there I updated the INCLUDE_DIRS and LIBRARY_DIRS lists in caffe/Makefile.config to point to libraries/liblmdb within the repository.
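For concreteness, the resulting edits to caffe/Makefile.config look roughly like the following. The exact paths are assumptions based on cloning both repositories into the home directory; adjust them to wherever you actually built the libraries.

```
# Hypothetical locations -- adjust to match where you cloned and built each library.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include $(HOME)/leveldb/include $(HOME)/lmdb/libraries/liblmdb
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib $(HOME)/leveldb $(HOME)/lmdb/libraries/liblmdb
```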

Next I had to install the Google snappy library. In this case I had to get the tarball from the Google Code repository, not the github repository. For reasons I don't know or really care about, it seems that build files are missing from the github repository.

The last problem I ran into was related to Atlas. I had previously performed a user space install of it but did not add the resulting lib directory to my LD_LIBRARY_PATH and LIBRARY_PATH environment variables. Doing so finally allowed me to execute make all in the Caffe repository and have it complete without errors. It took a while though, in part because I was using a single thread since I never knew what it would stumble over next. As such I advise throwing more threads at it by instead using the command make all -jX, where X is the number of threads you want it to use.

One final note on the installation. Don't forget to add the appropriate atlas, leveldb, and liblmdb directories to LD_LIBRARY_PATH in your .bashrc.
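For example, the lines added to .bashrc might look like the following. The install locations shown here are assumptions based on cloning everything into the home directory; substitute the directories you actually used.

```shell
# Hypothetical install locations -- adjust to match your setup.
export LD_LIBRARY_PATH=$HOME/atlas/lib:$HOME/leveldb:$HOME/lmdb/libraries/liblmdb:$LD_LIBRARY_PATH
```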

At this point I'm really hoping it was worth the effort to install Caffe. Comparatively the Theano and Pylearn2 installations were so much easier on this same system.

## 20150421

### Linux Mint 17.1, Nvidia, CUDA, and cuDNN

I recently replaced a Titan X, which was on loan, with a GTX 980. After messing with drivers for nearly a day I was able to get my dual monitor setup running again. Unfortunately whatever I did freaked out Theano, yielding the error:
```
export LD_LIBRARY_PATH=/usr/local/cuda-7.0/lib64:$LD_LIBRARY_PATH
export CUDA_ROOT=/usr/local/cuda-7.0
```

#### cuDNN

Instructions for installing cuDNN for use with Theano can be found on the cuDNN page of deeplearning.net. I used the first method currently suggested on that page, which is to copy *.h to $CUDA_ROOT/include and *.so* to $CUDA_ROOT/lib64.

## 20141221

### Monitoring Experiments in Pylearn2

In an earlier post I covered the basics of running experiments in Pylearn2. However I only covered the bare minimum commands required, leaving out many details. One fundamental concept to running experiments in Pylearn2 is knowing how to monitor their progress, or "monitoring" for short. In this tutorial we will look at two forms of monitoring: the basic form, which is always performed, and a new approach for real-time remote monitoring.

#### Basic Monitoring

We will build upon the bare-bones example from the previous tutorial, which means we will be using the MNIST dataset. Most datasets have two or three parts. At a minimum they have a part for training and a part for testing. If a dataset has a third part, its purpose is validation: measuring the performance of our learner without unduly biasing the learner towards the dataset. Pylearn2 performs monitoring at the end of each epoch, and it can monitor any combination of the parts of the dataset.

When using Stochastic Gradient Descent (SGD) as the training algorithm, one uses the monitoring_dataset parameter to specify which parts of the dataset are to be monitored. For example, if we are only interested in monitoring the training set, we would add the following entry to the SGD parameter dictionary:

```yaml
monitoring_dataset: {
    'train': !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' }
}
```

This will instruct Pylearn2 to calculate statistics about the performance of our learner using the training part of the dataset at the end of each epoch.
This will change the default output after each epoch from:

```
Monitoring step:
    Epochs seen: 1
    Batches seen: 1875
    Examples seen: 60000
Time this epoch: 2.875921 seconds
```

to:

```
Monitoring step:
    Epochs seen: 0
    Batches seen: 0
    Examples seen: 0
    learning_rate: 0.0499996989965
    total_seconds_last_epoch: 0.0
    train_objective: 2.29713964462
    train_y_col_norms_max: 0.164925798774
    train_y_col_norms_mean: 0.161361783743
    train_y_col_norms_min: 0.158035755157
    train_y_max_max_class: 0.118635632098
    train_y_mean_max_class: 0.109155222774
    train_y_min_max_class: 0.103917405009
    train_y_misclass: 0.910533130169
    train_y_nll: 2.29713964462
    train_y_row_norms_max: 0.0255156457424
    train_y_row_norms_mean: 0.018013747409
    train_y_row_norms_min: 0.00823106430471
    training_seconds_this_epoch: 0.0
Time this epoch: 2.823628 seconds
```

The entries in the output (e.g. learning_rate, train_objective) are called channels. Channels give one insight into what the learner is doing. The two most frequently used are train_objective and train_y_nll. The channel train_objective reports the cost being optimized by training, while train_y_nll monitors the negative log likelihood of the current parameter values. In this particular example these two channels monitor the same thing, but this will not always be the case.

Monitoring the train part of the dataset is useful for debugging purposes. However it is not enough on its own to evaluate the performance of our learner, because the learner will likely always improve, and at some point it begins to overfit the training data. In other words, it will find parameters that work well on the data used to train it but not on data it has not seen during training. To combat this we use a validation set. MNIST does not explicitly reserve a part of the data for validation, but it has become a de facto standard to use the last 10,000 samples of the train part. To specify this one uses the start and stop parameters when instantiating MNIST.
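The role of the validation part can be made concrete with a toy sketch (plain Python, nothing Pylearn2-specific). Training loss typically keeps falling, but once the model overfits, the validation loss turns back upward; the minimum of the validation curve marks the natural stopping point:

```python
def best_epoch(valid_losses):
    """Return the index of the epoch with the lowest validation loss."""
    best = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < valid_losses[best]:
            best = epoch
    return best

# Toy loss curves: training falls monotonically, validation bottoms out at epoch 2.
train_losses = [2.3, 1.1, 0.6, 0.3, 0.1]
valid_losses = [2.3, 1.2, 0.9, 1.0, 1.4]
print(best_epoch(valid_losses))  # -> 2
```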
If we were only monitoring the validation set, our monitoring_dataset parameter to SGD would be:

```yaml
monitoring_dataset: {
    'valid': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 50000,
        stop: 60000
    }
}
```

Note that the key to the dictionary, 'valid' in this case, is merely a label. It can be whatever we choose. Each channel monitored for the associated dataset is prepended with this value.

It's also worth noting that we are not limited to monitoring just one part of the dataset. It is usually helpful to monitor both the train and validation parts of a dataset. This is done as follows:

```yaml
monitoring_dataset: {
    'train': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    'valid': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 50000,
        stop: 60000
    }
}
```

Note that here we use the start and stop parameters when loading both the train and valid parts to appropriately partition the dataset. We do not want the learner to validate on data from the train part, otherwise we will not be able to identify overfitting.

Putting it all together, our complete YAML now looks like:

```yaml
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 0.05,
        monitoring_dataset: {
            'train': *train,
            'valid': !obj:pylearn2.datasets.mnist.MNIST {
                which_set: 'train',
                start: 50000,
                stop: 60000
            }
        }
    }
}
```

Note that here we have used a YAML trick to reference a previously instantiated object to save ourselves typing. Specifically, the dataset has been tagged "&train", and when specifying monitoring_dataset the reference "*train" is used to identify the previously instantiated object.

#### Live Monitoring

There are two problems with the basic monitoring mechanism in Pylearn2.
First, the output is raw text. This alone can make it difficult to understand how the values of the various channels are evolving in time, especially when attempting to track multiple channels simultaneously. Second, due in part to the ability to add channels for monitoring, the amount of output after each epoch can and frequently does grow quickly. Combined, these problems make the basic monitoring mechanism difficult to use.

An alternative approach is to use a new mechanism called live monitoring. To be completely forthright, the live monitoring mechanism is something that I developed to combat the aforementioned problems. Furthermore I am interested in feedback regarding its user interface and what additional functionality people would like. Please feel free to send an e-mail to the Pylearn2 users mailing list or leave a comment below with feedback.

The live monitoring mechanism has two parts. The first part is a training extension, i.e. an optional plug-in that modifies the way training is performed. The second part is a utility class that can query the training extension for data about channels being monitored. Training extensions can be selected using the extensions parameter to the train object.
In other words, add the following to the parameters dictionary for the train object in any YAML:

```yaml
extensions: [
    !obj:pylearn2.train_extensions.live_monitoring.LiveMonitoring {}
]
```

The full YAML would look like:

```yaml
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 0.05,
        monitoring_dataset: {
            'train': *train,
            'valid': !obj:pylearn2.datasets.mnist.MNIST {
                which_set: 'train',
                start: 50000,
                stop: 60000
            }
        }
    },
    extensions: [
        !obj:pylearn2.train_extensions.live_monitoring.LiveMonitoring {}
    ]
}
```

The LiveMonitoring training extension listens for queries about channels being monitored. To perform queries one need only instantiate LiveMonitor and use its methods to request data. Currently it has three methods:

* list_channels: Returns a list of channels being monitored.
* update_channels: Retrieves data about the list of specified channels.
* follow_channels: Plots the data for the specified channels. This command blocks other commands from being executed because it repeatedly requests the latest data for the specified channels and redraws the plot as new data arrives.

To instantiate LiveMonitor, start ipython and execute the following commands:

```python
from pylearn2.train_extensions.live_monitoring import LiveMonitor
lm = LiveMonitor()
```

Each of the methods listed above returns a different message object. The data of interest is contained in the data member of that object.
As such, given an instance of LiveMonitor, one would view the channels being monitored as follows:

```python
print lm.list_channels().data
```

Which, if we're running the experiment specified by the YAML above, will yield:

```
['train_objective', 'train_y_col_norms_max', 'train_y_row_norms_min', 'train_y_nll',
 'train_y_col_norms_mean', 'train_y_max_max_class', 'train_y_min_max_class',
 'train_y_row_norms_max', 'train_y_misclass', 'train_y_col_norms_min',
 'train_y_row_norms_mean', 'train_y_mean_max_class', 'valid_objective',
 'valid_y_col_norms_max', 'valid_y_row_norms_min', 'valid_y_nll',
 'valid_y_col_norms_mean', 'valid_y_max_max_class', 'valid_y_min_max_class',
 'valid_y_row_norms_max', 'valid_y_misclass', 'valid_y_col_norms_min',
 'valid_y_row_norms_mean', 'valid_y_mean_max_class', 'learning_rate',
 'training_seconds_this_epoch', 'total_seconds_last_epoch']
```

From this we can pick channels to plot using follow_channels:

```python
lm.follow_channels(['train_objective', 'valid_objective'])
```

This command will display a graph like that in figure 1 and continually update the plot at the end of each epoch.

Figure 1: Example output from the follow_channels method of the LiveMonitor utility object.

The live monitoring mechanism is network aware, and by default it answers queries on port 5555 of any network interface on the computer wherein the experiment is being executed. It is not necessary for a user to know anything about networking to use live monitoring, however. By default the live monitoring mechanism assumes the experiment of interest is being executed on the same computer as the LiveMonitor utility class. If that is not the case, and one knows the IP address of the computer on which the experiment is running, then one need only specify that address when instantiating LiveMonitor. The live monitoring mechanism will automatically take care of the networking.

Live monitoring is also very efficient.
It only ever requests data it does not already have, and the underlying networking utility waits for new data without taking unnecessary CPU time.

The live monitoring mechanism has many benefits including:

* The ability to filter the channels being monitored.
* The ability to plot data for any given set of channels being monitored.
* The ability to retrieve data from an experiment in real-time.\*
* The ability to query for data from an experiment running on a remote machine.
* The ability to change which channels are being followed or plotted without restarting an experiment.

\* Updates only occur at the end of each epoch, but this is real-time with respect to Pylearn2 experiments.

#### Conclusion

Monitoring the progress of experiments in Pylearn2 is as easy as setting up an experiment. Monitoring is also very flexible, offering output both directly in the terminal as text and graphically via a training extension.

## 20141021

### Thoughts Regarding the Michael Jordan Interview on IEEE Spectrum

For the few that may not have already seen it, Dr. Michael Jordan was interviewed by IEEE Spectrum recently. He offers commentary on a number of topics including computer vision, deep learning, and big data. Overall I found the article to be an interesting read, though it seems to offer little new over what he said in his AMA on Reddit.

Ultimately I find myself agreeing with his position on computer vision. Even given the major strides we have made as of late with convnets and the like, we are still far from having a system as capable as we are at vision tasks. After all, the state-of-the-art challenge is the classification of just 1,000 classes of objects in high resolution images. This is a hard problem, but it is something that we humans, and many other animals, do trivially.

I am a bit torn about his perspective on deep learning, notably because of the statement "it's largely a rebranding of neural networks."
I have encountered this idea a couple of times now, but I argue that it is not accurate. It is true that neural networks are a favored tool amongst those in the deep learning community and that the strides made in the DL community have come while using NNs. But as Bengio et al. note in their forthcoming text Deep Learning, it "involves learning multiple levels of representation, corresponding to different levels of abstraction." Neural networks have been shown to do this, but it has not been shown that they are required to perform such a task. On the flip side, they are outperforming the other methods that could be used.

Another point that stood out to me was his comments on the singularity. I find myself waffling on this topic, and his comments help highlight the reason. Specifically, he points out that discussions of the singularity are more philosophical in nature. I rather enjoy philosophy. I often say that if I had another life I would be a mathematician, but if I had another one beyond that I would be a philosopher. More so than I am now, anyway. I meet so many AI/ML people that think the singularity folks are just crackpots. And if we are being honest, there do seem to be more than a reasonable proportion of crackpots in the community. However that does not prevent us from approaching the topic with sound and valid argumentation. We just have to be prepared to encounter those that cannot or choose not to.

Edit 2014-10-23: It appears Dr. Jordan was a bit displeased with the IEEE Spectrum interview, as he explains in Big Data, Hype, the Media and Other Provocative Words to Put in a Title. The long and short of it appears to be that he believes his perspective was intentionally distorted, for the reason that many of my colleagues have been discussing. Namely, the title, and arguably the intro, imply much stronger claims than his subsequent comments in the article seem to allude to. As such he felt the need to clarify his perspectives.
On the one hand I thought that a careful, critical read of the interview allowed one to pick out his perspective fairly well. But in reading his response there appear to be some things that came across just plain wrong. For instance, his opinion about whether we should be collecting and exploring these large data sets. In the interview he makes the great point that we must be cognizant of bad correlations that can and will likely arise. But in that context I did get the impression that he was arguing against doing it at all, i.e. collecting and analyzing such data sets, whereas in his response he argues that doing so can be a good thing because it can contribute to the development of principles that are currently missing.

As a side note, I find it interesting that he did not link to the interview itself. As if to say, let's not lend any more credibility to this article than is absolutely necessary.

## 20141018

### A First Experiment with Pylearn2

Vincent Dumoulin recently wrote a great blog post titled Your models in Pylearn2 that shows how to quickly implement a new model idea in Pylearn2. However Pylearn2 already has a fair number of models implemented. This post is meant to complement his by explaining how to set up and run a basic experiment using existing components in Pylearn2.

In this tutorial we will train a very simple single layer softmax regression model on MNIST, a database of handwritten digits. Softmax is a generalization of a binary predictor called logistic regression to the prediction of one of many classes. The task will be to identify which digit was written, i.e. classify the image into the classes 0-9. This same task is addressed in the Softmax regression Pylearn2 tutorial, and this post will borrow from it.
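As an aside, the softmax function mentioned above is simple enough to sketch in plain Python (a toy illustration, not Pylearn2's implementation); it maps a vector of raw class scores to probabilities that sum to one, and the predicted class is simply the index of the largest probability:

```python
import math

def softmax(scores):
    """Map raw class scores to class probabilities.

    Subtracting the maximum score before exponentiating is the standard
    trick to avoid overflow; it does not change the result.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # -> [0.659, 0.242, 0.099]
```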
However, Pylearn2 is feature rich, allowing one to control everything from which model to train and which dataset to train it on, to fine grained control over the training and the ability to monitor and save statistics about an experiment. For the sake of simplicity and understanding we will not be using most of these features, and as such this tutorial will be simpler.

#### YAML Syntax

A main goal of Pylearn2 is to make managing experiments quick and easy. To that end, a basic experiment can be executed by writing a description of the experiment in YAML (YAML Ain't Markup Language) and running the train script (pylearn2/scripts/train.py) on it. YAML is a data serialization language intended to be very sparse as compared to markup languages such as XML. A rundown of the features useful with Pylearn2 can be found in the document YAML for Pylearn2, and the full specification can be found on yaml.org in case you need to do something particularly out of the ordinary, like defining a tuple.

A Pylearn2 YAML configuration file identifies the object that will actually perform the training and the parameters it takes. I believe there is only one type of training object at the moment, so this is somewhat redundant, but it allows for easy incorporation of special training procedures. The existing training object takes a specification of the model to be trained, the dataset on which the model should be trained, and the object representing the algorithm that will actually perform the training.

Basic YAML syntax is extremely straightforward, and the only special syntax really needed for the simplest of experiments is the !obj: tag. This is a Pylearn2 custom tag that instructs Pylearn2 to instantiate a python object as specified immediately following the tag.
For example, the statement:

```yaml
!obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' }
```

results in the instantiation of the MNIST dataset class, found amongst the various Pylearn2 datasets in pylearn2.datasets in the file mnist.py. The python dictionary following the class path supplies the value 'train' for a parameter called which_set that identifies the portion (e.g. training or test) of the dataset that should be loaded. Note that the quotes around the value 'train' are required, as they indicate that the value is a string, which is the required data type for the which_set parameter.

It's important to note that any parameters required for the instantiation of a class must be provided in the associated dictionary. Check the Pylearn2 documentation for the class you need to instantiate to understand the available parameters and specifically which are required for the task you are attempting to perform.

#### Defining an Experiment

To define an experiment we need to define a train object and provide it a dataset object, a model object, and an algorithm object via its parameters dictionary. We have already seen how to instantiate the MNIST dataset class, so let's look next at the algorithm. The Pylearn2 algorithm classes are found in the training_algorithms sub-directory. In this example we are going to use stochastic gradient descent (SGD) because it is arguably the most commonly used algorithm for training neural networks. It requires only one parameter, namely learning_rate, and is instantiated as follows:

```yaml
!obj:pylearn2.training_algorithms.sgd.SGD { learning_rate: 0.05 }
```

The final thing we need to do before we can put it all together is define a model. The Pylearn2 model classes are located in the models sub-directory. The class we want is called SoftmaxRegression and is found in softmax_regression.py. In its most basic form we only need to supply four parameters:

* nvis: the number of visible units in the network, i.e. the dimensionality of the input.
* n_classes: the number of output units in the network, i.e. the number of classes to be learned.
* irange: the range from which the initial weights should be randomly selected. This is a symmetric range about zero, so it is only necessary to supply the upper bound.
* batch_size: the number of samples to be used simultaneously during training. Setting this to 1 results in pure stochastic gradient descent, whereas setting it to the size of the training set effectively results in batch gradient descent. Any value in between yields stochastic gradient descent with mini-batches of the specified size.

Using what we know, we can now construct the train object, and in effect the full YAML file, as follows:

```yaml
!obj:pylearn2.train.Train {
    dataset: !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD { learning_rate: 0.05 }
}
```

Note that a Pylearn2 YAML file can contain definitions for multiple experiments simultaneously. Simply stack them one after the other and they will be executed in order from top to bottom of the file.

#### Executing an Experiment

The final step is to run the experiment. Assuming the scripts sub-directory is in your path, we simply call train.py and supply the YAML file created above. Assuming that file is called basic_example.yaml and your current working directory contains it, the command would be:

```
train.py basic_example.yaml
```

Pylearn2 will load the YAML, instantiate the specified objects, and run the training algorithm on the model using the specified dataset. An example of the output from this YAML looks like:

```
dustin@Cortex ~/pylearn2_tutorials$ train.py basic_example.yaml
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.013530 seconds
Monitored channels:
Compiling accum...
Compiling accum done. Time elapsed: 0.000070 seconds
Monitoring step:
    Epochs seen: 0
    Batches seen: 0
    Examples seen: 0
Time this epoch: 0:02:18.271934
Monitoring step:
    Epochs seen: 1
    Batches seen: 1875
    Examples seen: 60000
Time this epoch: 0:02:18.341147
...
```
Note we have not told the training algorithm under what criteria it should stop, so it will run forever!
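To make the epoch and batch bookkeeping in the output above concrete, here is a toy pure-Python sketch (not Pylearn2's actual training loop) of how batch_size partitions an epoch into parameter updates:

```python
def minibatches(n_examples, batch_size):
    """Yield (start, stop) index pairs covering one epoch of data."""
    for start in range(0, n_examples, batch_size):
        yield start, min(start + batch_size, n_examples)

def updates_per_epoch(n_examples, batch_size):
    """One parameter update is performed per minibatch."""
    return (n_examples + batch_size - 1) // batch_size

# batch_size 1: pure stochastic gradient descent, one update per example.
print(updates_per_epoch(60000, 1))      # -> 60000
# A small minibatch size, e.g. minibatches of 20 examples.
print(updates_per_epoch(60000, 20))     # -> 3000
# batch_size equal to the dataset size: batch gradient descent.
print(updates_per_epoch(60000, 60000))  # -> 1
```

Each "Monitoring step" in the output corresponds to one full pass over these minibatches, i.e. one epoch.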

Under the hood, Pylearn2 uses Theano to construct and train many of the models it supports. The first four lines of output, i.e. those related to begin_record_entry and accum, are related to this fact and can be disregarded for our purposes.

The rest of the output is related to Pylearn2's monitoring functionality. Since no channels, i.e. particular metrics or statistics about the training, have been specified, the rest of the output is rather sparse. There are no channels listed under the Monitored channels heading, and the only things listed under the Monitoring step headings are those common to all experiments (e.g. epochs seen, batches seen, and examples seen). The only other output is a summary of the time it took to train each epoch.

#### Conclusion

Pylearn2 makes specifying and training models easy and fast. This tutorial looked at the most basic of models. It did not discuss the myriad training and monitoring options provided by Pylearn2, nor did it show how to build more complicated models, like those with multiple layers as in multilayer perceptrons, or those with special connectivity patterns as in convolutional neural networks. My inclination is to continue in the next post by discussing the types of stopping criteria and how to use them. From there I would proceed to discuss the various training options and work my way towards more complicated models. However, I'm amenable to changing this order if there is something of particular interest, so let me know what you would like to see next.

## 20141013

### Harvard Librarians Advise Open Access Publishing

Excellent. The Harvard university librarians have written a letter to the Harvard faculty and staff encouraging them to start publishing in journals that make content free to the public, known as open access journals, as opposed to hiding it behind a paywall. I have been watching this debate for some time, as a number of the UofU CS professors have been arguing for exactly this change.

I quite like the policy at the Machine Learning Lab here in Montreal, which requires us to publish our articles on Arxiv.org, a database for freely publishing and accessing scholarly works. It's not without its challenges. For instance, you never know the quality of a given paper that you find on Arxiv until you have invested time in reading it. Many arguing for the open access model have been actively trying to devise strategies for such problems. Regardless, I believe it's preferable to not having access to a paper that should probably be cited.

From a grad student's perspective it is nice because I don't have to spend time submitting special requests for access to articles and then waiting to receive them. It could end up meaning that I have to pay to have my articles published, but I personally prefer this because I want my work available for others to hopefully build upon.

## 20140413

### Installing RL-Glue and ALE without Root Access

In 2012 Bellemare, Naddaf, Veness, and Bowling [1] introduced the Arcade Learning Environment (ALE) for developing and evaluating general AI methods. It interfaces with an Atari 2600 emulator called Stella. They motivate their use of the Atari 2600 by the fact that it permits access to hundreds of "game environments, each one different, interesting, and designed to be a challenge for human players." ALE also interfaces with RL-Glue, a collection of tools for developing and evaluating Reinforcement Learning agents.

RL-Glue is a two-part system: a language-agnostic part, referred to as the core, and a language-specific part, referred to as the codec. There are multiple codecs supporting development in C/C++, Java, Lisp, Matlab, and Python.

The following discusses installing RL-Glue and ALE. First I cover installing RL-Glue and then ALE.

### Installing RL-Glue

The instructions for installing the RL-Glue core and Python codec are drawn from the technical manuals for versions 3.04 and 2.0 and assume versions 3.04 and 2.02 respectively. However, looking over older versions of the manual it appears these instructions have not changed much, if at all, which means they may also work for future versions with the correct changes in the commands. I leave it as an exercise to the reader to determine what changes are needed. (Don't you love it when an author does that?)

Installing RL-Glue in the user space, as opposed to system-wide, requires compiling the core from source. To do so, execute the following commands.

#### RL-Glue Core

1. Download RL-Glue Core:

   $ cd ~ && wget http://rl-glue-ext.googlecode.com/files/rlglue-3.04.tar.gz

2. Unpack RL-Glue Core:

   $ tar -xvzf rlglue-3.04.tar.gz

3. Make a directory into which RL-Glue will be installed:

   $ mkdir ~/rlglue

4. Configure RL-Glue Core. At this point it is necessary to identify the location into which you wish to install the core, as you must pass it to the configure script. I made a directory called rlglue in my home directory, resulting in the following command:

   $ cd rlglue-3.04 && ./configure --prefix=<rlglue>

   where <rlglue> is the absolute path to the directory into which you want RL-Glue installed.

5. Build and install RL-Glue Core:

   $ make && make install

6. Update your PATH. Because the core has been installed in a non-standard location, it is necessary to inform the system of its location. This merely entails updating your PATH environment variable to include rlglue/bin, i.e.:

   $ export PATH=~/rlglue/bin:$PATH

   Note that you can make this change in your .bashrc to avoid doing it every time you open a new terminal.

At this point executing rl_glue should result in:

RL-Glue Version 3.04, Build 909
RL-Glue is listening for connections on port=4096

This indicates that RL-Glue is waiting for programs managing the agent, environment, and experiment to connect. For now you can type ctrl-c to exit the program.

#### Python Codec

1. Download the Python Codec:

   $ cd ~ && wget http://rl-glue-ext.googlecode.com/files/python-codec-2.02.tar.gz

2. Unpack the codec:

   $ tar -xvzf python-codec-2.02.tar.gz

3. Update your PYTHONPATH. This is another step that is only necessary because we're installing into a non-standard location:

   $ export PYTHONPATH=~/python-codec/src
Note that this is another command that can be placed into your .bashrc to avoid executing it every time you open a new terminal.
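To make these exports persistent, the snippet below appends them to your .bashrc, guarding each line with grep so that re-running it does not add duplicates. This is only a sketch; the paths assume the install locations chosen above (~/rlglue and ~/python-codec).

```shell
# Append the RL-Glue PATH and PYTHONPATH exports to ~/.bashrc exactly once.
BASHRC="$HOME/.bashrc"
touch "$BASHRC"
for line in \
    'export PATH=$HOME/rlglue/bin:$PATH' \
    'export PYTHONPATH=$HOME/python-codec/src'
do
    # -qxF: quiet, whole-line, fixed-string match (no regex surprises)
    grep -qxF "$line" "$BASHRC" || printf '%s\n' "$line" >> "$BASHRC"
done
```

The fixed-string match means the commands can safely be re-run after an upgrade or reinstall.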

### Installing ALE

Installing ALE takes a little more effort. This is due to the fact that ALE does not supply a configure script to build a makefile specific to your system. These instructions are for installing ALE 0.4.3 and may not extend well to newer versions, so your mileage may vary. As before, execute the following commands.

1. Download ALE:

   $ wget http://www.arcadelearningenvironment.org/wp-content/uploads/2014/01/ale_0.4.3.zip

2. Unpack ALE:

   $ unzip ale_0.4.3.zip

3. Select a makefile. ALE supports Linux, OSX, and Windows, and as such a makefile is supplied for each platform. Installing ALE requires making a makefile from one of these. I advise copying the one you want as opposed to renaming it:

   $ cd ale_0.4.3/ale_0_4/ && cp makefile.unix makefile

   Update 2014-04-15: Frédéric Bastien notes that this step can be avoided by supplying the name of the preferred makefile to make on the command line as follows:

   $ make -f makefile.unix

   Still be sure to change your working directory to ale_0.4.3/ale_0_4, as the following commands assume that context.

4. Enable RL-Glue support
RL-Glue support in ALE is disabled by default. To enable it edit the makefile and change the line:
USE_RLGLUE := 0
to
USE_RLGLUE := 1
It is also necessary to inform ALE where the RL-Glue headers are located. This can be done by changing the line:
INCLUDES := -Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment
to
INCLUDES :=-Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment -I<rlgluedir>/include
where <rlgluedir> indicates the directory into which rlglue was installed earlier.

Similarly it is necessary to inform ALE where the RL-Glue libraries are installed. This is done by changing the line:
LIBS_RLGLUE := -lrlutils -lrlgluenetdev
to
LIBS_RLGLUE := -L<rlgluedir>/lib -lrlutils -lrlgluenetdev
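The three makefile edits above can also be scripted with sed. The sketch below writes the relevant stock ALE 0.4.3 lines to a demo copy so the substitutions can be seen in isolation; on a real install you would run the same sed expressions against the makefile itself. RLGLUE_DIR is an assumption standing in for the --prefix used when configuring RL-Glue.

```shell
# RLGLUE_DIR is the directory RL-Glue was installed into (an assumption;
# substitute the --prefix you passed to RL-Glue's configure script).
RLGLUE_DIR="$HOME/rlglue"

# Write the three stock makefile lines to a demo file for illustration.
cat > makefile.demo <<'EOF'
USE_RLGLUE := 0
INCLUDES := -Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment
LIBS_RLGLUE := -lrlutils -lrlgluenetdev
EOF

# Enable RL-Glue, append the header path, and prepend the library path.
sed -i \
    -e 's/^USE_RLGLUE := 0/USE_RLGLUE := 1/' \
    -e "s|^INCLUDES := .*|& -I$RLGLUE_DIR/include|" \
    -e "s|^LIBS_RLGLUE :=|LIBS_RLGLUE := -L$RLGLUE_DIR/lib|" \
    makefile.demo

cat makefile.demo
```

Note that GNU sed's -i (in-place) flag behaves differently on OSX; on a Linux cluster like the one described here it should work as shown.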
Update 2014-04-15: Frédéric Bastien notes that one can override these variables on the command line. For example:
$ make USE_RLGLUE=1

sets USE_RLGLUE to 1. However, it is unclear how to append to variables via the command line, so this may only work for the USE_RLGLUE variable without restating the existing variable value.

5. Build ALE:

   $ make
6. Update LD_LIBRARY_PATH
Because the RL-Glue libraries have been installed in a non-standard location it is necessary to tell ALE where to find them. This is done using the LD_LIBRARY_PATH environment variable as follows:
$ export LD_LIBRARY_PATH=<rlgluedir>/lib:$LD_LIBRARY_PATH
Note that this is another command you can add to your .bashrc to avoid needing to execute the command every time you open a new terminal.

As before it is necessary to update your path to inform the system of the location of the ALE binaries since they are installed in a non-standard location.
$ export PATH=~/ale_0.4.3/ale_0_4:$PATH
Note that this is yet another command you will have to execute every time you open a terminal unless you add this command to your .bashrc.
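For convenience, here are all three environment changes from this post collected into one block suitable for a .bashrc. The paths are assumptions matching the install locations used above; adjust them if you installed elsewhere.

```shell
# RL-Glue and ALE binaries on PATH, the Python codec on PYTHONPATH,
# and the RL-Glue shared libraries on LD_LIBRARY_PATH.
export PATH="$HOME/rlglue/bin:$HOME/ale_0.4.3/ale_0_4:$PATH"
export PYTHONPATH="$HOME/python-codec/src"
export LD_LIBRARY_PATH="$HOME/rlglue/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The ${VAR:+:$VAR} idiom avoids a trailing colon when LD_LIBRARY_PATH was previously empty; an empty entry would otherwise be interpreted as the current directory.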

At this point you can execute ale which should result in:
A.L.E: Arcade Learning Environment (version 0.4)
Use -help for help screen.
Warning: couldn't load settings file: ./stellarc

Disregard any warnings; they simply say that ALE was unable to find your Stella configuration and that a game ROM was not specified. This is to be expected, since you did not specify the locations for them.

### Conclusion

This covers installing RL-Glue and ALE. I have not discussed anything beyond basic testing as such material can be found in the documentation for the tools.

In a future post, or possibly more than one, I will outline the processes for making and executing agents, environments, and experiments.

### References

[1] Bellemare, Marc G., et al. "The arcade learning environment: An evaluation platform for general agents." arXiv preprint arXiv:1207.4708 (2012).