## 20141221

### Monitoring Experiments in Pylearn2

In an earlier post I covered the basics of running experiments in Pylearn2. However, I covered only the bare minimum commands required, leaving out many details. One fundamental concept in running experiments in Pylearn2 is knowing how to monitor their progress, or "monitoring" for short.

In this tutorial we will look at two forms of monitoring: the basic form, which is always performed, and a new approach for real-time remote monitoring.

#### Basic Monitoring

We will build upon the bare-bones example from the previous tutorial, which means we will be using the MNIST dataset. Most datasets have two or three parts. At a minimum they have a part for training and a part for testing. If a dataset has a third part, its purpose is validation: measuring the performance of our learner without unduly biasing the learner towards the dataset.

Pylearn2 performs monitoring at the end of each epoch and it can monitor any combination of the parts of the dataset. When using Stochastic Gradient Descent (SGD) as the training algorithm one uses the monitoring_dataset parameter to specify which parts of the dataset are to be monitored. For example, if we are only interested in monitoring the training set we would add the following entry to the SGD parameter dictionary:

```
monitoring_dataset:
{
    'train': !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' }
}
```


This will instruct Pylearn2 to calculate statistics about the performance of our learner using the training part of the dataset at the end of each epoch. This will change the default output after each epoch from:

```
Monitoring step:
Epochs seen: 1
Batches seen: 1875
Examples seen: 60000
Time this epoch: 2.875921 seconds
```


to:

```
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.0499996989965
total_seconds_last_epoch: 0.0
train_objective: 2.29713964462
train_y_col_norms_max: 0.164925798774
train_y_col_norms_mean: 0.161361783743
train_y_col_norms_min: 0.158035755157
train_y_max_max_class: 0.118635632098
train_y_mean_max_class: 0.109155222774
train_y_min_max_class: 0.103917405009
train_y_misclass: 0.910533130169
train_y_nll: 2.29713964462
train_y_row_norms_max: 0.0255156457424
train_y_row_norms_mean: 0.018013747409
train_y_row_norms_min: 0.00823106430471
training_seconds_this_epoch: 0.0
Time this epoch: 2.823628 seconds
```


Each of the entries in the output (e.g. learning_rate, train_objective) is called a channel. Channels give one insight into what the learner is doing. The two most frequently used are train_objective and train_y_nll. The channel train_objective reports the cost being optimized by training while train_y_nll monitors the negative log likelihood under the current parameter values. In this particular example these two channels monitor the same quantity, but this will not always be the case.
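As a sanity check on these numbers: a freshly initialized softmax model predicts a roughly uniform distribution over the ten digit classes, so its negative log likelihood should start near ln(10) ≈ 2.303, which matches the train_y_nll of roughly 2.297 in the output above. A quick check in plain Python (illustrative only, not Pylearn2 code):

```python
import math

# A near-uniform prediction over 10 classes assigns probability ~0.1
# to the correct class for every example.
n_classes = 10
p_correct = 1.0 / n_classes

# Negative log likelihood of the correct class under that prediction.
nll = -math.log(p_correct)
print(nll)  # ~2.3026, close to the initial train_y_nll reported above
```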

Monitoring the train part of the dataset is useful for debugging purposes. However, it alone is not enough to evaluate the performance of our learner: performance on the training data will likely always improve, and at some point the learner begins to overfit the training data. In other words, it will find parameters that work well on the data used to train it but not on data it has not seen during training. To combat this we use a validation set. MNIST does not explicitly reserve a part of the data for validation, but it has become a de facto standard to use the last 10,000 samples of the train part. To specify this one uses the start and stop parameters when instantiating MNIST. If we were only monitoring the validation set, our monitoring_dataset parameter to SGD would be:

```
monitoring_dataset:
{
    'valid': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 50000,
        stop: 60000
    }
}
```


Note that the key to the dictionary, 'valid' in this case, is merely a label. It can be whatever we choose. The name of each channel monitored for the associated dataset is prefixed with this value.
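The start and stop parameters used above behave like ordinary Python slice bounds over the 60,000 training examples: indices 0 through 49,999 for training and 50,000 through 59,999 for validation. Conceptually (a plain-Python sketch, not Pylearn2 internals):

```python
# Stand-in for the 60,000 examples in the MNIST train set.
full_train = list(range(60000))

# start/stop act like slice bounds.
train = full_train[0:50000]      # start: 0,     stop: 50000
valid = full_train[50000:60000]  # start: 50000, stop: 60000

print(len(train), len(valid))  # 50000 10000
```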

It's also worth noting that we are not limited to monitoring just one part of the dataset. It is usually helpful to monitor both the train and validation parts of a data set. This is done as follows:
```
monitoring_dataset:
{
    'train': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    'valid': !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 50000,
        stop: 60000
    }
}
```


Note that here we use the start and stop parameters when loading both the train and valid parts to appropriately partition the dataset. We do not want the learner to validate on the data from the train dataset otherwise we will not be able to identify overfitting.

Putting it all together, our complete YAML now looks like:

```
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 0.05,
        monitoring_dataset:
        {
            'train': *train,
            'valid': !obj:pylearn2.datasets.mnist.MNIST {
                which_set: 'train',
                start: 50000,
                stop: 60000
            }
        }
    }
}
```

Note that here we have used a YAML trick to reference a previously instantiated object to save ourselves typing. Specifically, the dataset has been tagged "&train", and when specifying monitoring_dataset the reference "*train" is used to identify the previously instantiated object.

#### Live Monitoring

There are two problems with the basic monitoring mechanism in Pylearn2. First, the output is raw text, which can make it difficult to understand how the values of the various channels are evolving over time, especially when attempting to track multiple channels simultaneously. Second, due in part to the ability to add channels for monitoring, the amount of output after each epoch can, and frequently does, grow quickly. Combined, these problems make the basic monitoring mechanism difficult to use.

An alternative approach is to use a new mechanism called live monitoring. To be completely forthright the live monitoring mechanism is something that I developed to combat the aforementioned problems. Furthermore I am interested in feedback regarding its user interface and what additional functionality people would like. Please feel free to send an E-mail to the Pylearn2 users mailing list or leave a comment below with feedback.

The live monitoring mechanism has two parts. The first part is a training extension, i.e. an optional plug-in that modifies the way training is performed. The second part is a utility class that can query the training extension for data about channels being monitored.

Training extensions can be selected using the extensions parameter to the train object. In other words add the following to the parameters dictionary for the train object in any YAML:

```
extensions: [
    !obj:pylearn2.train_extensions.live_monitoring.LiveMonitoring {}
]
```

The full YAML would look like:

```
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        start: 0,
        stop: 50000
    },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 0.05,
        monitoring_dataset:
        {
            'train': *train,
            'valid': !obj:pylearn2.datasets.mnist.MNIST {
                which_set: 'train',
                start: 50000,
                stop: 60000
            }
        }
    },
    extensions: [
        !obj:pylearn2.train_extensions.live_monitoring.LiveMonitoring {}
    ]
}
```

The LiveMonitoring training extension listens for queries about the channels being monitored. To perform queries one need only instantiate LiveMonitor and use its methods to request data. Currently it has three methods:
• list_channels: Returns a list of channels being monitored.
• update_channels: Retrieves data about the list of specified channels.
• follow_channels: Plots the data for the specified channels. This command blocks other commands from being executed because it repeatedly requests the latest data for the specified channels and redraws the plot as new data arrives.
To instantiate LiveMonitor start ipython and execute the following commands:

```
from pylearn2.train_extensions.live_monitoring import LiveMonitor
lm = LiveMonitor()
```


Each of the methods listed above returns a different message object. The data of interest is contained in the data member of that object. As such, given an instance of LiveMonitor, one would view the channels being monitored as follows:

```
print lm.list_channels().data
```

Which, if we're running the experiment specified by the YAML above, will yield:

```
['train_objective', 'train_y_col_norms_max',
'train_y_row_norms_min', 'train_y_nll', 'train_y_col_norms_mean',
'train_y_max_max_class', 'train_y_min_max_class', 'train_y_row_norms_max',
'train_y_misclass', 'train_y_col_norms_min', 'train_y_row_norms_mean',
'train_y_mean_max_class', 'valid_objective', 'valid_y_col_norms_max',
'valid_y_row_norms_min', 'valid_y_nll', 'valid_y_col_norms_mean',
'valid_y_max_max_class', 'valid_y_min_max_class', 'valid_y_row_norms_max',
'valid_y_misclass', 'valid_y_col_norms_min', 'valid_y_row_norms_mean',
'valid_y_mean_max_class', 'learning_rate', 'training_seconds_this_epoch',
'total_seconds_last_epoch']
```

From this we can pick channels to plot using follow_channels:

```
lm.follow_channels(['train_objective', 'valid_objective'])
```


This command will display a graph like that in Figure 1 and continually update the plot at the end of each epoch.

Figure 1: Example output from the follow_channels method of the LiveMonitor utility object.

The live monitoring mechanism is network aware and by default it answers queries on port 5555 of any network interface on the computer wherein the experiment is being executed. It is not necessary for a user to know anything about networking to use live monitoring however. By default the live monitoring mechanism assumes the experiment of interest is being executed on the same computer as the LiveMonitor utility class. If that is not the case and one knows the IP address of the computer on which the experiment is running then one need only specify the address when instantiating LiveMonitor. The live monitoring mechanism will automatically take care of the networking.

Live monitoring is also very efficient. It only ever requests data it does not already have and the underlying networking utility waits for new data without taking unnecessary CPU time.

The live monitoring mechanism has many benefits including:
• The ability to filter the channels being monitored.
• The ability to plot data for any given set of channels being monitored.
• The ability to retrieve data from an experiment in real-time*
• The ability to query for data from an experiment running on a remote machine.
• The ability to change which channels are being followed or plotted without restarting an experiment.
* Updates only occur at the end of each epoch but this is real-time with respect to Pylearn2 experiments.

#### Conclusion

Monitoring the progress of experiments in Pylearn2 is as easy as setting up an experiment. Monitoring is also very flexible, offering output both as text directly in the terminal and graphically via a training extension.

## 20141021

### Thoughts Regarding the Michael Jordan Interview on IEEE Spectrum

For the few that may not have already seen it Dr. Michael Jordan was interviewed by IEEE Spectrum recently. He offers commentary on a number of topics including computer vision, deep learning, and big data.

Overall I found the article to be an interesting read, though it seems to offer little new over what he said in his AMA on Reddit.

Ultimately I find myself agreeing with his position on computer vision. Even given the major strides we have made as of late with convnets and the like, we are still far from having a system as capable as we are at vision tasks. After all, the state-of-the-art challenge is the classification of just 1,000 classes of objects in high resolution images. This is a hard problem, but it is something that we humans, and many other animals, do trivially.

I am a bit torn about his perspective on deep learning, notably because of the statement "it’s largely a rebranding of neural networks." I have encountered this idea a couple of times now, but I argue that it is not accurate. It is true that neural networks are a favored tool amongst those in the deep learning community and that the strides made in the DL community have come while using NNs. But as Bengio et al. note in their forthcoming text Deep Learning, it "involves learning multiple levels of representation, corresponding to different levels of abstraction." Neural networks have been shown to do this, but it has not been shown that they are required to perform such a task. On the flip side, they are outperforming the other methods that could be used.

Another point that stood out to me was his commentary on the singularity. I find myself waffling on this topic and his comments help highlight the reason. Specifically, he points out that discussions of the singularity are more philosophical in nature. I rather enjoy philosophy. I often say that if I had another life I would be a mathematician, but if I had another one beyond that I would be a philosopher. More so than I am now, anyway. I meet so many AI/ML people who think the singularity folks are just crackpots. And if we are being honest, there do seem to be more than a reasonable proportion of crackpots in the community. However, that does not prevent us from approaching the topic with sound and valid argumentation. We just have to be prepared to encounter those who cannot or choose not to.

Edit 2014-10-23: It appears Dr. Jordan was a bit displeased with the IEEE Spectrum interview, as he explains in Big Data, Hype, the Media and Other Provocative Words to Put in a Title. The long and short of it appears to be that he believes his perspective was intentionally distorted, for the reason that many of my colleagues have been discussing: the title, and arguably the intro, imply much stronger claims than his subsequent comments in the article support. As such he felt the need to clarify his perspectives.

On the one hand I thought that a careful, critical read of the interview allowed one to pick out his perspective fairly well. But in reading his response there appear to be some things that came across just plain wrong. For instance, his opinion about whether we should be collecting and exploring these large data sets. In the interview he makes the great point that we must be cognizant of bad correlations that can and will likely arise. But in that context I did get the impression that he was arguing against doing it at all, i.e. collecting and analyzing such data sets, whereas in his response he argues that doing it can be a good thing because it can contribute to the development of principles that are currently missing.

As a side note, I find it interesting that he did not link to the interview directly but instead merely gave its address. As if to say, let's not lend any more credibility to the article than is absolutely necessary.

## 20141018

### A First Experiment with Pylearn2

Vincent Dumoulin recently wrote a great blog post titled Your models in Pylearn2 that shows how to quickly implement a new model idea in Pylearn2. However, Pylearn2 already has a fair number of models implemented. This post is meant to complement his by explaining how to set up and run a basic experiment using existing components in Pylearn2.

In this tutorial we will train a very simple single layer softmax regression model on MNIST, a database of handwritten digits. Softmax is a generalization of a binary predictor called logistic regression to the prediction of one of many classes. The task will be to identify which digit was written, i.e. classify the image into the classes 0-9.
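To make the generalization concrete: logistic regression squashes one score through a sigmoid to get the probability of a single class, while softmax turns a vector of per-class scores into a probability distribution over many classes. A small sketch in plain Python (illustrative, not the Pylearn2 implementation):

```python
import math

def softmax(scores):
    """Map a list of per-class scores to a probability distribution."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(sum(probs))               # ~1.0
print(probs.index(max(probs)))  # 0, the class with the highest score
```

With exactly two classes this reduces to logistic regression: softmax([a, b])[0] equals the sigmoid of a - b.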

This same task is addressed in the Softmax regression Pylearn2 tutorial, and this post will borrow from it. However, Pylearn2 is feature rich, allowing one to control everything from which model to train and which dataset to train it on, to fine-grained control over the training, to the ability to monitor and save statistics about an experiment. For the sake of simplicity and understanding we will not be using most of these features, and as such this tutorial will be simpler.

#### YAML Syntax

A main goal of Pylearn2 is to make managing experiments quick and easy. To that end a basic experiment can be executed by writing a description of the experiment in YAML (YAML Ain't Markup Language) and running the train script (pylearn2/scripts/train.py) on it.

YAML is intended to be very sparse compared to other markup languages such as XML. A rundown of features useful with Pylearn2 can be found in the document YAML for Pylearn2, and the full specification can be found on yaml.org in case you need to do something particularly out of the ordinary, like defining a tuple.

A Pylearn2 YAML configuration file identifies the object that will actually perform the training and the parameters it takes. I believe there is only one type of training object at the moment so it's kind of redundant but it allows for easy incorporation of special training procedures. The existing training object takes a specification of the model to be trained, the dataset on which the model should be trained, and the object representing the algorithm that will actually perform the training.

Basic YAML syntax is extremely straightforward, and the only special syntax really needed for the simplest of experiments is the !obj: tag. This is a Pylearn2 custom tag that instructs Pylearn2 to instantiate a Python object as specified immediately following the tag. For example the statement:

```
!obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' }
```

results in the instantiation of the MNIST dataset class, found amongst the various Pylearn2 datasets in pylearn2.datasets in the file mnist.py. The Python dictionary that follows supplies the value 'train' for a parameter called which_set, which identifies the portion (e.g. training or test) of the dataset that should be loaded.

Note that the quotes around the value 'train' are required as they indicate that the value is a string, which is the required data type for the which_set parameter.

It's important to note that any parameters required for the instantiation of a class must be provided in the associated dictionary. Check the Pylearn2 documentation for the class you need to instantiate to understand the available parameters and specifically which are required for the task you are attempting to perform.
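Conceptually, the !obj: tag boils down to three steps: split the dotted path into a module and a class name, import the module, and call the class with the dictionary entries as keyword arguments. A minimal sketch of that idea using only the standard library, with datetime.date standing in for a Pylearn2 class (this illustrates the concept; it is not Pylearn2's actual loader):

```python
import importlib

def obj(dotted_path, params):
    """Instantiate `package.module.Class` from a dotted path and a
    dictionary of keyword arguments, in the spirit of the !obj: tag."""
    module_path, class_name = dotted_path.rsplit('.', 1)
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    return cls(**params)

# Analogous to !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' },
# but with a standard-library class so the sketch is self-contained.
d = obj('datetime.date', {'year': 2014, 'month': 10, 'day': 18})
print(d)  # 2014-10-18
```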

#### Defining an Experiment

To define an experiment we need to define a train object and provide it a dataset object, a model object, and an algorithm object via its parameters dictionary.

We have already seen how to instantiate the MNIST dataset class so lets look next at the algorithm. The Pylearn2 algorithm classes are found in the training_algorithms sub-directory. In this example we are going to use stochastic gradient descent (SGD) because it is arguably the most commonly used algorithm for training neural networks. It requires only one parameter, namely learning_rate, and is instantiated as follows:
```
!obj:pylearn2.training_algorithms.sgd.SGD { learning_rate: 0.05 }
```
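For intuition about what SGD does with that learning_rate: each update nudges the parameters a small step against the gradient of the cost. A toy sketch minimizing the one-dimensional cost f(w) = (w - 3)^2 in plain Python (illustrative only, not Pylearn2 code):

```python
learning_rate = 0.05

def grad(w):
    """Gradient of the toy cost f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

w = 0.0
for _ in range(200):
    w -= learning_rate * grad(w)  # the gradient descent update rule

print(round(w, 4))  # 3.0, the minimizer of the toy cost
```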
The final thing we need to do before we can put it all together is to define a model. The Pylearn2 model classes are located in the models sub-directory. The class we want is called SoftmaxRegression and is found in softmax_regression.py. In its most basic form we only need to supply four parameters:
• nvis: the number of visible units in the network, i.e. the dimensionality of the input.
• n_classes: the number of output units in the network, i.e. the number of classes to be learned.
• irange: the range from which the initial weights should be randomly selected. This is a symmetric range about zero and as such it is only necessary to supply the upper bound.
• batch_size: the number of samples to be used simultaneously during training. Setting this to 1 results in pure stochastic gradient descent whereas setting it to the size of the training set effectively results in batch gradient descent. Any value in between yields stochastic gradient descent with mini-batches of the size specified.
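One practical consequence of batch_size is the number of parameter updates performed per epoch: the number of training examples divided by the batch size. For the 60,000-example MNIST training set:

```python
n_examples = 60000  # size of the MNIST training set

for batch_size in (1, 20, 60000):
    updates_per_epoch = n_examples // batch_size
    print(batch_size, updates_per_epoch)
# 1     -> 60000 updates per epoch: pure stochastic gradient descent
# 20    -> 3000 updates per epoch:  mini-batch SGD, as in this tutorial
# 60000 -> 1 update per epoch:      batch gradient descent
```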

Using what we know, we can now construct the train object and in effect the full YAML file as follows:
```
!obj:pylearn2.train.Train {
    dataset: !obj:pylearn2.datasets.mnist.MNIST { which_set: 'train' },
    model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {
        batch_size: 20,
        n_classes: 10,
        nvis: 784,
        irange: 0.01
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD { learning_rate: 0.05 }
}
```
Note that a Pylearn2 YAML file can contain definitions for multiple experiments simultaneously. Simply stack them one after the other and they will be executed in order from top to bottom in the file.

#### Executing an Experiment

The final step is to run the experiment. Assuming the scripts sub-directory is in your path we simply call train.py and supply the YAML file created above. Assuming that file is called basic_example.yaml and your current working directory contains it the command would be:
```
train.py basic_example.yaml
```
Pylearn2 will load the YAML, instantiate the specified objects and run the training algorithm on the model using the specified dataset. An example of the output from this YAML looks like:
```
dustin@Cortex ~/pylearn2_tutorials $ train.py basic_example.yaml
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.013530 seconds
Monitored channels:
Compiling accum...
Compiling accum done. Time elapsed: 0.000070 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
Time this epoch: 0:02:18.271934
Monitoring step:
Epochs seen: 1
Batches seen: 1875
Examples seen: 60000
Time this epoch: 0:02:18.341147
...
```

Note we have not told the training algorithm under what criteria it should stop, so it will run forever!

Under the hood Pylearn2 uses Theano to construct and train many of the models it supports. The first four lines of output, i.e. those related to begin_record_entry and accum, are related to this fact and can be disregarded for our purposes. The rest of the output is related to Pylearn2's monitoring functionality. Since no channels, particular metrics or statistics about the training, have been specified, the rest of the output is rather sparse. There are no channels listed under the Monitored channels heading, and the only things listed under the Monitoring step headings are those common to all experiments (e.g. epochs seen, batches seen, and examples seen). The only other output is a summary of the time it took to train each epoch.

#### Conclusion

Pylearn2 makes specifying and training models easy and fast. This tutorial looked at the most basic of models. However, it does not discuss the myriad training and monitoring options provided by Pylearn2. Nor does it show how to build more complicated models like those with multiple layers, as in multilayer perceptrons, or those with special connectivity patterns, as in convolutional neural networks. My inclination is to continue in the next post by discussing the types of stopping criteria and how to use them. From there I would proceed to discussing the various training options and work my way towards more complicated models.
However, I'm amenable to the idea of changing this order if there is something of particular interest, so let me know what you would like to see next.

## 20141013

### Harvard Librarians Advise Open Access Publishing

Excellent. The Harvard university librarians have written a letter to the Harvard faculty and staff encouraging them to start publishing in journals that make content free to the public, known as open access journals, as opposed to hidden behind a pay wall.

I have been watching this debate for some time as a number of the UofU CS professors have been arguing for exactly this change. I quite like the policy at the Machine Learning Lab here in Montreal, which requires us to publish our articles on Arxiv.org, a database for freely publishing and accessing scholarly works.

It's not without its challenges. For instance, you never know the quality of a given paper that you find on Arxiv until you have invested time in reading it. Many arguing for the open access model have been actively trying to devise strategies for such problems. Regardless, I believe it's preferable to not having access to a paper that should probably be cited.

From a grad student's perspective it is nice because I don't have to spend time submitting special requests for access to articles and then waiting to receive them. It could end up meaning that I have to pay to have my articles published, but I personally prefer this because I want my work available for others to hopefully build upon.

## 20140413

### Installing RL-Glue and ALE without Root Access

In 2012 Bellemare, Naddaf, Veness, and Bowling [1] introduced the Arcade Learning Environment (ALE) for developing and evaluating general AI methods. It interfaces with an Atari 2600 simulator called Stella. They motivate their use of the Atari 2600 because it permits access to hundreds of "game environments, each one different, interesting, and designed to be a challenge for human players."
ALE also interfaces with RL-Glue, a collection of tools for developing and evaluating Reinforcement Learning agents. RL-Glue is a two-part system: a language-agnostic part, referred to as the core, and a language-specific part, referred to as the codec. There are multiple codecs supporting development in C/C++, Java, Lisp, Matlab, and Python.

The following discusses installing RL-Glue and ALE. First I cover installing RL-Glue and then ALE.

### Installing RL-Glue

The instructions for installing the RL-Glue core and Python codec are drawn from the technical manuals for versions 3.04 and 2.0 and assume versions 3.04 and 2.02 respectively. However, looking over older versions of the manual it appears these instructions have not changed much, if at all, which means they may also work for future versions with the appropriate changes to the commands. I leave it as an exercise to the reader to determine what changes are needed. (Don't you love it when an author does that?)

Installing RL-Glue in the user space, as opposed to system wide, requires compiling the core code from source. To do so, execute the following commands.

#### RL-Glue Core

1. Download RL-Glue Core

```
$ cd ~ && wget http://rl-glue-ext.googlecode.com/files/rlglue-3.04.tar.gz
```
2. Unpack RL-Glue Core
```
$ tar -xvzf rlglue-3.04.tar.gz
```

3. Make a directory into which RL-Glue will be installed

```
$ mkdir ~/rlglue
```
4. Configure RL-Glue Core
At this point it is necessary to identify the location wherein you wish to install the core as you must tell this to the configure script. I made a directory called rlglue in my home directory resulting in the following command:
```
$ cd rlglue-3.04 && ./configure --prefix=<rlglue>
```

where `<rlglue>` is the absolute path to the directory into which you want RL-Glue installed.

5. Build and install RL-Glue Core

```
$ make && make install
```
Because the core has been installed in a non-standard location it is necessary to inform the system of its location. This merely entails updating your PATH environment variable to include rlglue/bin, i.e.:
```
$ export PATH=~/rlglue/bin:$PATH
```
Note that you can make this change in your .bashrc to avoid doing it every time you open a new terminal.

At this point executing rl_glue should result in:
```
RL-Glue Version 3.04, Build 909
RL-Glue is listening for connections on port=4096
```
This indicates that RL-Glue is waiting for programs managing the agent, environment, and experiment to connect. For now you can type ctrl-c to exit the program.

#### Python Codec

1. Download the Python codec

```
$ cd ~ && wget http://rl-glue-ext.googlecode.com/files/python-codec-2.02.tar.gz
```

2. Unpack the codec

```
$ tar -xvzf python-codec-2.02.tar.gz
```
This is another step that is only necessary because we're installing into a non-standard location.
```
$ export PYTHONPATH=~/python-codec/src
```

Note that this is another command that can be placed into your .bashrc to avoid executing it every time you open a new terminal.

### Installing ALE

Installing ALE takes a little more effort. This is due to the fact that ALE does not supply a configure script to build a makefile specific to your system. These instructions are for installing ALE 0.4.3 and may not extend well to newer versions, so your mileage may vary. As before, execute the following commands.

1. Download ALE

```
$ wget http://www.arcadelearningenvironment.org/wp-content/uploads/2014/01/ale_0.4.3.zip
```
2. Unpack ALE
```
$ unzip ale_0.4.3.zip
```

3. Select Makefile

ALE supports Linux, OSX, and Windows and as such a makefile is supplied for each platform. Installing ALE requires making a makefile from one of these. I advise copying the one you want as opposed to renaming it:

```
$ cd ale_0.4.3/ale_0_4/ && cp makefile.unix makefile
```
Update 2014-04-15: Frédéric Bastien notes that this step can be avoided by supplying the name of the preferred makefile to make on the command line as follows:
```
$ make -f makefile.unix
```

Still be sure to change your working directory to ale_0.4.3/ale_0_4 as the following commands assume that context.

4. Enable RL-Glue support

RL-Glue support in ALE is disabled by default. To enable it, edit the makefile and change the line:

```
USE_RLGLUE := 0
```

to

```
USE_RLGLUE := 1
```

It is also necessary to inform ALE where the RL-Glue headers are located. This can be done by changing the line:

```
INCLUDES := -Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment
```

to

```
INCLUDES := -Isrc/controllers -Isrc/os_dependent -I/usr/include -Isrc/environment -I<rlgluedir>/include
```

where `<rlgluedir>` indicates the directory into which RL-Glue was installed earlier. Similarly it is necessary to inform ALE where the RL-Glue libraries are installed. This is done by changing the line:

```
LIBS_RLGLUE := -lrlutils -lrlgluenetdev
```

to

```
LIBS_RLGLUE := -L<rlgluedir>/lib -lrlutils -lrlgluenetdev
```

Update 2014-04-15: Frédéric Bastien notes that one can override these variables on the command line. For example:

```
$ make USE_RLGLUE=1
```
sets USE_RLGLUE to 1. However it is unclear how to append to variables via the command line so this may only work for the USE_RLGLUE variable without restating the existing variable value.

5. Build ALE
```
$ make
```

6. Update LD_LIBRARY_PATH

Because the RL-Glue libraries have been installed in a non-standard location it is necessary to tell ALE where to find them. This is done using the LD_LIBRARY_PATH environment variable as follows:

```
$ export LD_LIBRARY_PATH=<rlgluedir>/lib:$LD_LIBRARY_PATH
```

Note that this is another command you can add to your .bashrc to avoid needing to execute it every time you open a new terminal.

7. Update your PATH

As before, it is necessary to update your path to inform the system of the location of the ALE binaries since they are installed in a non-standard location:

```
$ export PATH=~/ale_0.4.3/ale_0_4:$PATH
```

Note that this is yet another command you will have to execute every time you open a terminal unless you add it to your .bashrc.

At this point you can execute ale, which should result in:

```
A.L.E: Arcade Learning Environment (version 0.4)
[Powered by Stella]
Use -help for help screen.
Warning: couldn't load settings file: ./stellarc
No ROM File specified or the ROM file was not found.
```

Disregard the warnings; they are simply saying that ALE was unable to find your Stella configuration and that a game ROM was not specified. This is to be expected since you did not specify locations for them.

### Conclusion

This covers installing RL-Glue and ALE. I have not discussed anything beyond basic testing as such material can be found in the documentation for the tools. In a future post, or possibly more than one, I will outline the process for making and executing agents, environments, and experiments.

### References

[1] Bellemare, Marc G., et al. "The arcade learning environment: An evaluation platform for general agents." arXiv preprint arXiv:1207.4708 (2012).

## 20140404

### What's Wrong with the Turing Test?

If I could answer one question through the course of my research it would be "what is intelligence?" This question like no other drives my studies. I wrote this post a while ago but did not post it.
I did not post it because I intended to refine it. But the reality is I will always be refining my thoughts on this topic. Tonight I went out to the pub with several of my colleagues at Université de Montréal and this topic came up, reminding me that I need to just put this out there. As such I am posting it now, with some small changes. I look forward to your responses.

The question "what is intelligence?" is non-trivial. We have been seeking an answer for millennia. While many definitions have been offered [1], no single definition has ever really dominated. Even in the sixty or so years that we have been seriously studying how to create an artificial intelligence we have never formally defined intelligence. Ask 100 people to define it and you will likely receive 100 different definitions [1], [2].

The de facto standard is the Turing Test developed by Alan Turing [3]. History tells us that he was worried about getting mired in a long, drawn-out philosophical debate which would likely prevent any progress on actually creating an artificially intelligent being. The Turing Test, as it has come to be known, is a variation on a game known as the imitation game [2], wherein a participant, the interrogator, attempts to identify which of the other two participants, one male and one female, is in fact male. Of course the female's objective is to fool the interrogator. The crux of the game is that the decision must be made solely from communication patterns. In other words, the interrogator does not meet and cannot see the other participants. Additionally, to avoid cues from writing styles, messages are passed between them in an anonymous way, such as through a computer terminal. In the Turing Test the objective is to identify the artificial being [4], the computer, as opposed to the male.
The hypothesis is that if the interrogator cannot differentiate the artificial being from the human, the artificial being must necessarily be intelligent, provided we accept that humans are intelligent. This test is clever because it requires neither a definition of intelligence nor a measure of intelligence, only agreement that humans are intelligent. It is, however, a behavioral test.

Not everyone accepts the test. One of the more well-known opponents is John Searle, a professor of philosophy at the University of California, Berkeley. Dr. Searle offers the Chinese Room argument as a counter to the Turing Test. According to the Chinese Room argument we can construct a process that appears intelligent but in fact is not. We do so by placing a person in a room, specifically one who does not speak Chinese. The resident receives messages from outside the room written in Chinese. The resident must then consult a set of instructions that, when followed, dictate the appropriate response, one that, to any outside observer, would have to have been written by someone who knows Chinese. Since the resident does not know Chinese, it supposedly follows that the process has only imitated intelligence.

There are a number of counter-arguments to the Chinese Room argument. Some argue that the analogy breaks down because a human performing the processing would simply take too long. Others argue that the room itself is intelligent. But I digress.

While I don't personally accept the Chinese Room argument, I do agree there is a flaw in the Turing Test. Specifically, by the nature of its construction, it permits us to classify a being as intelligent only if it behaves like a human. From this we have to conclude that everything else is either not intelligent or at least not classifiable as intelligent without some added criterion. This applies not only to animals but to all other beings.
Consider the scenario wherein we are visited by aliens that can talk to us, that can do incredibly complicated mathematics, even teach us a few things, and possess technologies far beyond our understanding, such as a clearly engineered means of interstellar travel which they used to come to Earth. Would we consider these beings intelligent? We can all think of scenarios wherein the answer is "not necessarily," but in all likelihood we would agree that they are in fact intelligent. But how likely is it that the Turing Test would apply?

Of course this problem applies to artificial beings as well, i.e. our computer programs. Have we already created an artificial intelligence? Some might argue we have, with Cleverbot garnering 59.3% positive votes from 1,334 participants at the Techniche 2011 festival. Others would likely respond that the real Turing Test involves physical interaction, i.e. shaking hands with the being and still not being able to discern a difference. This again highlights the problem.

A precise definition of intelligence would address this problem. Not only would it allow us to differentiate between intelligent and not, and answer the question of whether we have already created an artificial intelligence, it could also allow for the development of metrics for comparing intelligences and even help us understand why our existing creations are not intelligent, if that is truly the case.

[1] Legg & Hutter, 2006, A Collection of Definitions of Intelligence, http://www.vetta.org/documents/A-Collection-of-Definitions-of-Intelligence.pdf
[2] Pfeifer, 1999, Understanding Intelligence
[3] Turing, 1950, Computing Machinery and Intelligence, http://www.csee.umbc.edu/courses/471/papers/turing.pdf
[4] Russell and Norvig, 2009, Artificial Intelligence: A Modern Approach, Third Edition

## 20140308

### Day One in Canada

I crossed the US-Canada border today. The process was fast and simple enough.
However many of the "facts" on my visa were wrong and the gentleman at the border had to fix them. Evidently my last name was "Dustin James Webb" and I had no first or middle names, I was listed as a mechanical engineer not a computer scientist, and I was destined for Calgary, Alberta, not Montreal, Quebec. Oh, and I came to Canada in 2002, not today. I guess all the paperwork we submitted to the NY consulate was for show, as was the processing time for our visas. And no, my identity was not stolen. I won't go into the details on how I know this though.

Shortly after leaving the border I stopped for food. Ironically the first place I saw was a Tim Hortons. Naturally I had soup and a coffee to combat the cold. Even given all the warnings from friends and family about just how cold it is here I was still not prepared. It is insanely cold. And yes, I consider 30-40 degrees plus a large windchill factor to be insanely cold. I don't look forward to sub-zero temperatures.

Unfortunately I lost internet access on my phone at the border. Apparently T-Mobile does not offer it to their US customers through their local partner. Worse yet I have had horrible cell reception. Google Maps cached the info to get from the border to Toronto. But that took me downtown and I was unable to get directions anywhere else at that point. As such I set out to find a gas station and a hotel, which are both very difficult tasks without a reliable means of communication. This would not have been a problem if I had actually let my wife book my hotel when she had intended. As it is I am paying nearly twice what I should be for a hotel tonight. C'est la vie!

I also had fun trying to fill my tank. For reasons I still don't understand the gas pumps at the station I found would not accept our credit card. So I went into the store to pay in advance. The attendant asked how much I wanted, so I asked the price. He said it was 1.50 CAD! I was astounded. I have been paying about $3.50 on average throughout this trip.
But I shrugged it off and asked for $12 because our Prius only has an 8-gallon tank. It turns out they measure gas in litres here. Duh! Place this one in the category "should have seen that coming." Oh well, my tank is currently half full, which is better than the gallon or so I had when I found the station.

Tomorrow, Montreal!

## 20140116

### Making Learning Fun with Electronics and Robotics

In my last two posts I discussed my thoughts on making learning fun. In the first post the vehicle was games, while in the second post the vehicle was scientific experimentation. Electronics and robotics seem to me to be another possible vehicle. While I have a fair bit of experience with these topics I have not done much with them as an educational tool.

I'm most excited about using the Lego Mindstorms, Lego's robotics kit. While it is recommended for kids ages 10 and up, it actually allows for construction of real autonomous robots. It comes with several sensors for measuring features of the environment such as light intensity, color, distance to the nearest object, and audio, as well as the rotations of its own motors. It introduces children to programming using a visual programming language called NXT-G. Think programming via Legos. And of course it's compatible with all things Lego.

I look forward to using the Mindstorms to teach everything from mechanics and programming to how to use sensors and basic motion planning techniques.

My motivation for using the Mindstorms as an educational tool comes from my experience volunteering for First Lego League (FLL). FLL is a program for children between 9 and 15 years of age that promotes science and technology. Each year the kids are given a topic about which they must learn. To aid in the learning process they are given a large game board with lots of challenges built out of Lego pieces. The kids must then build one or more robots to solve the challenges. FLL is part of a larger program called FIRST, but is my favorite because the robots the kids make are truly autonomous. For several years now I have volunteered as a robot design judge, and through that effort I have seen just how excited the kids can get about learning and solving real world problems.

My children are not yet old enough to participate in FLL. However my son and I have started a Jr. First Lego League (Jr.FLL) team. Jr.FLL is like FLL but aims to teach the basics of design, mechanics, research, and team building. In fact he and his team will be showing off what they have learned about natural disasters on January 25th (2014) at the University of Utah Student Union building. Please feel free to come talk to them. I must warn you though, the FLL finals will be going on at the same time so the place will be an absolute madhouse.

An alternative to the Mindstorms is Bo & Yana by Play-i. They are meant to teach the basics of programming, sensing, and actuation. We have not received ours yet but they look promising, and they are even compatible with other systems like the Mindstorms, which should allow for a simple transition when the time is right.

Still another alternative is the set of solutions from Modular Robotics: Cubelets and their latest product, MOSS. These are just cubes with basic sensing and actuation capabilities that snap together using magnets, but in connecting them one is making simple robots. Unfortunately the MOSS Kickstarter came and went before I could get involved. If anyone has these, MOSS specifically, I would be interested in hearing about your experience with them.

Of course, to build a robot it helps to know something about electronics. One need not be an expert by any means, but it helps to be able to build simple circuits. A few years ago I encountered a method for teaching children about circuits called Squishy Circuits. The foundation of this idea is to use "playdough", i.e. modeling compound, to make circuits. It turns out that a modeling compound made with a salt base conducts electricity, while one made with a sugar base does not. As such you can make small sculptures from the two different types. Circuits can then be formed by connecting conductive portions of the sculpture with discrete components like a battery pack and LEDs.

There are a lot of other solutions out there for teaching children about electronics and circuits, but I have no experience with any of them. Some of the ones I have found are:

Again, if you have any experience with these I would like to hear your thoughts.

## 20140108

### Making Learning Fun with Experiments

In my last post I talked about how to take advantage of games to make learning fun. By no means is this an original idea; in fact it's rather obvious. Another approach, probably equally obvious, is to do experiments.

There are numerous sites describing fun experiments. One that everyone seems to know is the classic baking-soda-and-vinegar volcano. For anyone who may be unfamiliar: you take something like dirt or clay and sculpt a volcanic cone with an extra-deep caldera. From there you pour some baking soda into the caldera. Finally you pour in some vinegar. The baking soda and vinegar react and discharge carbon dioxide in the form of bubbles that are heavier than air, so, when done correctly, the bubbles overflow the volcano and spill down the sides much like a lava flow from a real volcano would.

This is a great experiment because it opens the door to talking about all kinds of things: the structure of the earth, how volcanoes form, how volcanoes can lead to other natural disasters like earthquakes and tsunamis, and how pyroclastic flows can in a way preserve whatever they hit. For us the volcano devolved into just mixing baking soda and vinegar. After all, who doesn't like watching a vigorous chemical reaction? Even this is great though, as it opens the door for talking about things like pressure, surface tension, and chemistry. In a similar vein, the Coke and Mentos experiment is always fun to do. Last I read, the reaction wasn't well understood, but it's clear that the process releases a lot of gas in short order.

Another experiment we have had fun with is the basic electromagnet. Again, for those who are unfamiliar: you wrap a length of wire around something like an iron bolt and connect the two ends of the wire to the opposite ends of a battery. It is important that the object around which you wrap the wire, usually called the core, is ferrous. The movement of the electrons through the wire produces a magnetic field which is amplified by the core. From there you can use whatever other magnetic objects you have lying around to show the formation and destruction of the magnetic field as you connect and disconnect the battery.

It is not necessary for the experiment to seem obviously fun though. Take for instance testing soil types. In this experiment you gather a couple of soil samples from different places; potting soil and dirt from outside are great. You put the potting soil in one jar, the dirt in another, and a mix of both into yet a third. Then fill the jars with water, cap them, and shake. What do you get? Mud! Of course the educational part comes from the discussion that ensues as everything settles, and from talking about how long it takes to settle. My son and I played with this experiment for a couple of days.

As I alluded to earlier, the list of possible experiments is endless. They often lead to the same discussions, but that doesn't make them any less fun. To close out this post I will leave you with some links to other particularly fun experiments:
• Rock Candy: This one takes a bit longer but you get a treat in the end.
• Fluorescent Jello: Not so tasty, but it glows in the dark!
• Squishy Circuits: This is great for introducing children to both chemistry and electronics.

## 20140101

### Making Learning Fun with Games

In the last few days I have had a couple of conversations with friends about teaching children. These conversations have inspired me to write a bit about my thoughts on the topic. My experience basically comes from teaching my own son. I strongly believe in the need for parents to supplement their child's education, to the point that I try to work with my kids a little bit every day.

As anyone with children will likely tell you, it can be difficult at times to maintain their interest. Even if you are incredibly passionate about a topic it can be challenging to imbue them with that passion. One thing I have noticed while working with my son is that he gets excited when the solutions come easily. Conversely, failure to immediately understand quickly leads to disinterest. This would seem to imply a need for instant gratification. For that reason I often seek ways of removing or replacing that need.

One obvious but still great method to address this problem while still providing a lesson is through games. Any gamer will tell you as much, not just because they are attempting to justify their pastime, but because games provide myriad lessons in the guise of entertainment. Two of my favorite games are DragonBox and LightBot.

DragonBox teaches the principles of algebra without focusing on the mathematical foundations. It simply challenges the child with a puzzle that involves isolating an object from a set of others using the rules of algebra. Early in the game it doesn't even use numbers, just pictures, so that the child may focus on the rules. Unfortunately there isn't enough content. My son has beaten this game numerous times and has basically lost interest.

LightBot teaches the basics of programming. The objective is to get a robot to turn on lights placed throughout an environment. The crux is that the series of commands needed to execute the task must be provided before the robot ever does anything. As with all games it starts off simply and increases in complexity. In this case the complexity comes from restricting the number of commands the player can use, requiring the use of subprocedures, requiring the application of recursion, and the like.

Another game we have been playing is from MindSnacks. Specifically we've been studying French to help our son prepare for entry into the Montreal educational system, where at least a third of the class is taught strictly in French. In total it has nine subgames but only permits the child access to two at the beginning; the child must gain levels to unlock the others. It also focuses the child on certain aspects of the topic of study. In the case of French, and probably the other languages it supports, these include numbers, colors, days of the week, and greetings, for a total of 50 different topics.

Of course games don't have to be deemed educational to in fact be educational. One of my son's favorite games is Minecraft. He prefers creative mode and will play for hours constructing little houses and zoos for all the animals he hatches. I don't particularly care for the game myself, but it has a lot of great educational aspects to it. For instance it is great for talking about geometry, from the different types of shapes we study to the difference between 1D, 2D, and 3D. Because it is in part focused on crafting, it also offers a segue into talking about how real things are made.

Another "non-educational" game that I like is StarMade. This one is inspired by Minecraft but takes place in space. The objective is to make a spacecraft and fly it around collecting materials and fighting space pirates. I particularly like this one because it is more challenging than Minecraft, but my son finds it interesting enough to work through the challenges. For instance, he would prefer to play and have to practice his reading to accomplish his goal than not. This is significant because he has not yet found that reading for the sake of reading is fun. It also offers lessons beyond those found in Minecraft. For instance, building a spacecraft requires an understanding of the different parts, from the different computers required (e.g. control computer, weapons computers) to engines and shielding.

Games are not the only method for making learning fun. I also like to use experimentation to bring lessons to life, but I'll talk more about this in my next entry. I have also been looking for a way to introduce my son to real world electronics and robotics. As part of this we have created a Jr. FLL team, but this is limited to simple machines and designing solutions. There are a lot of products being made that go beyond this, which I will also discuss later.