What is this?

This distribution provides a set of modules, and a single client, gx,
to implement GeneXplorer, which is a web interface for browsing
hierarchically clustered data.

The latest version of this module should always be available from:

http://search.cpan.org/dist/Microarray-GeneXplorer/

An example of this software in action is available at:

http://microarray-pubs.stanford.edu/cgi-bin/gx?n=prostate1&rx=5

PREREQUISITES

GD.pm

GeneXplorer relies on the GD module (http://search.cpan.org/dist/GD/)
for image creation.  If the GD module support png image creation
(version > 1.19), then it will create png images, otherwise gif images
will be created.

INSTALLATION

Firstly, read all the installation instructions before you begin!

To install this software, you will need some limited Unix experience,
and some limited Perl experience.  If in doubt, consult your local
Perl guru or systems administrator.  You also need access to a Unix
machine, such as MacOSX, or a Windows machine running Cygwin, a Linux
box, a Sun machine etc..

To install the software do the following after downloading the
distribution:

First decompress the gzipped tarfile, e.g:

   tar zxvf Microarray-GeneXplorer-0.1.tar.gz

or:

   gunzip Microarray-GeneXplorer-0.1.tar.gz
   tar xvf Microarray-GeneXplorer-0.1.tar

Then change into the created directory:

   cd Microarray-GeneXplorer-0.1


To install the actual Perl modules themselves, and the executable
correlations and makeMicroarrayDataset.pl programs, type the following
(note, the 'make install' step requires administrative privileges -
see your sysadmin if you don't have them):

     perl Makefile.PL
     make
     make test
     make install

If you don't have administrative privileges, or want to install the
modules and excutables (correlations and makeMicroarrayDataset.pl)
into a location other than the default location on your machine, then
use the following:

    perl Makefile.PL INSTALLDIRS=site INSTALLSITELIB=/home/your/private/dir/lib INSTALLSCRIPT=/home/your/private/bin
    make
    make install

Replace /home/your/private/dir/lib with the full path to the directory
you want the libraries placed in, and /home/your/private/bin with the
full path to the directory that you want the executables placed in.
Note that if you install the libraries in a non-standard place, prior
to doing the install step you should edit the
bin/makeMicroarrayDataset.pl program to include that path (with a 'use
lib' statement), so that the installed version can find the library.

See the documentation for ExtUtils::MakeMaker for more details about
how to modify the behaviour during the make process:

http://search.cpan.org/dist/ExtUtils-MakeMaker/

A Note About Directories

Programs within the GeneXplorer distribution are picky about directory
structures, in that they expect a certain directory structure to
exist when a dataset is created (see below), and when the contents of
a dataset are subsequently used by the gx application.  You will need
to have a web accesible directories where images can be read from over
http for general images used by GeneXplorer, images used for a
particular dataset, and images that are created on a temporary basis.
You will also need a cgi-bin directory where the gx application itself
will exist, and a directory in which the dataset files can be stored
and read by the cgi application.  Your set up needs to look something
like this:

/<rootpath>/
		html/
			tmp/		# for temporary images
			explorer/	# under which images specific
					  for a dataset are created
				images/ # for general GeneXplorer images

		cgi-bin/		# gx resides here
		data/			
			explorer/	# datafiles for a dataset reside here
		lib/

in the above example, when creating a dataset, a directory, based on
the name of the dataset, would be created under each of :

/<rootpath>/html/explorer/
/<rootpath>/data/explorer/

where files specific to that dataset will be stored.

The gx application assumes that there exists a rootpath (server_root
above), under which will be an html directory, and underneath that
there will be a tmp and an explorer directory.  This is somewhat
inflexible, and will be made more configurable in a future release.

The bottom line is that you MUST create a directory structure that
looks like the one above, where the html directory is the DOCUMENT_ROOT.

What is a displayConfig file?

It is a file that allows you to specify some of the look and feel of 
gx, and to where certain things should link etc.  Some examples are 
in the data/explorer directory.

Using GeneXplorer

Once you have installed the Perl modules, and compiled the
correlations binary and placed it in an appropriate location, you are
almost ready to start using GeneXplorer.  There are two stages to
using GeneXplorer:

1.  Creating a dataset.

To create a dataset, you will need a cdt file, whose format is
described at:

http://genome-www5.stanford.edu/MicroArray/help/formats.shtml#cdt

A cdt file is created using a program to hierarchically cluster
microarray expression data.  Many programs exist to do this
hierarchical clustering, e.g.:

Cluster:      http://rana.lbl.gov/EisenSoftware.html

Cluster 3.0 : http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/index.htm

XCluster    : http://genetics.stanford.edu/~sherlock/cluster.html

There are cdt sample files in the data/sample/ directory of this
distribution.

Once you have a cdtfile, you can now use the makeMicroarrayDataset.pl
program to create a dataset from it.  Before you run it, you need to
make sure of two things:

i.  The correlations binary that you installed must be in your path.

ii. The location where you just installed the perl modules must be in your
    path, or edit makeMicroarrayDataset.pl to add a 'use lib'
    statement such that it can find them.

There are several command-line options for makeMicroarrayDataset.pl
(execute it with no parameters, and it will print out its usage
information):

makeMicroarrayDataset.pl -file <file/name> -name <intended/dataset/name> \
[-rootpath <your_root_directory> -contrast <float> -colorscheme <rg|yb> -verbose]

        -file           = (required) input file (currently only '.cdt' files supported)

        -name           : (required) dataset name to be created
                          (may be delimited by slashes(/) to imply hierarchy)

        -rootpath       : root directory, under which must exist html
			  and data directories
                          
        -contrast       : (optional) contrast value for the generated images
                          (defaults to 4, As the data are expected to
                           be in log base 2, this corresponds to a
                           16-fold change as the maximum color in any
                           image)

        -colorscheme    : (optional) color scheme used for generating the images
                          (rg = red/green, yb = yellow/blue ; defaults to yellow/blue)

        -verbose        : (optional) whether show feedback messages during run
			  (no value required - simply provide the parameter)

	-help		: (optional) whether to print usage information
			  (no value required - just provide the parameter)

eg:

makeMicroarrayDataset.pl -file mydata.cdt -name mydata -rootpath /<rootpath>/ -imageout -verbose -contrast 2 -colorscheme rg

The program will create 7 files in:

        /<rootpath>/data/explorer/<name>/

with the following suffixes:

                .expt_info .feature_info, .data_matrix, .binCor, .meta

(these files must be readable by a gx application running out of cgi-bin)

and two files in:

	/<rootpath>/html/explorer/<name>/

with the following suffixes:

		.data_matrix.gif, .expt_info.gif

In addition, the dataset creation will expect a tmp directory to exist
at:

	/<rootpath>/html/tmp/

Once your dataset has been created, it should be usable by the gx
application.

2. Move some files in the distribution into place

a) The displayConfig files

   cp data/explorer/* /<rootpath>/data/explorer/

b) Some images

   cp html/explorer/images/* /<rootpath>/html/explorer/images/

3. Use the gx application

You will need to place the gx application in your cgi-bin directory,
and make it executable.  You will also need to make some modifications
to the gx application so that it knows where certain things exist that
are within your system:

a) Libraries

gx has to be able to find the Perl libraries that you installed with
the 'make install' command above.  If you installed them in a typical
system wide fashion, then it is likely that they are in the path of
the cgi script.  If you installed them in a specific place, then you
will need to add a :

use lib '/home/your/private/dir/lib';

in gx, before the:

use Microarray::CdtDataset;
use Microarray::Explorer;
use Microarray::Config;

lines.  If you installed them in a lib directory immediately below the
server root, at the same level as your html directory, then simply
uncomment the lines in gx that read:

#use File::Basename;
#use lib dirname($ENV{DOCUMENT_ROOT})."/lib";

which should locate the libraries for you.

b) Paths

You need to put into the gx file the your rooturl and rootpath so it
knows where to find the files and link to.

You also need to place the images that are provided in the
html/explorer/images directory of this distribution into the
<rootpath>/html/explorer/images/ directory on your server.


To then actually use gx, you should be able to type in your web
browser a url such as:

http://<rooturl>/cgi-bin/gx?n=<name>

where <name> is the name that you gave your dataset when creating it
with makeMicroarrayDataset.pl.

For creation of subsequent datasets, as long as you are using the same
rootpath directory, then you should be able to not worry about this
step 2, and after creating your dataset should be able to simply
change the n parameter passed to gx, and it will just work.