What is this? This distribution provides a set of modules, and a single client, gx, to implement GeneXplorer, which is a web interface for browsing hierarchically clustered data. The latest version of this module should always be available from: http://search.cpan.org/dist/Microarray-GeneXplorer/ An example of this software in action is available at: http://microarray-pubs.stanford.edu/cgi-bin/gx?n=prostate1&rx=5 PREREQUISITES GD.pm GeneXplorer relies on the GD module (http://search.cpan.org/dist/GD/) for image creation. If the GD module support png image creation (version > 1.19), then it will create png images, otherwise gif images will be created. INSTALLATION Firstly, read all the installation instructions before you begin! To install this software, you will need some limited Unix experience, and some limited Perl experience. If in doubt, consult your local Perl guru or systems administrator. You also need access to a Unix machine, such as MacOSX, or a Windows machine running Cygwin, a Linux box, a Sun machine etc.. To install the software do the following after downloading the distribution: First decompress the gzipped tarfile, e.g: tar zxvf Microarray-GeneXplorer-0.1.tar.gz or: gunzip Microarray-GeneXplorer-0.1.tar.gz tar xvf Microarray-GeneXplorer-0.1.tar Then change into the created directory: cd Microarray-GeneXplorer-0.1 To install the actual Perl modules themselves, and the executable correlations and makeMicroarrayDataset.pl programs, type the following (note, the 'make install' step requires administrative privileges - see your sysadmin if you don't have them): perl Makefile.PL make make test make install If you don't have administrative privileges, or want to install the modules and excutables (correlations and makeMicroarrayDataset.pl) into a location other than the default location on your machine, then use the following: perl Makefile.PL INSTALLDIRS=site INSTALLSITELIB=/home/your/private/dir/lib INSTALLSCRIPT=/home/your/private/bin make make install Replace /home/your/private/dir/lib with the full path to the directory you want the libraries placed in, and /home/your/private/bin with the full path to the directory that you want the executables placed in. Note that if you install the libraries in a non-standard place, prior to doing the install step you should edit the bin/makeMicroarrayDataset.pl program to include that path (with a 'use lib' statement), so that the installed version can find the library. See the documentation for ExtUtils::MakeMaker for more details about how to modify the behaviour during the make process: http://search.cpan.org/dist/ExtUtils-MakeMaker/ A Note About Directories Programs within the GeneXplorer distribution are picky about directory structures, in that they expect a certain directory structure to exist when a dataset is created (see below), and when the contents of a dataset are subsequently used by the gx application. You will need to have a web accesible directories where images can be read from over http for general images used by GeneXplorer, images used for a particular dataset, and images that are created on a temporary basis. You will also need a cgi-bin directory where the gx application itself will exist, and a directory in which the dataset files can be stored and read by the cgi application. Your set up needs to look something like this: /<rootpath>/ html/ tmp/ # for temporary images explorer/ # under which images specific for a dataset are created images/ # for general GeneXplorer images cgi-bin/ # gx resides here data/ explorer/ # datafiles for a dataset reside here lib/ in the above example, when creating a dataset, a directory, based on the name of the dataset, would be created under each of : /<rootpath>/html/explorer/ /<rootpath>/data/explorer/ where files specific to that dataset will be stored. The gx application assumes that there exists a rootpath (server_root above), under which will be an html directory, and underneath that there will be a tmp and an explorer directory. This is somewhat inflexible, and will be made more configurable in a future release. The bottom line is that you MUST create a directory structure that looks like the one above, where the html directory is the DOCUMENT_ROOT. What is a displayConfig file? It is a file that allows you to specify some of the look and feel of gx, and to where certain things should link etc. Some examples are in the data/explorer directory. Using GeneXplorer Once you have installed the Perl modules, and compiled the correlations binary and placed it in an appropriate location, you are almost ready to start using GeneXplorer. There are two stages to using GeneXplorer: 1. Creating a dataset. To create a dataset, you will need a cdt file, whose format is described at: http://genome-www5.stanford.edu/MicroArray/help/formats.shtml#cdt A cdt file is created using a program to hierarchically cluster microarray expression data. Many programs exist to do this hierarchical clustering, e.g.: Cluster: http://rana.lbl.gov/EisenSoftware.html Cluster 3.0 : http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/index.htm XCluster : http://genetics.stanford.edu/~sherlock/cluster.html There are cdt sample files in the data/sample/ directory of this distribution. Once you have a cdtfile, you can now use the makeMicroarrayDataset.pl program to create a dataset from it. Before you run it, you need to make sure of two things: i. The correlations binary that you installed must be in your path. ii. The location where you just installed the perl modules must be in your path, or edit makeMicroarrayDataset.pl to add a 'use lib' statement such that it can find them. There are several command-line options for makeMicroarrayDataset.pl (execute it with no parameters, and it will print out its usage information): makeMicroarrayDataset.pl -file <file/name> -name <intended/dataset/name> \ [-rootpath <your_root_directory> -contrast <float> -colorscheme <rg|yb> -verbose] -file = (required) input file (currently only '.cdt' files supported) -name : (required) dataset name to be created (may be delimited by slashes(/) to imply hierarchy) -rootpath : root directory, under which must exist html and data directories -contrast : (optional) contrast value for the generated images (defaults to 4, As the data are expected to be in log base 2, this corresponds to a 16-fold change as the maximum color in any image) -colorscheme : (optional) color scheme used for generating the images (rg = red/green, yb = yellow/blue ; defaults to yellow/blue) -verbose : (optional) whether show feedback messages during run (no value required - simply provide the parameter) -help : (optional) whether to print usage information (no value required - just provide the parameter) eg: makeMicroarrayDataset.pl -file mydata.cdt -name mydata -rootpath /<rootpath>/ -imageout -verbose -contrast 2 -colorscheme rg The program will create 7 files in: /<rootpath>/data/explorer/<name>/ with the following suffixes: .expt_info .feature_info, .data_matrix, .binCor, .meta (these files must be readable by a gx application running out of cgi-bin) and two files in: /<rootpath>/html/explorer/<name>/ with the following suffixes: .data_matrix.gif, .expt_info.gif In addition, the dataset creation will expect a tmp directory to exist at: /<rootpath>/html/tmp/ Once your dataset has been created, it should be usable by the gx application. 2. Move some files in the distribution into place a) The displayConfig files cp data/explorer/* /<rootpath>/data/explorer/ b) Some images cp html/explorer/images/* /<rootpath>/html/explorer/images/ 3. Use the gx application You will need to place the gx application in your cgi-bin directory, and make it executable. You will also need to make some modifications to the gx application so that it knows where certain things exist that are within your system: a) Libraries gx has to be able to find the Perl libraries that you installed with the 'make install' command above. If you installed them in a typical system wide fashion, then it is likely that they are in the path of the cgi script. If you installed them in a specific place, then you will need to add a : use lib '/home/your/private/dir/lib'; in gx, before the: use Microarray::CdtDataset; use Microarray::Explorer; use Microarray::Config; lines. If you installed them in a lib directory immediately below the server root, at the same level as your html directory, then simply uncomment the lines in gx that read: #use File::Basename; #use lib dirname($ENV{DOCUMENT_ROOT})."/lib"; which should locate the libraries for you. b) Paths You need to put into the gx file the your rooturl and rootpath so it knows where to find the files and link to. You also need to place the images that are provided in the html/explorer/images directory of this distribution into the <rootpath>/html/explorer/images/ directory on your server. To then actually use gx, you should be able to type in your web browser a url such as: http://<rooturl>/cgi-bin/gx?n=<name> where <name> is the name that you gave your dataset when creating it with makeMicroarrayDataset.pl. For creation of subsequent datasets, as long as you are using the same rootpath directory, then you should be able to not worry about this step 2, and after creating your dataset should be able to simply change the n parameter passed to gx, and it will just work.