THALLINGER LAB
FREQUENTLY ASKED QUESTIONS


GENERAL

(1) Where can I get more information about distance measurements and clustering algorithms used in Genesis?
More information can be found in Alexander Sturn's master's thesis, "Cluster Analysis for Large Scale Gene Expression Studies". It can be found here.
(2) How do I install Genesis?
Click the download tab and follow the installation instructions. Don't forget to request a license from the "License" tab above.
(3) I have obtained the license key for Genesis. Where do I have to enter the key?
Download and install Genesis from our homepage. Start Genesis and open the license dialog window via the menu item File->License.... Press the "Re-license Application" button and enter the license key. Press "Register Application". The expiration date in the license dialog will disappear.
(4) Which operating systems are supported?
Genesis is currently developed for Windows 7 and up. It works very well under Windows XP/Vista/7/8/10, and we recommend using Windows 7 or above. It also works very well under Linux and Solaris, since these platforms have a well-supported Java and Java3D environment. Genesis also runs under Mac OS X, Tru64 Unix, Irix and any other platform that supports Java. Unfortunately, not all of these platforms provide Java3D support. For more information see Java Platform Ports and Java 3D API on Other Platforms.
(5) Does Genesis work on Mac OS systems?
Genesis has been tested on Mac OS 10.6 through 10.9, but should also work on any other OS X version that supports Java 8. The Genesis installer package comes with Java. However, if you want to run it with a different Java version, please make sure that you are using the latest version, since older Java versions had problems with off-screen rendering that led to very slow rendering of expression images.
If you receive "Genesis Installer is damaged ..." or "Genesis Installer is from an unidentified developer ..." messages, reconfigure Gatekeeper (http://support.apple.com/kb/HT5290) to accept applications downloaded from "anywhere".
(6) How can I change the memory settings of Genesis?
Genesis is currently set to use 1 GByte of RAM. However, it is quite simple to change this setting.
If you have installed Genesis using install4j
Open the file Genesis.vmoptions in the Genesis home directory and add:
-Xmx1024m
where 1024 is the number of megabytes you would like to assign to the Java VM.
Unix, Linux, Windows using Genesis launch script (e.g. Genesis.bat)
Open the Genesis.bat (Windows), GenesisUnix (Linux, Unix) or GenesisMacOSX (Mac OS X) file and change the setting -XmxXXXm, where XXX stands for the memory available to the Java VM in MBytes (e.g. 1 GByte would be -Xmx1024m). Start Genesis via these files.
The maximum value depends on your system, but it will probably be around 1600000000 bytes (roughly 1.6 GBytes) on 32-bit machines and higher on 64-bit machines. If you set this parameter to a value above the available physical memory of your system, Java will start to swap memory onto the hard disk if needed. Calculations are still possible while swapping, but the performance will be very poor.
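The edit to Genesis.vmoptions can also be scripted. The following Python sketch replaces an existing -Xmx line or appends a new one. The helper name set_max_heap and the sample file content are hypothetical, and the sketch assumes the file holds one JVM option per line.

```python
def set_max_heap(vmoptions_text, megabytes):
    """Return the vmoptions text with exactly one -Xmx<megabytes>m line."""
    if megabytes <= 0:
        raise ValueError("megabytes must be positive")
    option = f"-Xmx{megabytes}m"
    # Drop any existing -Xmx setting; keep every other line untouched.
    kept = [line for line in vmoptions_text.splitlines()
            if not line.strip().startswith("-Xmx")]
    kept.append(option)
    return "\n".join(kept) + "\n"


# Made-up file content: raise the limit from 1 GByte to 2 GBytes.
example = "-Xms64m\n-Xmx1024m\n"
updated = set_max_heap(example, 2048)
```

Write the returned text back to Genesis.vmoptions while Genesis is closed, then restart Genesis.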
(7) How do I get a list of genes of a specific cluster? How can I save clusters?
Go to a cluster (e.g. expression image) and right-click the panel (the right one) where the clustering information is displayed. A popup menu will appear and you can select "Save cluster..." or "Save all clusters...". This will save the cluster as a Stanford flat file, which can be edited using any spreadsheet program, e.g. Microsoft Excel. For more information see the Genesis Manual.
(8) Are you working on a newer version with more functions?
Genesis is under continuous development. Updates depend on the number and urgency of feature requests and bugs. Bugs are fixed as soon as possible and new features will be added.
(9) What data format does Genesis use? What do you mean by Stanford format?
It is the file format used by Michael Eisen's Cluster program. It is quite simple, and you can find a description in the Genesis Operation Manual.
(10) How do I display the gene names beside the expression image?
Gene names are displayed if the size of the gene cells (gene height) is 10 pixels or more. Below 10 pixels the font is unreadable, so the names are hidden. Use, for instance, the "Detail" view (View->Detail) and gene names will be displayed. If they are visible, they can be switched on or off via the View->View Gene Names menu option.
(11) Genesis does not start any more (even after a new installation). What has to be done?
Since release 1.4.0 Genesis stores the property files in the user home directory (.genesis). Delete this directory (.genesis) and start Genesis.
On Windows XP systems the folder is: C:\Documents and Settings\username
The folder is hidden (according to the Windows specs)! To display hidden files, use the Explorer: Tools -> Folder Options -> View -> Show hidden files and folders.
Genesis should recreate this directory and work properly.
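If you prefer to script the reset, the following Python sketch removes the .genesis directory from a given home directory. The function name reset_genesis_settings is hypothetical, and the demonstration runs on a temporary directory so that no real settings are touched.

```python
import shutil
import tempfile
from pathlib import Path


def reset_genesis_settings(home_dir):
    """Delete <home_dir>/.genesis; return True if something was removed."""
    settings_dir = Path(home_dir) / ".genesis"
    if not settings_dir.is_dir():
        return False
    shutil.rmtree(settings_dir)
    return True


# Demonstration on a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as fake_home:
    (Path(fake_home) / ".genesis").mkdir()
    removed_first = reset_genesis_settings(fake_home)   # directory existed
    removed_second = reset_genesis_settings(fake_home)  # already gone
```

For a real reset, close Genesis first and call reset_genesis_settings(Path.home()).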
(12) How much memory do I need for a whole genome analysis? Or: I am running out of memory. Why?
Hierarchical clustering requires O(n^2) memory, where n is the number of genes (to be exact: n*(n-1)*2 bytes). For example, 45,000 genes add up to roughly 4 GB.
In order to cluster all those genes at once you will need a computer with a 64-bit operating system (Windows 2003 Server, Windows Vista/7 64-bit, Linux or Mac OS X on a 64-bit CPU) and run Genesis there with the increased memory settings described in this FAQ.
Alternatively, it often helps to filter out those genes that do not change much across your samples (and would therefore not contribute to the clustering results anyway).
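To estimate the requirement for your own data set, the formula n*(n-1)*2 bytes can be evaluated directly. The Python sketch below does this and also inverts the formula to find how many genes fit into a given amount of memory. Both function names are hypothetical, and the estimate covers only the clustering itself, not the rest of the Java VM.

```python
from math import isqrt


def hcl_memory_bytes(n_genes):
    """Memory estimate from the FAQ formula: n * (n - 1) * 2 bytes."""
    return n_genes * (n_genes - 1) * 2


def max_genes_for(memory_bytes):
    """Largest n with n * (n - 1) * 2 <= memory_bytes.

    Rearranged: (2n - 1)^2 <= 2 * memory_bytes + 1.
    """
    return (1 + isqrt(2 * memory_bytes + 1)) // 2


needed = hcl_memory_bytes(45000)   # 4,049,910,000 bytes
fits = max_genes_for(10**9)        # genes that fit into 10^9 bytes
```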
(13) Can I run Genesis on a server without graphical user interface?
Genesis 1.7.6 introduced a text-based interface that can be used on systems without a graphical user interface. So far only hierarchical clustering is supported, but we will add more functionality to it in later releases.
To run Genesis from the command line navigate to the installation directory (or add the installation directory to your PATH variable) and then run Genesis/GenesisUnix/GenesisMacOSX (depending on your platform) -c hcl
You will then see the usage information for the HCL command. You need to provide at least the following parameters: "-i <inputfiles>" to define the input files, "-g" or "-e" to enable gene and/or experiment clustering, and "-p <file>", "-o <file>" and/or "-f <file>" to specify the output.
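When the command-line interface is driven from a script, it helps to assemble the argument list in one place. The Python sketch below builds such a list from the parameters named above. The function name and the file names are hypothetical, it passes a single input file, and it uses only the "-o" output parameter; run the HCL command without arguments first to check the usage information of your release.

```python
def build_hcl_command(launcher, input_file, output_file,
                      cluster_genes=True, cluster_experiments=False):
    """Build the argument list for a headless HCL run."""
    if not (cluster_genes or cluster_experiments):
        raise ValueError("enable gene and/or experiment clustering")
    command = [launcher, "-c", "hcl", "-i", input_file]
    if cluster_genes:
        command.append("-g")   # cluster genes
    if cluster_experiments:
        command.append("-e")   # cluster experiments
    command += ["-o", output_file]
    return command


# Hypothetical file names; the launcher depends on your platform.
command = build_hcl_command("GenesisUnix", "expression.txt", "result.txt")
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where Genesis is installed.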

K-MEANS CLUSTERING (KMC)

(1) I'm using the distance function pearson squared. Why are some next nearest neighbor ratios greater than 1?
The k-Means clustering algorithm using the distance function pearson squared may not be suitable to deliver 100% accurate results. This is due to the nature of the distance function.

PRINCIPAL COMPONENT ANALYSIS (PCA)

(1) Although I downloaded Java 3D and installed it, Genesis still says 'not installed'.
This can occur if more than one Java version is installed, or if Genesis was installed with its own (bundled) Java version. In this case Genesis may not have access to the installed Java3D. Uninstall Genesis and reinstall it according to the instructions. It is important that Java3D is added to the Java installation that Genesis uses.
(2) Why are the Eigenvalues and Eigenvectors from PCAG (PCA from genes) and PCAE (PCA from experiments) the same?
To compute the principal components, the m Eigenvalues and their corresponding Eigenvectors are calculated from the (m x m) distance matrix using Singular Value Decomposition (SVD), where m = number of experiments and n = number of genes. Genesis uses this method for both cases (genes AND experiments)! See also the answer to question (3).
(3) PCAE (PCA from experiments): If I load a data set with 30 genes and 5 conditions, I expect to see 30 components with 30 dimensions each and not only 5 components. Furthermore, the Eigenvalues should be different from those that I get with PCAG (PCA from genes).
This is correct, and the straightforward implementation did exactly that: it calculated the SVD of the (n x n) distance matrix. In this case we get the expected n components, where n is the number of genes. However, the disadvantage of this method is that it is computationally very intensive, since n is usually quite large.
And here is the trick. A mathematical trick makes it possible to calculate the result using the small (m x m, m = number of experiments) distance matrix and then to calculate back from the SVD of this matrix to the required solution (PCAE).
The results of this shortcut are the same as with the straightforward method. However, since the SVD is again calculated from the small matrix, you of course get the same Eigenvalues as when calculating PCA from genes.
The gain is remarkable. Let's say we have 10,000 genes and 100 experiments. In this case you have to calculate the SVD of a 100x100 distance matrix (about 5,000 unique elements, since the matrix is symmetric) instead of a 10,000x10,000 matrix (about 50,000,000 unique elements), or in other words: 1 sec instead of a couple of hours!
The drawback is that you will not get the Eigenvalues of the n x n matrix, but these are usually not required.
If you still want to calculate the exact PCAE you can do that by clicking "Analysis", "Calculate PCA experiments exact" or by holding the Shift key when you click the PCAE button!
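The idea behind the shortcut can be illustrated with a small numerical experiment. The sketch below is not the Genesis implementation. It assumes that the "distance matrix" can be modelled as the cross-product matrix of the data (X^T X for the small case, X X^T for the large case) and uses NumPy with random data of 30 genes and 5 experiments. It shows that the Eigenvectors of the large matrix can be recovered from the small one, and that the non-zero Eigenvalues of both matrices coincide.

```python
import numpy as np


def eigen_from_small_matrix(X):
    """Eigenpairs of X @ X.T (n x n) computed via X.T @ X (m x m).

    X has n rows (genes) and m columns (experiments), with m << n.
    """
    small = X.T @ X                          # m x m instead of n x n
    eigvals, V = np.linalg.eigh(small)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # largest Eigenvalue first
    eigvals, V = eigvals[order], V[:, order]
    keep = eigvals > 1e-10 * eigvals[0]      # drop numerically zero values
    eigvals, V = eigvals[keep], V[:, keep]
    # If (X^T X) v = l v, then u = X v / sqrt(l) satisfies (X X^T) u = l u.
    U = (X @ V) / np.sqrt(eigvals)
    return eigvals, U


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))                 # 30 genes, 5 experiments
eigvals, U = eigen_from_small_matrix(X)

big = X @ X.T                                # the expensive 30 x 30 matrix
big_eigvals = np.sort(np.linalg.eigvalsh(big))[::-1]
```

Only 5 of the 30 Eigenvalues of the large matrix are non-zero here, and they match the 5 Eigenvalues obtained from the small matrix.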
(4) I compared Genesis' PCA calculations with those done by other programs (both with covariance as distance metric). The PCAG results are very similar, but Genesis switches some (!) signs in the coefficients of the Eigenvectors. Is that OK?
Genesis uses Singular Value Decomposition to calculate the Eigenvectors. When the distance matrix is nonsingular, all latent roots (Eigenvalues) are strictly positive and each Eigenvector defines a principal component. The other programs may use a different procedure, so their Eigenvectors can differ slightly. Flipped signs are fine: an Eigenvector is only defined up to a scaling factor, so v and -v describe the same principal component. Remember, there is an infinite number of solutions!
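That a sign change does not invalidate an Eigenvector is easy to check by hand. The following Python sketch uses a made-up 2x2 covariance matrix whose Eigenvalue 3 belongs to the direction (1, 1). Both the normalized vector and its negation satisfy the Eigenvector equation.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_eigenvector(matrix, vector, eigenvalue, tol=1e-12):
    """Check matrix * vector == eigenvalue * vector within a tolerance."""
    product = mat_vec(matrix, vector)
    return all(abs(p - eigenvalue * x) <= tol
               for p, x in zip(product, vector))


covariance = [[2.0, 1.0],
              [1.0, 2.0]]                    # made-up example matrix
v = [0.5 ** 0.5, 0.5 ** 0.5]                 # unit vector along (1, 1)
flipped = [-x for x in v]                    # same direction, opposite sign

both_valid = (is_eigenvector(covariance, v, 3.0)
              and is_eigenvector(covariance, flipped, 3.0))
```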
(5) When performing an operation which requires 3D rendering I receive the following error message: java.lang.IllegalStateException: GL_VERSION
This exception is displayed when the OpenGL driver (the 3D rendering engine) provided by your graphics card is either not present or has a version number that is too low. Java 3D 1.6 (which is included in the latest Genesis release) requires OpenGL driver version 1.3.
There are two possibilities to get the 3D display to work:
- Install the latest driver of your graphics card and run Genesis again
- Install the latest version of DirectX (>= 9.0) and change the 3D display of Genesis to DirectX by adding the line "lax.nl.java.option.additional=-Dj3d.rend=d3d" to the file Genesis.lax in the Genesis install folder.
(6) When performing an operation which requires 3D rendering I receive the following error message: java.lang.IllegalArgumentException: adding a container to a container on a different GraphicsDevice
This error occurs in conjunction with a dual monitor configuration. There is a temporary fix for it:
   http://genome.tugraz.at/genesisclient/1.7.6-20131017/Genesis.jar
Close Genesis, copy the jar file into the installation directory of Genesis and start Genesis again.

SUPPORT VECTOR MACHINES (SVM)

(1) What file formats are used by SVM?
*.svc files contain the classification of known genes (positive and negative examples). *.svm files contain the SVMs.
(2) How does SVM work?
Here is a (very) short tutorial on how SVM is used:
- Open the file fibroblasts.txt.
- Press the SVM button (the left one, for training). A dialog appears.
- Choose the FibroblastClass.svc file as classification and press OK. The SVM will be trained.
- Save the SVM somewhere and give it a name, e.g. test.svm.
Now you can classify a dataset with the same dimension, e.g. fibroblasts.txt again:
- Press the right SVM button for classification.
- Select the SVM (test.svm).
- You will get the classification of genes based on the SVM.
Classification 1 represents a positive example, -1 a negative one.

CHROMOSOMAL MAPPING

(1) Is it possible to map to the fragments available at the NCBI, or even better whole chromosomes?
It is possible. Just download any *.gbk or *.gbs file from the NCBI FTP site. Right-click e.g. the GenBank leaf or any subfolder to see the available options for adding information. We have tried several human, yeast, and C. elegans chromosomes. Be sure to have the latest release of Genesis (1.0.1), because only this release is able to import the latest files from NCBI. This version also supports *.gbs files (just the header information, without the sequence).
(2) Which fields does Genesis use in the GenBank files to match gene names?
Genesis uses the "gene" field from the feature table.

GO MAPPING

(1) How do I create a GO-Mapping file?
Create a text file that contains only the IDs of your gene expression file, i.e. the first column of the gene expression file. Go to the Stanford SOURCE database and use Batch SOURCE. Specify the file you created, the type of input identifier and the organism. Use Gene Ontology Annotations (short) as the field for extraction. If the short option does not work (this may occur), use Gene Ontology Annotations (full). Submit the data and save the result. This is your GO mapping, which can be imported into Genesis via GO->Import GO mapping...
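The ID file for the first step can be produced with a few lines of Python. The sketch below takes the lines of a gene expression file and returns the entries of the first column. It assumes a tab-delimited file with one header row; the function name and the sample data are made up for illustration, and files with additional header rows need a larger header_rows value.

```python
def extract_gene_ids(expression_lines, header_rows=1):
    """Return the first column of a tab-delimited expression file."""
    ids = []
    for line in expression_lines[header_rows:]:
        first = line.rstrip("\r\n").split("\t", 1)[0].strip()
        if first:                      # skip empty lines
            ids.append(first)
    return ids


# Made-up sample data: one header row, two genes.
sample = [
    "UNIQID\tNAME\tExp1\tExp2\n",
    "G1\tgene one\t0.5\t-1.2\n",
    "G2\tgene two\t1.1\t0.3\n",
]
ids = extract_gene_ids(sample)
```

For a real file, read it with open(path).readlines(), pass the result to extract_gene_ids, and write one ID per line to the text file you upload.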
(2) How do I map Gene Expression Data onto GO?
- Open a gene expression file (e.g. GOSample.txt in the Sample directory).
- Import a GO-Mapping file via the menu entry GO->Import GO mapping... (e.g. GOMapping.txt in the Sample directory).
- Click the Gene Ontology button (lower left corner of Genesis).
- Click All Available Genes (lower left corner of Genesis) to map all genes, or click a cluster to map only the genes included in the cluster.
- Click any node of the GO-Tree (upper left corner of Genesis) to see the mapping.
(3) When trying to run GO-Mapping I receive an error telling me the connection was refused, or Genesis hangs and stops responding.
Genesis tries to connect to an instance of the GO database at jdbc:mysql://discover.nci.nih.gov:1521/GEEVS. The connection is prevented either by a firewall on your computer or in your network, which causes the exception, or by an old version of Genesis (version 1.7.3 is required).
Please talk to your system administrator to enable the connection to the database.
If you continue to experience problems, you could try an alternative GO database server. To change to another server, click GO -> Database properties in Genesis and select one of the servers from the list, or select "Custom" and define a new connection.
If everything fails you can set up a local instance of the GO database on your system (or a system in your network). For instructions please visit:
  http://discover.nci.nih.gov/gominer/mysqlinstall.jsp
  http://www.discover.nci.nih.gov/gominer/enhance.jsp
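Before contacting your system administrator, you can check whether the database port is reachable from your machine at all. The Python sketch below attempts a plain TCP connection with a timeout. The function name can_connect is hypothetical, and a successful connection only shows that no firewall blocks the port; it does not verify the database itself.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For the default server, call can_connect("discover.nci.nih.gov", 1521). If this returns False, a firewall or an unreachable server is the likely cause.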
 