
As of 06/21/22, GSEA has replaced GSA: https://www.gsea-msigdb.org/gsea/index.jsp. The information found below is valid for the previous version of the pipeline effective 10/28/20 – 06/20/22:

The Bioinformatics Core pipeline performs Gene Set Analysis (GSA) using the Piano Bioconductor package (see the full description of the method and procedure).

Description of the output from the GSA analysis. The first two columns (Gene Set Category and Name) give the MSigDB category and the specific gene set, followed by the total number of genes in the gene set, ‘Genes (tot)’. The column ‘Stat (dist.dir)’ contains the mean statistic for the gene set, computed from up-regulated genes only; the number of up-regulated genes in the gene set is in the ‘Genes (UP)’ column. Similarly, for down-regulated gene sets, the statistic in this column is computed using only genes that are down-regulated (‘Genes (DN)’). Column ‘GenesInGeneSet’ lists the genes used in the computations, and column ‘GenesAndFC’ lists those genes with their corresponding fold changes.
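
The column layout above can be illustrated with a minimal Python sketch. It is not the Piano implementation: the function name `summarize_gene_set`, the dictionary keys, and the convention that a positive gene-level statistic means up-regulation are all assumptions made for illustration.

```python
from statistics import mean


def summarize_gene_set(gene_stats, gene_set):
    """Summarize one gene set from signed gene-level statistics.

    gene_stats: dict mapping gene symbol -> signed statistic
                (assumption: > 0 means up-regulated, < 0 down-regulated).
    gene_set:   iterable of gene symbols belonging to the set.
    """
    # Only genes that were actually measured enter the computation.
    members = [g for g in gene_set if g in gene_stats]
    up = [gene_stats[g] for g in members if gene_stats[g] > 0]
    down = [abs(gene_stats[g]) for g in members if gene_stats[g] < 0]
    return {
        "genes_tot": len(members),             # analogous to 'Genes (tot)'
        "genes_up": len(up),                   # analogous to 'Genes (UP)'
        "genes_dn": len(down),                 # analogous to 'Genes (DN)'
        "stat_up": mean(up) if up else None,   # mean over up-regulated genes only
        "stat_dn": mean(down) if down else None,  # mean magnitude over down-regulated genes only
    }
```

Splitting the genes by direction before averaging keeps strong up- and down-regulation within the same gene set from cancelling each other out.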

P-values (column ‘p adj (dist.dir.up)’) are computed by gene permutation; by default, 500 permutations are run. To adjust for multiple hypothesis testing, the fourth column gives the adjusted p-value, computed using the FDR approach. The last two columns list all genes in the gene set and their fold changes.
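
As a rough illustration of gene permutation followed by FDR adjustment, the Python sketch below draws random gene sets of the same size to build a null distribution for the mean statistic, then applies the Benjamini-Hochberg procedure. It is a simplified sketch, not Piano's code: the function names, the one-sided "greater than or equal" comparison, the add-one correction, and the choice of Benjamini-Hochberg as the FDR method are all assumptions.

```python
import random


def gene_permutation_p_value(gene_stats, gene_set, n_perm=500, seed=0):
    """One-sided p-value for the mean statistic of a gene set.

    The null distribution comes from random gene sets of the same size,
    drawn without replacement from all measured genes.
    """
    rng = random.Random(seed)
    genes = list(gene_stats)
    members = [g for g in gene_set if g in gene_stats]
    if not members:
        raise ValueError("gene set has no measured genes")
    observed = sum(gene_stats[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(members))
        null_stat = sum(gene_stats[g] for g in sample) / len(sample)
        if null_stat >= observed:
            hits += 1
    # Add-one correction (an assumption here) avoids reporting exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With 500 permutations and the add-one correction, the smallest p-value this sketch can report is 1/501, so very small adjusted p-values would require more permutations.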

MSigDB gene sets analyzed:

H: hallmark gene sets
C2: curated gene sets
C3: regulatory target gene sets
C5: ontology gene sets
C6: oncogenic signature gene sets
C7: immunologic signature gene sets

See https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp for collection details.

All gene sets are from MSigDB version 7.1.

Effective before 10/27/20:

Each “C*” is a gene set collection from Broad’s Molecular Signatures Database (MSigDB):

C1: positional gene sets – for each human chromosome and cytogenetic band
C2: curated gene sets – from online pathway databases, publications in PubMed, and knowledge of domain experts
C3: motif gene sets – based on conserved cis-regulatory motifs from a comparative analysis of the human, mouse, rat and dog genomes
C4: computational gene sets – defined by mining large collections of cancer-oriented microarray data
C5: GO gene sets – consist of genes annotated by the same GO terms
C6: oncogenic signatures – defined directly from microarray gene expression data from cancer gene perturbations
C7: immunologic signatures – defined directly from microarray gene expression data from immunologic studies

This information was taken from: http://software.broadinstitute.org/gsea/msigdb/index.jsp
