FactBio has produced a series of commonly used, high-interest curated reference metadata sets for the life sciences community. These data collections cover a wide range of areas, such as functional genomics, genomic variation and toxicology. Starting with functional genomics, specific source databases include EMBL-EBI ArrayExpress, Expression Atlas and NCBI GEO. Other sources, such as the European Variation Archive, diXa and TOXNET, are in the pipeline.

Users can access a range of highly specific datasets through the service, including GTEx, ENCODE, RIKEN FANTOM5, Genentech cancer cell lines, CCLE, PanCancer and Illumina BodyMap 2.0.

An example screenshot from the Cancer Cell Line Encyclopedia dataset.

These curated reference datasets give researchers access to the highest-quality annotated metadata focused on human gene expression. FactBio's data curation and annotation platform, Kusp, coupled with proprietary algorithms, allows data to be curated and annotated rapidly and to a high standard, and kept up to date as reference databases and datasets expand.

As well as accessing public data through a single, easy-to-use interface, users can use Kusp to upload and annotate their own private data.

To find out more about the range of curated databases and datasets, contact us.

Databases and Datasets Available in the Collections


Functional Genomics

Genomic Variation (in development)

Toxicology (in development)