Priors (emmaa.priors)

This module contains classes to generate prior networks.

class emmaa.priors.SearchTerm(type, name, db_refs, search_term)[source]

Bases: object

Represents a search term to be used in a model configuration.

  • type (str) – The type of search term, e.g. gene, bioprocess, other
  • name (str) – The name of the search term, equivalent to an Agent name
  • db_refs (dict) – A dict of database references for the given term, similar to an Agent db_refs dict
  • search_term (str) – The actual search term to use for searching PubMed
classmethod from_json(jd)[source]

Return a SearchTerm object from JSON.


Return search term as JSON.
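
The JSON round trip described above can be illustrated with a minimal sketch. SearchTermSketch is a hypothetical stand-in that only mirrors the four documented constructor arguments; it is not the EMMAA class. The method name to_json, the dict layout keyed by argument name, and the quoted form of the search_term value are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class SearchTermSketch:
    """Hypothetical stand-in mirroring the documented SearchTerm fields."""
    type: str
    name: str
    db_refs: dict
    search_term: str

    def to_json(self):
        # Assumed layout: a plain dict keyed by the constructor argument names.
        return asdict(self)

    @classmethod
    def from_json(cls, jd):
        # Rebuild the object from the dict produced by to_json.
        return cls(jd['type'], jd['name'], jd['db_refs'], jd['search_term'])


# The search_term string format used here is an assumption.
term = SearchTermSketch('gene', 'KRAS', {'HGNC': '6407'}, '"KRAS"')
restored = SearchTermSketch.from_json(term.to_json())
```

In this sketch, restored compares equal to term, which is the property a from_json / to-JSON pair is expected to have.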

emmaa.priors.get_drugs_for_gene(stmts, hgnc_id)[source]

Get a list of drugs that target a gene.

  • stmts (list of indra.statements.Statement) – List of INDRA statements with a drug as subject
  • hgnc_id (str) – HGNC id for a gene

Returns: drugs_for_gene – List of search terms for drugs targeting the input gene
Return type: list of emmaa.priors.SearchTerm

TCGA Cancer Prior (emmaa.priors.cancer_prior)

class emmaa.priors.cancer_prior.TcgaCancerPrior(tcga_study_prefix, sif_prior, diffusion_service=None, mutation_cache=None)[source]

Bases: object

Prior network generation using TCGA mutations for a given cancer type.

This class implements building a prior network using a generic underlying prior, and TCGA data for a specific cancer type. Mutations for the given cancer type are extracted from TCGA studies and heat diffusion from the corresponding nodes in the prior is used to identify a set of relevant nodes.

static find_drugs_for_genes(node_list)[source]

Return list of drugs targeting gene nodes.


Return dict of gene mutation frequencies based on TCGA studies.


Return a list of the relevant nodes in the prior.

Heat diffusion is applied to the prior network based on initial heat on nodes that are mutated according to patient statistics.
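
As a rough illustration of the idea, the sketch below runs a simple explicit-Euler heat diffusion on a small undirected graph, seeded with heat on one node. It is a toy under stated assumptions, not the algorithm this class uses: the constructor's diffusion_service argument suggests the actual computation may be delegated elsewhere. The function name diffuse_heat, the rate and steps parameters, and the node names are all hypothetical.

```python
def diffuse_heat(edges, initial_heat, rate=0.1, steps=50):
    """Toy explicit-Euler heat diffusion on an undirected, unweighted graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Nodes without initial heat start at zero.
    heat = {node: float(initial_heat.get(node, 0.0)) for node in neighbors}
    for _ in range(steps):
        # Each node exchanges heat with its neighbors in proportion to the
        # temperature difference; total heat is conserved.
        heat = {
            node: h + rate * sum(heat[nb] - h for nb in neighbors[node])
            for node, h in heat.items()
        }
    return heat


# Placeholder path graph A - B - C - D, with all initial heat on node A
# (standing in for a frequently mutated gene).
edges = [('A', 'B'), ('B', 'C'), ('C', 'D')]
heat = diffuse_heat(edges, {'A': 1.0})
ranked = sorted(heat, key=heat.get, reverse=True)
```

Nodes closer to the seed end up hotter, so ranking nodes by final heat and keeping the top entries gives one plausible way to select "relevant" nodes.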

load_sif_prior(fname, e50=20)[source]

Return a Graph based on a SIF file describing a prior.

  • fname (str) – Path to the SIF file.
  • e50 (int) – Parameter for converting evidence counts into weights over the interval [0, 1) according to the hyperbolic function weight = count / (count + e50). A count equal to e50 yields a weight of 0.5.
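
The weighting formula for e50 can be checked with a short sketch. The helper name evidence_weight is hypothetical; only the formula and the default e50=20 come from the description above.

```python
def evidence_weight(count, e50=20):
    """Map a non-negative evidence count to a weight in [0, 1)."""
    return count / (count + e50)


# Zero evidence gives weight 0, count == e50 gives 0.5, and the weight
# approaches (but never reaches) 1 as the count grows.
weights = {count: evidence_weight(count) for count in (0, 20, 60)}
# weights == {0: 0.0, 20: 0.5, 60: 0.75}
```

A larger e50 makes the curve rise more slowly, so more evidence is needed before an edge is treated as strongly supported.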

Run the prior node list generation and return relevant nodes.

static search_terms_from_nodes(node_list)[source]

Build a list of PubMed search terms from the nodes returned by make_prior.

Gene List Prior (emmaa.priors.gene_list_prior)

Reactome Prior (emmaa.priors.reactome_prior)


Return list of drugs targeting at least one gene from a list of genes

Parameters: search_terms (list of emmaa.priors.SearchTerm) – List of search terms for genes
Returns: drug_terms – List of search terms of drugs targeting at least one of the input genes
Return type: list of emmaa.priors.SearchTerm

Get all genes contained in a given pathway

Parameters: reactome_id (str) – Reactome ID for a pathway
Returns: genes – List of UniProt IDs for all unique genes contained in the input pathway
Return type: list of str

Get all IDs for Reactome pathways containing some form of an entity

Parameters: reactome_id (str) – Reactome ID for a gene
Returns: pathway_ids – List of Reactome IDs for pathways containing the input gene
Return type: list of str

Return reactome prior based on a list of genes

Parameters: gene_list (list of str) – List of HGNC symbols for genes
Returns: res – List of search terms corresponding to all genes found in any Reactome pathway containing one of the genes in the input gene list
Return type: list of emmaa.priors.SearchTerm
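
The expansion described above (input genes, to their pathways, to every gene in those pathways) can be sketched with in-memory lookups. The two dictionaries are placeholder data standing in for Reactome queries, and expand_gene_list is a hypothetical helper. Unlike the documented function, this sketch returns plain gene names, not SearchTerm objects.

```python
# Placeholder lookup tables; a real prior would query Reactome instead.
PATHWAYS_FOR_GENE = {
    'GENE1': ['pathway_A', 'pathway_B'],
    'GENE4': ['pathway_C'],
}
GENES_IN_PATHWAY = {
    'pathway_A': ['GENE1', 'GENE2'],
    'pathway_B': ['GENE1', 'GENE3'],
    'pathway_C': ['GENE4', 'GENE5'],
}


def expand_gene_list(gene_list):
    """Return all unique genes sharing a pathway with any input gene."""
    expanded = set()
    for gene in gene_list:
        # Genes with no known pathway contribute nothing.
        for pathway in PATHWAYS_FOR_GENE.get(gene, []):
            expanded.update(GENES_IN_PATHWAY[pathway])
    return sorted(expanded)


expanded = expand_gene_list(['GENE1'])
# expanded == ['GENE1', 'GENE2', 'GENE3']
```

Using a set keeps genes that appear in several pathways from being listed more than once.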

Return the Reactome Stable IDs for a given UniProt ID.


Get the UniProt ID (referenceEntity) for a given Reactome Stable ID.

Querying Prior Statements (emmaa.priors.prior_stmts)


Return all existing Statements for a given gene from the DB.

Parameters: gene (str) – The HGNC symbol of a gene to query.
Returns: A list of INDRA Statements in which the given gene is involved.
Return type: list[indra.statements.Statement]
emmaa.priors.prior_stmts.get_stmts_for_gene_list(gene_list, other_entities)[source]

Return all Statements between genes in a given list.

  • gene_list (list[str]) – A list of HGNC symbols for genes to query.
  • other_entities (list[str]) – A list of other entities to keep as part of the set of Statements.

Returns: A list of INDRA Statements between the given list of genes and other entities specified.

Return type: