Priors (emmaa.priors)
This module contains classes to generate prior networks.
- class emmaa.priors.SearchTerm(type, name, db_refs, search_term)
Bases: object
Represents a search term to be used in a model configuration.
- Parameters
type (str) – The type of search term, e.g. gene, bioprocess, other
name (str) – The name of the search term; equivalent to an Agent name
db_refs (dict) – A dict of database references for the given term; similar to an Agent db_refs dict
search_term (str) – The actual search term to use for searching PubMed
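As an illustration of the four fields, the sketch below uses a stand-in dataclass that mirrors the documented constructor signature. It is not the emmaa class itself (SearchTermSketch is a hypothetical name), and the gene and search string are example values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SearchTermSketch:
    """Stand-in mirroring the documented SearchTerm(type, name, db_refs, search_term)."""
    type: str
    name: str
    db_refs: dict = field(default_factory=dict)
    search_term: str = ""


# Example values for illustration only: a gene term for BRAF (HGNC:1097).
braf_term = SearchTermSketch(
    type="gene",
    name="BRAF",
    db_refs={"HGNC": "1097"},
    search_term='"BRAF"',
)
print(braf_term.name, braf_term.db_refs)
```

With emmaa installed, the same four arguments would presumably be passed to emmaa.priors.SearchTerm in the order given by the signature above.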
- emmaa.priors.get_drugs_for_gene(stmts, hgnc_id)
Get a list of drugs that target a gene.
- Parameters
stmts (list of indra.statements.Statement) – List of INDRA statements with a drug as subject
hgnc_id (str) – HGNC id for a gene
- Returns
drugs_for_gene – List of search terms for drugs targeting the input gene
- Return type
list of emmaa.priors.SearchTerm
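The filtering described above can be sketched roughly as follows. This is an assumption about the general approach, not the emmaa implementation: Agent and Stmt are minimal hypothetical stand-ins for INDRA objects, the drug names are placeholders, and the sketch returns plain names where the real function returns SearchTerm objects.

```python
from collections import namedtuple

# Minimal stand-ins for INDRA Agents and Statements (hypothetical, for illustration).
Agent = namedtuple("Agent", ["name", "db_refs"])
Stmt = namedtuple("Stmt", ["subj", "obj"])


def drugs_for_gene_sketch(stmts, hgnc_id):
    """Return de-duplicated drug names whose statement object has the given HGNC id."""
    drugs = []
    seen = set()
    for stmt in stmts:
        # Skip statements whose target is a different gene.
        if stmt.obj.db_refs.get("HGNC") != hgnc_id:
            continue
        if stmt.subj.name not in seen:
            seen.add(stmt.subj.name)
            drugs.append(stmt.subj.name)
    return drugs


# Placeholder drugs; the gene ids are HGNC:1097 (BRAF) and HGNC:6407 (KRAS).
example_stmts = [
    Stmt(Agent("drug_A", {}), Agent("BRAF", {"HGNC": "1097"})),
    Stmt(Agent("drug_B", {}), Agent("KRAS", {"HGNC": "6407"})),
    Stmt(Agent("drug_C", {}), Agent("BRAF", {"HGNC": "1097"})),
    Stmt(Agent("drug_A", {}), Agent("BRAF", {"HGNC": "1097"})),
]
print(drugs_for_gene_sketch(example_stmts, "1097"))  # ['drug_A', 'drug_C']
```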
Literature Prior (emmaa.priors.literature_prior)
This module implements the LiteraturePrior class which automates some of the steps involved in starting a model around a set of literature searches. Example:
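    from emmaa.priors.literature_prior import LiteraturePrior

    # Set up a prior around a single literature search string
    lp = LiteraturePrior('some_disease', 'Some Disease',
                         'This is a self-updating model of Some Disease',
                         search_strings=['some disease'],
                         assembly_config_template='nf')
    # Get the statements for the prior
    estmts = lp.get_statements()
    # Make the model from the statements and upload it to S3
    model = lp.make_model(estmts, upload_to_s3=True)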
- emmaa.priors.literature_prior.get_raw_statements_for_pmids(pmids, mode='all', batch_size=100)
Return EmmaaStatements based on extractions from given PMIDs.
- Parameters
pmids (set or list of str) – A set of PMIDs to find raw INDRA Statements for in the INDRA DB.
mode ('all' or 'distilled') – The ‘distilled’ mode makes sure that the “best”, non-redundant set of raw statements is found across potentially redundant text contents and reader versions. The ‘all’ mode doesn’t do such distillation but is significantly faster.
batch_size (Optional[int]) – Determines how many PMIDs to fetch statements for in each iteration. Default: 100.
- Returns
A dict keyed by PMID with values INDRA Statements obtained from the given PMID.
- Return type
dict
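The role of batch_size can be illustrated with a small batching sketch. It only shows how a set of PMIDs might be split into chunks and the per-batch results merged into one dict keyed by PMID. The names iter_batches, collect_by_pmid and fake_fetch are hypothetical, and the fetcher stands in for the actual INDRA DB query, which is not shown here.

```python
def iter_batches(items, batch_size=100):
    """Yield consecutive slices of at most batch_size items."""
    items = sorted(items)  # Accept sets as well as lists; sort for a stable order.
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def collect_by_pmid(pmids, fetch_batch, batch_size=100):
    """Merge per-batch results into one dict keyed by PMID.

    fetch_batch is a hypothetical callable standing in for the database query.
    """
    results = {}
    for batch in iter_batches(pmids, batch_size):
        results.update(fetch_batch(batch))
    return results


# Toy fetcher that records batch sizes and returns an empty list per PMID.
batch_sizes = []


def fake_fetch(batch):
    batch_sizes.append(len(batch))
    return {pmid: [] for pmid in batch}


stmts_by_pmid = collect_by_pmid({str(i) for i in range(250)}, fake_fetch)
print(batch_sizes)  # [100, 100, 50]
```

A smaller batch_size means more round trips but smaller individual requests; the default of 100 is the value given in the signature above.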
TCGA Cancer Prior (emmaa.priors.cancer_prior)
- class emmaa.priors.cancer_prior.TcgaCancerPrior(tcga_study_prefix, sif_prior, diffusion_service=None, mutation_cache=None)
Bases: object
Prior network generation using TCGA mutations for a given cancer type.
This class implements building a prior network using a generic underlying prior, and TCGA data for a specific cancer type. Mutations for the given cancer type are extracted from TCGA studies and heat diffusion from the corresponding nodes in the prior is used to identify a set of relevant nodes.
- get_relevant_nodes(pct_heat_threshold)
Return a list of the relevant nodes in the prior.
Heat diffusion is applied to the prior network based on initial heat on nodes that are mutated according to patient statistics.
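To make the idea concrete, here is a minimal, dependency-free sketch of heat diffusion on a small undirected graph followed by a percentile cutoff. It is not the class's implementation (which, judging by the diffusion_service argument, may delegate diffusion to an external service). The function names, the explicit Euler scheme, the diffusion time and the nearest-rank reading of pct_heat_threshold are all assumptions made for illustration.

```python
def diffuse_heat(adjacency, initial_heat, time=0.1, steps=100):
    """Approximate h(t) = exp(-t L) h0 with explicit Euler steps on the graph Laplacian L."""
    heat = {node: float(initial_heat.get(node, 0.0)) for node in adjacency}
    dt = time / steps
    for _ in range(steps):
        new_heat = {}
        for node, neighbors in adjacency.items():
            # (L h)_i = deg(i) * h_i - sum of the neighbors' heat
            flow = len(neighbors) * heat[node] - sum(heat[n] for n in neighbors)
            new_heat[node] = heat[node] - dt * flow
        heat = new_heat
    return heat


def relevant_nodes(heat, pct_heat_threshold):
    """Keep nodes whose heat is at or above the given percentile (nearest rank)."""
    ranked = sorted(heat.values())
    idx = min(len(ranked) - 1, int(len(ranked) * pct_heat_threshold / 100))
    cutoff = ranked[idx]
    return sorted(node for node, h in heat.items() if h >= cutoff)


# Toy prior: a path A - B - C - D - E with all initial heat on the "mutated" node A.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
heat = diffuse_heat(graph, {"A": 1.0}, time=1.0, steps=100)
print(relevant_nodes(heat, 60))  # ['A', 'B']
```

In this toy graph the heat decays with distance from the mutated node, so raising the percentile threshold keeps only the nodes closest to it.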
Gene List Prior (emmaa.priors.gene_list_prior)
Reactome Prior (emmaa.priors.reactome_prior)
- emmaa.priors.reactome_prior.find_drugs_for_genes(search_terms, drug_gene_stmts=None)
Return a list of drugs targeting at least one gene from a list of genes.
- Parameters
search_terms (list of emmaa.priors.SearchTerm) – List of search terms for genes
- Returns
drug_terms – List of search terms of drugs targeting at least one of the input genes
- Return type
list of emmaa.priors.SearchTerm
- emmaa.priors.reactome_prior.get_genes_contained_in_pathway(reactome_id)
Get all genes contained in a given pathway.
- emmaa.priors.reactome_prior.get_pathways_containing_gene(reactome_id)
Get all ids for Reactome pathways containing some form of an entity.
- emmaa.priors.reactome_prior.make_prior_from_genes(gene_list)
Return a Reactome prior based on a list of genes.
- Parameters
- Returns
res – List of search terms corresponding to all genes found in any reactome pathway containing one of the genes in the input gene list
- Return type
list of emmaa.priors.SearchTerm
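The expansion described under Returns amounts to a union over pathways. The sketch below illustrates it with made-up lookup tables in place of real Reactome queries. The gene names, pathway ids and the function prior_genes_sketch are hypothetical, and the sketch returns gene names rather than SearchTerm objects.

```python
# Hypothetical lookups standing in for Reactome queries (toy data, not real pathways).
PATHWAYS_BY_GENE = {
    "GENE_A": ["pathway_1", "pathway_2"],
    "GENE_B": ["pathway_2"],
    "GENE_C": ["pathway_3"],
}
GENES_BY_PATHWAY = {
    "pathway_1": ["GENE_A", "GENE_D"],
    "pathway_2": ["GENE_A", "GENE_B", "GENE_E"],
    "pathway_3": ["GENE_C", "GENE_F"],
}


def prior_genes_sketch(gene_list):
    """Return all genes found in any pathway containing one of the input genes."""
    found = set()
    for gene in gene_list:
        for pathway in PATHWAYS_BY_GENE.get(gene, []):
            found.update(GENES_BY_PATHWAY.get(pathway, []))
    return sorted(found)


print(prior_genes_sketch(["GENE_A"]))  # ['GENE_A', 'GENE_B', 'GENE_D', 'GENE_E']
```

Note that the result includes the input genes themselves whenever they belong to at least one pathway, and that genes absent from every pathway contribute nothing.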
Querying Prior Statements (emmaa.priors.prior_stmts)
- emmaa.priors.prior_stmts.get_stmts_for_gene(gene, max_stmts=100000)
Return all existing Statements for a given gene from the DB.