Priors (emmaa.priors)
This module contains classes to generate prior networks.
class emmaa.priors.SearchTerm(type, name, db_refs, search_term)
Bases: object
Represents a search term to be used in a model configuration.
Parameters:
- type (str) – The type of search term, e.g., gene, bioprocess, or other.
- name (str) – The name of the search term, equivalent to an Agent name.
- db_refs (dict) – A dict of database references for the given term, similar to an Agent db_refs dict.
- search_term (str) – The actual search term to use for searching PubMed.
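To illustrate the four constructor arguments, the sketch below uses a stand-in dataclass rather than the real emmaa.priors.SearchTerm. The class name SearchTermSketch and the example values are hypothetical and only show the expected shape of the data.

```python
from dataclasses import dataclass, field


@dataclass
class SearchTermSketch:
    """Hypothetical stand-in mirroring the documented constructor arguments."""
    type: str
    name: str
    db_refs: dict = field(default_factory=dict)
    search_term: str = ""


# Illustrative values only: a gene-type term whose name matches an Agent name.
term = SearchTermSketch(
    type="gene",
    name="BRAF",
    db_refs={"HGNC": "1097"},
    search_term='"BRAF"',
)
print(term.type, term.name, term.db_refs, term.search_term)
```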
emmaa.priors.get_drugs_for_gene(stmts, hgnc_id)
Get list of drugs that target a gene.
Parameters:
- stmts (list of indra.statements.Statement) – List of INDRA statements with a drug as subject.
- hgnc_id (str) – HGNC id for a gene.
Returns: drugs_for_gene – List of search terms for drugs targeting the input gene.
Return type: list of emmaa.priors.SearchTerm
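The filtering idea behind get_drugs_for_gene can be sketched without INDRA installed. The Agent and Stmt tuples below are simplified, hypothetical stand-ins for INDRA objects, and drugs_for_gene_sketch is an illustrative name, not the library function; the real implementation may differ.

```python
from collections import namedtuple

# Simplified stand-ins (assumptions): a statement has a drug subject and a
# target object, and each agent carries a db_refs dict.
Agent = namedtuple("Agent", ["name", "db_refs"])
Stmt = namedtuple("Stmt", ["subj", "obj"])


def drugs_for_gene_sketch(stmts, hgnc_id):
    """Return the unique drug agents whose target carries the given HGNC id."""
    drugs = {}
    for stmt in stmts:
        if stmt.obj.db_refs.get("HGNC") == hgnc_id:
            # Deduplicate by drug name, keeping the first occurrence.
            drugs.setdefault(stmt.subj.name, stmt.subj)
    return list(drugs.values())


# Placeholder names and ids, not real database entries.
gene_x = Agent("GENE_X", {"HGNC": "1"})
gene_y = Agent("GENE_Y", {"HGNC": "2"})
stmts = [
    Stmt(Agent("drug_a", {}), gene_x),
    Stmt(Agent("drug_b", {}), gene_y),
    Stmt(Agent("drug_a", {}), gene_x),
    Stmt(Agent("drug_c", {}), gene_x),
]
drug_names = [drug.name for drug in drugs_for_gene_sketch(stmts, "1")]
print(drug_names)
```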
Literature Prior (emmaa.priors.literature_prior)
This module implements the LiteraturePrior class, which automates some of the steps involved in starting a model around a set of literature searches. Example:
    from emmaa.priors.literature_prior import LiteraturePrior

    lp = LiteraturePrior('some_disease', 'Some Disease',
                         'This is a self-updating model of Some Disease',
                         search_strings=['some disease'],
                         assembly_config_template='nf')
    estmts = lp.get_statements()
    model = lp.make_model(estmts, upload_to_s3=True)
emmaa.priors.literature_prior.get_raw_statements_for_pmids(pmids, mode='all', batch_size=100)
Return EmmaaStatements based on extractions from given PMIDs.
Parameters:
- pmids (set or list of str) – A set of PMIDs to find raw INDRA Statements for in the INDRA DB.
- mode ('all' or 'distilled') – The ‘distilled’ mode makes sure that the “best”, non-redundant set of raw statements is found across potentially redundant text contents and reader versions. The ‘all’ mode does no such distillation but is significantly faster.
- batch_size (Optional[int]) – Determines how many PMIDs to fetch statements for in each iteration. Default: 100.
Returns: A dict keyed by PMID whose values are the INDRA Statements obtained from the given PMID.
Return type: dict
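The role of batch_size can be sketched as follows. This is a minimal illustration of batched fetching into a dict keyed by PMID, not the library's implementation; iter_batches, collect_statements, and fake_fetch are hypothetical names, and fake_fetch stands in for the actual INDRA DB query.

```python
def iter_batches(pmids, batch_size=100):
    """Yield successive lists of at most batch_size PMIDs."""
    ordered = sorted(pmids)  # accepts a set or a list; sorted for determinism
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


def collect_statements(pmids, fetch_batch, batch_size=100):
    """Merge per-batch results into one dict keyed by PMID."""
    results = {}
    for batch in iter_batches(pmids, batch_size):
        results.update(fetch_batch(batch))
    return results


def fake_fetch(batch):
    # Hypothetical stand-in for a database query; returns no real statements.
    return {pmid: [] for pmid in batch}


pmids = {str(n) for n in range(250)}
batch_sizes = [len(batch) for batch in iter_batches(pmids)]
stmts_by_pmid = collect_statements(pmids, fake_fetch)
print(batch_sizes, len(stmts_by_pmid))
```

With 250 PMIDs and the default batch size, the fetch function is called three times.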
emmaa.priors.literature_prior.make_search_terms(search_strings, mesh_ids)
Return EMMAA SearchTerms based on search strings and MeSH IDs.
Parameters:
- search_strings (list of str) – A list of search strings, e.g., “diabetes”, to find papers in the literature.
- mesh_ids (list of str) – A list of MeSH IDs that are used to search the literature as headings associated with papers.
Returns: A list of EMMAA SearchTerm objects constructed from the search strings and the MeSH IDs.
Return type: list of emmaa.priors.SearchTerm
TCGA Cancer Prior (emmaa.priors.cancer_prior)
class emmaa.priors.cancer_prior.TcgaCancerPrior(tcga_study_prefix, sif_prior, diffusion_service=None, mutation_cache=None)
Bases: object
Prior network generation using TCGA mutations for a given cancer type.
This class implements building a prior network using a generic underlying prior, and TCGA data for a specific cancer type. Mutations for the given cancer type are extracted from TCGA studies and heat diffusion from the corresponding nodes in the prior is used to identify a set of relevant nodes.
get_relevant_nodes(pct_heat_threshold)
Return a list of the relevant nodes in the prior.
Heat diffusion is applied to the prior network based on initial heat on nodes that are mutated according to patient statistics.
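The general idea of diffusing heat from mutated nodes and then thresholding can be sketched in plain Python. This is a simplified neighbor-sharing iteration, not the heat kernel or diffusion service the class may use, and reading pct_heat_threshold as a percentile cutoff is an assumption. The function names, the toy graph, and the alpha and steps parameters are all hypothetical.

```python
def diffuse(neighbors, initial_heat, alpha=0.5, steps=2):
    """Spread heat over an undirected graph given as an adjacency dict.

    At each step a node keeps (1 - alpha) of its heat and splits the rest
    equally among its neighbors. Assumes every node has at least one neighbor.
    """
    heat = {node: initial_heat.get(node, 0.0) for node in neighbors}
    for _ in range(steps):
        new_heat = {node: (1 - alpha) * h for node, h in heat.items()}
        for node, h in heat.items():
            share = alpha * h / len(neighbors[node])
            for nbr in neighbors[node]:
                new_heat[nbr] += share
        heat = new_heat
    return heat


def relevant_nodes(heat, pct_heat_threshold):
    """Keep nodes whose heat is at or above the given percentile (assumed)."""
    ranked = sorted(heat.values())
    idx = min(int(len(ranked) * pct_heat_threshold / 100), len(ranked) - 1)
    cutoff = ranked[idx]
    return sorted(node for node, h in heat.items() if h >= cutoff)


# Toy path graph A - B - C - D - E with all initial heat on the "mutated" node A.
neighbors = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
heat = diffuse(neighbors, {"A": 1.0})
relevant = relevant_nodes(heat, 60)
print(heat, relevant)
```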
load_sif_prior(fname, e50=20)
Return a Graph based on a SIF file describing a prior.
Parameters:
Gene List Prior (emmaa.priors.gene_list_prior)
Reactome Prior (emmaa.priors.reactome_prior)
emmaa.priors.reactome_prior.find_drugs_for_genes(search_terms, drug_gene_stmts=None)
Return list of drugs targeting at least one gene from a list of genes.
Parameters: search_terms (list of emmaa.priors.SearchTerm) – List of search terms for genes.
Returns: drug_terms – List of search terms of drugs targeting at least one of the input genes.
Return type: list of emmaa.priors.SearchTerm
emmaa.priors.reactome_prior.get_genes_contained_in_pathway
Get all genes contained in a given pathway.
Parameters: reactome_id (str) – Reactome id for a pathway.
Returns: genes – List of UniProt ids for all unique genes contained in the input pathway.
Return type: list of str
emmaa.priors.reactome_prior.get_pathways_containing_gene
Get all ids for Reactome pathways containing some form of an entity.
Parameters: reactome_id (str) – Reactome id for a gene.
Returns: pathway_ids – List of Reactome ids for pathways containing the input gene.
Return type: list of str
emmaa.priors.reactome_prior.make_prior_from_genes(gene_list)
Return a Reactome prior based on a list of genes.
Parameters: gene_list (list of str) – List of HGNC symbols for genes.
Returns: res – List of search terms corresponding to all genes found in any Reactome pathway containing one of the genes in the input gene list.
Return type: list of emmaa.priors.SearchTerm
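The expansion described above (input genes, then the pathways containing them, then every gene in those pathways) can be sketched with in-memory lookups. The two dictionaries below are hypothetical placeholders for the Reactome queries, and prior_genes_sketch is an illustrative name rather than the library function.

```python
def prior_genes_sketch(gene_list, pathways_by_gene, genes_by_pathway):
    """Return all genes found in any pathway containing an input gene."""
    pathway_ids = set()
    for gene in gene_list:
        pathway_ids.update(pathways_by_gene.get(gene, []))
    genes = set()
    for pathway_id in pathway_ids:
        genes.update(genes_by_pathway.get(pathway_id, []))
    return sorted(genes)


# Placeholder gene and pathway identifiers, not real Reactome data.
pathways_by_gene = {
    "GENE1": ["P1", "P2"],
    "GENE2": ["P2"],
}
genes_by_pathway = {
    "P1": ["GENE1", "GENE3"],
    "P2": ["GENE1", "GENE2", "GENE4"],
    "P3": ["GENE5"],
}
expanded = prior_genes_sketch(["GENE1"], pathways_by_gene, genes_by_pathway)
print(expanded)
```

In the documented function, each resulting gene would then be wrapped in a SearchTerm; that step is omitted here.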
Querying Prior Statements (emmaa.priors.prior_stmts)
emmaa.priors.prior_stmts.get_stmts_for_gene(gene)
Return all existing Statements for a given gene from the DB.
Parameters: gene (str) – The HGNC symbol of a gene to query.
Returns: A list of INDRA Statements in which the given gene is involved.
Return type: list[indra.statements.Statement]