Analyze model test results (emmaa.analyze_tests_results)
- class emmaa.analyze_tests_results.ModelRound(statements, date_str, paper_ids=None, paper_id_type='TRID', emmaa_statements=None)
Bases:
Round
Analyzes the results of one model update round.
- Parameters
statements (list[indra.statements.Statement]) – A list of INDRA Statements used to assemble a model.
date_str (str) – Time when ModelManager responsible for this round was created.
paper_ids (list[str]) – A list of paper IDs used to get raw statements for this round.
paper_id_type (str) – Type of paper ID used.
- stmts_by_papers
A dictionary mapping paper IDs to sets of hashes of assembled statements with evidences retrieved from these papers.
- Type
dict
- get_agent_distribution()
Return a sorted list of tuples, each containing an agent name and the number of times this agent occurred in the statements of a model.
- get_assembled_stmts_by_paper(id_type='TRID')
Get a mapping of paper IDs (TRID or PII) to assembled statements.
- get_english_statements_by_hash()
Return a dictionary mapping a statement hash to the statement's English description.
- get_papers_distribution()
Return a sorted list of tuples, each containing a paper ID and the number of unique statements extracted from that paper.
- get_statement_types()
Return a sorted list of tuples, each containing a statement type and the number of times a statement of this type occurred in a model.
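The distribution methods above all return sorted lists of (name, count) tuples. The sketch below illustrates that return shape in plain Python. It is not the emmaa implementation: the helper names, the sample data, and the choice to sort by descending count are assumptions made for illustration only.

```python
from collections import Counter


def sorted_distribution(items):
    """Count occurrences and return (item, count) tuples, most frequent first."""
    counts = Counter(items)
    # Assumed ordering: descending count, then name, so ties are deterministic.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


def papers_distribution(stmts_by_papers):
    """Return (paper_id, number of unique statement hashes) tuples, largest first."""
    pairs = ((paper_id, len(hashes)) for paper_id, hashes in stmts_by_papers.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical inputs: agent names collected from statements, and a mapping of
# paper ID -> set of statement hashes shaped like the stmts_by_papers attribute.
agent_names = ["BRAF", "MAP2K1", "BRAF", "MAPK1", "MAP2K1", "BRAF"]
stmts_by_papers = {"1001": {11, 12, 13}, "1002": {12}, "1003": {11, 14}}

agent_dist = sorted_distribution(agent_names)
paper_dist = papers_distribution(stmts_by_papers)
print(agent_dist)
print(paper_dist)
```

Because the paper mapping stores sets of hashes, a statement supported by several papers is counted once per paper but never twice for the same paper.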
- class emmaa.analyze_tests_results.ModelStatsGenerator(model_name, latest_round=None, previous_round=None, previous_json_stats=None, bucket='emmaa')
Bases:
StatsGenerator
Generates statistics for a given model update round.
- Parameters
model_name (str) – A name of a model the tests were run against.
latest_round (emmaa.analyze_tests_results.ModelRound) – An instance of a ModelRound to generate statistics for. If not given, will be generated by loading model data from s3.
previous_round (emmaa.analyze_tests_results.ModelRound) – A different instance of a ModelRound to find delta between two rounds. If not given, will be generated by loading model data from s3.
previous_json_stats (list[dict]) – A JSON-formatted dictionary containing model statistics for the previous update round.
- class emmaa.analyze_tests_results.Round(date_str)
Bases:
object
Parent class for classes analyzing one round of something (model or tests).
- Parameters
date_str (str) – Time when ModelManager responsible for this round was created.
- function_mapping
A dictionary mapping a content type (a string) to a tuple of functions needed to find the delta for that content type. The first function in the tuple gets a list of all hashes for the content type, while the second returns an English description of the content for a single hash.
- Type
dict
- find_delta_hashes(other_round, content_type, **kwargs)
Return a dictionary of changed hashes of a given content type. This method uses the self.function_mapping dictionary.
- Parameters
other_round (emmaa.analyze_tests_results.TestRound) – A different instance of a TestRound
content_type (str) – The type of content to find the delta for. Accepted values: statements, applied_tests, passed_tests, paths.
**kwargs (dict) – For some content types, additional arguments must be provided, such as mc_type.
- Returns
hashes – A dictionary containing lists of added and removed hashes of a given content type between two test rounds.
- Return type
dict
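The delta lookup described for find_delta_hashes can be sketched with set differences. This is a minimal illustration under stated assumptions, not the emmaa code: SketchRound is a hypothetical stand-in class, the "added" and "removed" key names are assumed, and function_mapping here holds bound methods for a single content type.

```python
class SketchRound:
    """Hypothetical stand-in for a Round; not the emmaa class."""

    def __init__(self, content):
        # content: {content_type: {hash: English description}}
        self.content = content
        # Each content type maps to (get all hashes, describe one hash).
        self.function_mapping = {
            "statements": (self.get_stmt_hashes, self.get_english_stmt),
        }

    def get_stmt_hashes(self):
        return list(self.content["statements"])

    def get_english_stmt(self, stmt_hash):
        return self.content["statements"][stmt_hash]

    def find_delta_hashes(self, other_round, content_type):
        """Return hashes added to and removed from this round relative to other_round."""
        latest = set(self.function_mapping[content_type][0]())
        previous = set(other_round.function_mapping[content_type][0]())
        return {
            "added": sorted(latest - previous),
            "removed": sorted(previous - latest),
        }


# Hypothetical rounds with made-up hashes and descriptions.
previous_round = SketchRound({"statements": {101: "A activates B.", 102: "B inhibits C."}})
latest_round = SketchRound({"statements": {102: "B inhibits C.", 103: "C phosphorylates D."}})

delta = latest_round.find_delta_hashes(previous_round, "statements")
describe = latest_round.function_mapping["statements"][1]
added_english = [describe(h) for h in delta["added"]]
print(delta)
print(added_english)
```

Keeping the hash getter and the English describer together per content type lets one generic delta routine serve every content type; only the mapping entries differ.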
- class emmaa.analyze_tests_results.StatsGenerator(model_name, latest_round=None, previous_round=None, previous_json_stats=None, bucket='emmaa')
Bases:
object
Parent class for classes generating statistics for a given round of tests or model update.
- Parameters
model_name (str) – A name of a model the tests were run against.
latest_round (ModelRound or TestRound or None) – An instance of a ModelRound or TestRound to generate statistics for. If not given, will be generated by loading json from s3.
previous_round (ModelRound or TestRound or None) – A different instance of a ModelRound or TestRound to find delta between two rounds. If not given, will be generated by loading json from s3.
previous_json_stats (dict) – A JSON-formatted dictionary containing model or test statistics for the previous round.
- class emmaa.analyze_tests_results.TestRound(json_results, date_str)
Bases:
Round
Analyzes the results of one test round.
- Parameters
json_results (list[dict]) – A list of JSON formatted dictionaries to store information about the test results. The first dictionary contains information about the model. Each consecutive dictionary contains information about a single test applied to the model and test results.
date_str (str) – Time when ModelManager responsible for this round was created.
- mc_types_results
A dictionary mapping a type of ModelChecker to a list of test results generated by this ModelChecker.
- Type
dict
- english_test_results
A dictionary mapping a test hash to a list containing the test's English description, its result in Pass/Fail/n_a form, and either a path if one was found or a result code if not.
- Type
dict
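To make the shape of english_test_results concrete, the sketch below builds a small dictionary in that form and summarizes it. The hashes, descriptions, path string, and result-code strings are all made-up placeholders, and the two helper functions are hypothetical; only the three-element layout of each entry follows the description above.

```python
from collections import Counter

# Hypothetical data: test hash -> [English description, result, path or result code].
english_test_results = {
    2001: ["A activates B.", "Pass", "A -> X -> B"],
    2002: ["B inhibits C.", "Fail", "PLACEHOLDER_RESULT_CODE"],
    2003: ["C binds D.", "n_a", "PLACEHOLDER_RESULT_CODE"],
}


def summarize_results(results):
    """Count how many tests ended in each result category."""
    return Counter(entry[1] for entry in results.values())


def passed_hashes(results):
    """Return the sorted hashes of tests whose result is 'Pass'."""
    return sorted(h for h, entry in results.items() if entry[1] == "Pass")


summary = summarize_results(english_test_results)
passed = passed_hashes(english_test_results)
print(dict(summary))
print(passed)
```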
- class emmaa.analyze_tests_results.TestStatsGenerator(model_name, test_corpus_str='large_corpus_tests', latest_round=None, previous_round=None, previous_json_stats=None, bucket='emmaa')
Bases:
StatsGenerator
Generates statistics for a given test round.
- Parameters
model_name (str) – A name of a model the tests were run against.
test_corpus_str (str) – A name of a test corpus the model was tested against.
latest_round (emmaa.analyze_tests_results.TestRound) – An instance of a TestRound to generate statistics for. If not given, will be generated by loading test results from s3.
previous_round (emmaa.analyze_tests_results.TestRound) – A different instance of a TestRound to find delta between two rounds. If not given, will be generated by loading test results from s3.
previous_json_stats (list[dict]) – A JSON-formatted dictionary containing test statistics for the previous test round.