EMMAA’s Database (emmaa.db)

The Database Schema (emmaa.db.schema)

class emmaa.db.schema.Query(**kwargs)[source]

Bases: Base, QueriesDbTable

Queries run on each model: Query(_hash_, model_id, json, qtype)

The hash column is a hash generated from the json and model_id columns that can be derived from the

Parameters
  • hash (big-int) – (primary key) A 32 bit integer generated from the json and model_id.

  • model_id (str) – (20 character) The short id/acronym for the given model.

  • json (json) – A json dict containing the relevant parameters defining the query.
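
The hashing algorithm itself is not spelled out in this section. As a minimal sketch, assuming a key-sorted JSON serialization and a SHA-256 digest truncated to 32 bits (both assumptions, not necessarily what EMMAA does), a deterministic key could be derived from the json and model_id values as follows. The helper name `sketch_query_hash` and the example query contents are hypothetical.

```python
import hashlib
import json


def sketch_query_hash(query_json, model_id):
    """Derive a deterministic 32-bit integer from a query dict and model id.

    Hypothetical helper for illustration only; the scheme EMMAA actually
    uses may differ.
    """
    # Sorting keys makes the serialization independent of dict ordering.
    payload = json.dumps(query_json, sort_keys=True) + model_id
    digest = hashlib.sha256(payload.encode("utf-8")).digest()
    # Keep only the first 4 bytes so the value fits in 32 bits.
    return int.from_bytes(digest[:4], "big")


# Made-up query contents; only the structure matters here.
query = {"type": "example_query", "target": "X"}
reordered = {"target": "X", "type": "example_query"}

h1 = sketch_query_hash(query, "demo_model")
h2 = sketch_query_hash(reordered, "demo_model")
```

Because the key is a pure function of the query contents and the model id, the same query submitted twice maps to the same row regardless of the key order in the dict.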

class emmaa.db.schema.Result(**kwargs)[source]

Bases: Base, QueriesDbTable

Results of queries to models:

Result(_id_, query_hash, date, result_json, mc_type, all_result_hashes, delta)

Parameters
  • id (int) – (auto, primary key) A database-assigned integer id.

  • query_hash (big-int) – (foreign key -> Query.hash) The hash of the query json, which can be directly generated.

  • date (datetime) – (auto) The date the result was entered into the database.

  • result_json (json) – A json dict containing the results for the query.

  • mc_type (str) – A name of a ModelChecker used to answer the query.

class emmaa.db.schema.Statement(**kwargs)[source]

Bases: Base, StatementsDbTable

Statements in the model:

Statement(_id_, model_id, date, statement_json)

Parameters
  • id (int) – (auto, primary key) A database-assigned integer id.

  • model_id (str) – (20 character) The short id/acronym for the given model.

  • stmt_hash (big-int) – The hash of the statement.

  • date (str) – The date when the model was assembled.

  • statement_json (json) –

class emmaa.db.schema.User(**kwargs)[source]

Bases: Base, QueriesDbTable

A table containing users of EMMAA: User(_id_, email)

Parameters
  • id (int) – (from indralab_auth_tools.src.models.User.id, primary key) A database-generated integer from the User table in indralab auth tools.

  • email (str) – The email of the user (must be unique)

class emmaa.db.schema.UserModel(**kwargs)[source]

Bases: Base, QueriesDbTable

A table linking users to models:

UserModel(_id_, user_id, model_id, date, subscription)

Parameters
  • id (int) – (auto, primary key) A database-assigned integer id.

  • user_id (int) – (foreign key -> User.id) The id of the user linked to this model.

  • model_id (str) – (20 character) The short id/acronym for the given model.

  • date (datetime) – (auto) The date that this entry was added to the database.

  • subscription (bool) – Record whether the user has subscribed to see results of this model.

class emmaa.db.schema.UserQuery(**kwargs)[source]

Bases: Base, QueriesDbTable

A table linking users to queries:

UserQuery(_id_, user_id, query_hash, date, subscription, count)

Parameters
  • id (int) – (auto, primary key) A database-assigned integer id.

  • user_id (int) – (foreign key -> User.id) The id of the user related to this query.

  • query_hash (big-int) – (foreign key -> Query.hash) The hash of the query json, which can be directly generated.

  • date (datetime) – (auto) The date that this entry was added to the database.

  • subscription (bool) – Record whether the user has subscribed to see results of this query.

  • count (int) – The number of times the user identified by user_id has run this query.
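
The relationships between the User, Query, and UserQuery tables can be illustrated with a small in-memory SQLite database. This is only a sketch of the foreign-key structure described above: the table names (`users`, `queries`, `user_queries`), the column types, and the example values are assumptions made for illustration and are not the definitions in emmaa.db.schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL
);
CREATE TABLE queries (
    hash INTEGER PRIMARY KEY,
    model_id TEXT NOT NULL,
    json TEXT NOT NULL,
    qtype TEXT
);
CREATE TABLE user_queries (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id),
    query_hash INTEGER NOT NULL REFERENCES queries(hash),
    date TEXT DEFAULT CURRENT_TIMESTAMP,
    subscription INTEGER NOT NULL,
    "count" INTEGER NOT NULL DEFAULT 0
);
""")

conn.execute("INSERT INTO users (id, email) VALUES (1, 'user@example.com')")
conn.execute(
    "INSERT INTO queries (hash, model_id, json, qtype) VALUES (?, ?, ?, ?)",
    (12345, "demo_model", "{}", "example_type"),
)
conn.execute(
    'INSERT INTO user_queries (user_id, query_hash, subscription, "count") '
    "VALUES (1, 12345, 1, 1)"
)

# A link to a query hash that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO user_queries (user_id, query_hash, subscription) "
        "VALUES (1, 99999, 1)"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

# Running the same query again only bumps the counter on the link row.
conn.execute(
    'UPDATE user_queries SET "count" = "count" + 1 '
    "WHERE user_id = 1 AND query_hash = 12345"
)
run_count = conn.execute(
    'SELECT "count" FROM user_queries WHERE user_id = 1 AND query_hash = 12345'
).fetchone()[0]
```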

Database Manager (emmaa.db.manager)

exception emmaa.db.manager.EmmaaDatabaseError[source]

Bases: Exception

class emmaa.db.manager.EmmaaDatabaseManager(host, label=None)[source]

Bases: object

A parent class used to manage sessions with EMMAA’s databases.

create_tables(tables=None)[source]

Create the tables for the EMMAA database.

Optionally specify tables to be created. List may contain either table objects or the string names of the tables.

drop_tables(tables=None, force=False)[source]

Drop the tables from the EMMAA database given in tables.

If tables is None, all tables will be dropped. Note that if force is False, a warning prompt will ask for confirmation, as this action removes all data from the affected tables.
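
The force flag follows a common confirm-before-destroy pattern. The sketch below shows one way such a guard can be written. `confirm_drop` is a hypothetical helper, not part of emmaa.db.manager, and the prompt wording is invented.

```python
def confirm_drop(table_names, force=False, ask=input):
    """Return True if the named tables may be dropped.

    Hypothetical guard for illustration. `ask` is injectable so the prompt
    can be replaced in tests; it defaults to the built-in input().
    """
    if force:
        # Skip the prompt entirely when the caller forces the drop.
        return True
    answer = ask(
        f"Drop tables {', '.join(table_names)}? All data will be lost (y/N): "
    )
    return answer.strip().lower() in ("y", "yes")


forced = confirm_drop(["result"], force=True)
declined = confirm_drop(["result"], ask=lambda prompt: "n")
accepted = confirm_drop(["result"], ask=lambda prompt: "yes")
```

Treating anything other than an explicit "y" or "yes" as a refusal keeps the destructive path opt-in.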

class emmaa.db.manager.QueryDatabaseManager(host, label=None)[source]

Bases: EmmaaDatabaseManager

A class used to manage sessions with EMMAA’s query database.

add_user(user_id, email)[source]

Add a new user’s email and id to EMMAA’s User table.

get_all_result_hashes(qhash, mc_type)[source]

Get a set of all result hashes for a given query and mc_type.

get_model_users(model_id)[source]

Get all users who are subscribed to a given model.

Parameters

model_id (str) – A standard name of a model to get users for.

Returns

A list of email addresses corresponding to all users who are subscribed to this model.

Return type

list[str]

get_queries(model_id)[source]

Get queries that refer to the given model_id.

Parameters

model_id (str) – The short, standard model ID.

Returns

queries – A list of queries retrieved from the database.

Return type

list[emmaa.queries.Query]

get_results(user_email, latest_order=1, query_type=None)[source]

Get the results for which the user has registered.

Parameters
  • user_email (str) – The email of a user.

  • latest_order (int) – Which result in the order from the latest to get. Default: 1 (latest).

  • query_type (str) – Filter results to specific query type. Default: None (all query types will be returned).

Returns

results – A list of tuples, each of the form: (model_id, query, mc_type, result_json, delta, date) representing the result of a query run on a model on a given date.

Return type

list[tuple]
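
To make the latest_order parameter concrete, the sketch below picks the n-th most recent result from a list of tuples shaped like the return value above. It assumes that "order from the latest" is counted separately for each (model_id, query, mc_type) combination, which this section does not state explicitly. `pick_nth_latest` is a hypothetical helper, and plain strings stand in for query objects.

```python
from collections import defaultdict
from datetime import datetime


def pick_nth_latest(rows, latest_order=1):
    """Pick the n-th most recent row per (model_id, query, mc_type).

    Each row is (model_id, query, mc_type, result_json, delta, date).
    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row[0], row[1], row[2])].append(row)
    picked = []
    for group in grouped.values():
        # Newest first, so index 0 is the latest result.
        group.sort(key=lambda r: r[5], reverse=True)
        if len(group) >= latest_order:
            picked.append(group[latest_order - 1])
    return picked


rows = [
    ("demo_model", "q1", "pysb", {"path": "old"}, [], datetime(2024, 1, 1)),
    ("demo_model", "q1", "pysb", {"path": "new"}, [], datetime(2024, 1, 2)),
    ("demo_model", "q2", "pysb", {"path": "only"}, [], datetime(2024, 1, 2)),
]

latest = pick_nth_latest(rows)                     # latest_order=1
previous = pick_nth_latest(rows, latest_order=2)   # one step back
```

Under this reading, a query with a single stored result contributes nothing when latest_order is 2.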

get_subscribed_queries(email)[source]

Get a list of (query object, model id, query hash) for a user

Parameters

email (str) – The email address to check subscribed queries for

Return type

list(tuple(emmaa.queries.Query, str, query_hash))

get_subscribed_users()[source]

Get all users who have subscriptions.

Returns

A list of email addresses corresponding to all users who have any subscribed query.

Return type

list[str]

get_user_models(email)[source]

Get all models a user is subscribed to.

put_queries(user_email, user_id, query, model_ids, subscribe=True)[source]

Add queries to the database for a given user.

Parameters
  • user_email (str) – the email of the user that entered the queries.

  • user_id (int) – the user id of the user that entered the queries. Corresponds to the user id in the User table in indralab_auth_tools

  • query (emmaa.queries.Query) – A query object containing all necessary information.

  • model_ids (list[str]) – A list of the short, standard model IDs to which the user wishes to apply these queries.

  • subscribe (bool) – True if the user wishes to subscribe to this query.

put_results(model_id, query_results)[source]

Add new results for a set of queries tested on a model_id.

Parameters
  • model_id (str) – The short, standard model ID.

  • query_results (list of tuples) – A list of tuples of the form (query, mc_type, result_json), where query is the query object run against the model, mc_type is the model type for the result, and result_json is the json containing the corresponding result.

subscribe_to_model(user_email, user_id, model_id)[source]

Subscribe a user to model updates.

Parameters
  • user_email (str) – the email of the user that entered the queries.

  • user_id (int) – the user id of the user that entered the queries. Corresponds to the user id in the User table in indralab_auth_tools

  • model_id (str) – Standard model ID to which the user wishes to subscribe.

update_email_subscription(email, queries, models, subscribe)[source]

Update email subscriptions for user queries

NOTE: For now this method simply unsubscribes from the given queries. In the future it should distinguish between receiving email notifications or not and subscribing to queries or not.

Parameters
  • email (str) – The email associated with the query.

  • queries (list(int)) – A list of query hashes.

  • models (list[str]) – A list of models.

  • subscribe (bool) – The subscription status for all matching query hashes

Returns

Return True if the update was successful, False otherwise

Return type

bool

class emmaa.db.manager.StatementDatabaseManager(host, label=None)[source]

Bases: EmmaaDatabaseManager

A class used to manage sessions with EMMAA’s statement database.

add_model_from_s3(model_id, config=None, number_of_updates=3, bucket='emmaa')[source]

Add data for one model from S3 files.

add_statements(model_id, date, stmt_jsons, max_updates=3)[source]

Add statements to the database.

Parameters
  • model_id (str) – The standard name of the model to add statements to.

  • date (str) – The date when the model was generated.

  • stmt_jsons (list[dict]) – A list of statement JSONs to add to the database.

  • max_updates (int) – The maximum number of model states to keep in the database. If it is reached, the oldest model state will be deleted.

Returns

True if the statements were added successfully, False otherwise.

Return type

bool
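
The max_updates behavior can be pictured as a bounded history of model states. The sketch below keeps at most max_updates dated states for one model in a plain dict and evicts the oldest ones after each addition. It assumes dates are strings that sort chronologically (for example ISO-style dates). `add_model_state` is a hypothetical helper, and the exact moment at which the real method deletes old states may differ.

```python
def add_model_state(states, date, stmt_jsons, max_updates=3):
    """Store a dated model state and evict the oldest ones beyond the limit.

    `states` maps a date string to the list of statement JSONs for one model.
    Returns the list of dates that were evicted. Hypothetical helper.
    """
    states[date] = list(stmt_jsons)
    removed = []
    while len(states) > max_updates:
        # Date strings are assumed to sort chronologically.
        oldest = min(states)
        del states[oldest]
        removed.append(oldest)
    return removed


states = {}
for day in ("2024-01-01", "2024-01-02", "2024-01-03"):
    add_model_state(states, day, [{"type": "Example"}])

# A fourth state pushes the history over the limit of three.
evicted = add_model_state(states, "2024-01-04", [{"type": "Example"}])
```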

build_from_s3(number_of_updates=3, bucket='emmaa')[source]

Build the database from S3 files. NOTE: This deletes existing database entries and repopulates the tables.

delete_statements(model_id, date)[source]

Delete statements from the database.

get_latest_date(model_id)[source]

Get the latest date this model is available for.

Parameters

model_id (str) – The standard name of the model.

Returns

The latest date this model is available for.

Return type

str

get_number_of_dates(model_id)[source]

Get the number of unique dates this model is available for.

Parameters

model_id (str) – The standard name of the model.

Returns

The number of unique dates this model is available for.

Return type

int

get_number_of_statements(model_id, date=None)[source]

Get the number of statements in a model.

Parameters
  • model_id (str) – The standard name of the model.

  • date (str) – The date when the model was generated.

Returns

The number of statements in the model.

Return type

int

get_oldest_date(model_id)[source]

Get the oldest date this model is available for.

Parameters

model_id (str) – The standard name of the model.

Returns

The oldest date this model is available for.

Return type

str

get_path_counts(model_id, date=None)[source]

Get the path counts for statements.

Parameters
  • model_id (str) – The standard name of the model.

  • date (str) – The date when the model was generated.

Returns

A dictionary mapping statement hashes to the number of times they were used in the paths.

Return type

dict[str, int]

get_statements(model_id, date=None, offset=0, limit=None, sort_by=None, stmt_types=None, min_belief=None, max_belief=None)[source]

Load the statements by model and date.

Parameters
  • model_id (str) – The standard name of the model to get statements for.

  • date (str) – The date when the model was generated.

  • offset (int) – The offset to start at.

  • limit (int) – The number of statements to return.

Returns

A list of statements corresponding to the model and date.

Return type

list[indra.statements.Statement]
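
The offset and limit parameters support paging through a large model. The sketch below shows the semantics on an in-memory list, assuming the usual convention that offset skips that many items and limit=None means no upper bound. `page` is a hypothetical helper, and integers stand in for statements.

```python
def page(items, offset=0, limit=None):
    """Return up to `limit` items starting at `offset` (hypothetical helper)."""
    end = None if limit is None else offset + limit
    return items[offset:end]


statements = list(range(10))  # stand-ins for ten statements

first_page = page(statements, offset=0, limit=4)
last_page = page(statements, offset=8, limit=4)
rest = page(statements, offset=3)
```

A caller can walk a model page by page by increasing offset by limit until a page comes back shorter than limit.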

get_statements_by_hash(model_id, date, stmt_hashes)[source]

Get statements by hash.

Parameters
  • model_id (str) – The standard name of the model to get statements for.

  • date (str) – The date when the model was generated.

  • stmt_hashes (list[str]) – A list of statement hashes to get statements for.

Returns

A list of statements corresponding to the model and date.

Return type

list[indra.statements.Statement]

update_statements_path_counts(model_id, date, path_counts)[source]

Update the path counts for statements. The update is incremental because the same statement can be used in paths from different test corpora.

Parameters
  • model_id (str) – The standard name of the model.

  • date (str) – The date when the model was generated.

  • path_counts (dict[int, int]) – A dictionary mapping statement hashes to the number of times they were used in the paths.
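
An incremental update of this kind adds new counts to the stored ones instead of overwriting them. The sketch below shows that merge on plain dictionaries using collections.Counter. `merge_path_counts` is a hypothetical helper and the hashes are made up; the real method applies the update in the database.

```python
from collections import Counter


def merge_path_counts(stored, new_counts):
    """Add new per-statement path counts onto the stored ones.

    Both arguments map statement hashes to counts. Hypothetical helper.
    """
    merged = Counter(stored)
    # Counter.update adds counts for existing keys instead of replacing them.
    merged.update(new_counts)
    return dict(merged)


stored = {101: 2, 202: 1}          # counts from one test corpus
from_other_corpus = {101: 1, 303: 4}

merged = merge_path_counts(stored, from_other_corpus)
```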