
Searching MiST2

Compared to the original MiST database, MiST2 provides many additional and more powerful search capabilities. All searches are performed using the search panel (depending on the web browser, it may not appear exactly as shown below):

[Figure: search panel]

To search, select the 'Search' field, type your query, choose the number of results to 'Show' per page, optionally define the 'Scope', and push the 'Go' button. Unless you are viewing a specific genome summary page, the scope is set to search 'All genomes' in MiST2. This is the broadest search possible, but it also takes the most time. If you do not need to search all genomes, you can speed up your queries by narrowing the scope in one of two ways: 1) navigate to the summary page of a particular genome, make sure the organism name is selected in the 'Scope' drop-down box, and then submit your query; or 2) construct a custom genome filter.

The search panel is accessible from every page in MiST2 and provides several options for finding relevant information. All searches are case-insensitive. The available search options are divided into three groups:

Genome - Search for a particular genome. Incomplete words are acceptable. Multiple search terms should be separated with spaces.

  • Genome name: Find genomes by their source organism name. Examples: "coli izobium", "aeruginosa"
  • Taxonomy: Find genomes by their taxonomic designation. The search covers the entire taxonomy, from kingdom to species. Example: "tenericutes"
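As a rough illustration of how case-insensitive, partial-word matching behaves, the following Python sketch filters a small list of genome names. It is not MiST2's implementation. The function name `matches_genome` and the sample names are hypothetical, and the sketch assumes that a genome matches when ANY of the space-separated terms occurs in its name, which is what the "coli izobium" example suggests.

```python
def matches_genome(name: str, query: str) -> bool:
    """Case-insensitive, partial-word match.

    Assumption: a genome matches if ANY space-separated term
    occurs somewhere in its name.
    """
    lowered = name.lower()
    terms = query.lower().split()
    return any(term in lowered for term in terms)


# Illustrative names only; not pulled from the MiST2 database.
genomes = [
    "Escherichia coli K-12",
    "Rhizobium leguminosarum",
    "Pseudomonas aeruginosa PAO1",
]

hits = [g for g in genomes if matches_genome(g, "coli izobium")]
print(hits)  # ['Escherichia coli K-12', 'Rhizobium leguminosarum']
```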

Protein/gene - Find proteins by querying for a particular identifier or annotation. Multiple search terms should be separated with spaces.

Domain architecture - Find proteins by querying their domain architecture. Multiple search terms should be separated with spaces. Domain names prefixed with a '+' or '-' indicate that they must be present or absent, respectively.

  • Pfam: Pfam 23 domain library.

    Examples (constrained to Pseudomonas aeruginosa PAO1):

    • "Response_reg GGDEF" - all proteins with either a Response_reg OR a GGDEF domain

    • "+Response_reg +GGDEF" - all proteins with a Response_reg AND a GGDEF domain

    • "Response_reg +GGDEF" - all proteins with a GGDEF domain AND optionally a Response_reg domain

    • "+Response_reg GGDEF" - all proteins with a Response_reg domain AND optionally a GGDEF domain

    • "+Response_reg -GGDEF" - all proteins with a Response_reg AND NOT a GGDEF domain

    • "-Response_reg +GGDEF" - all proteins with a GGDEF AND NOT a Response_reg domain

    • "-Response_reg +GGDEF +EAL" - all proteins with a GGDEF AND an EAL domain AND NOT a Response_reg domain
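The '+' and '-' semantics in the examples above can be sketched in a few lines of Python. This is only a model of the behavior described here, not the MiST2 search code. The helper names (`parse_query`, `matches`, `select`) and the four proteins are hypothetical, and the handling of unprefixed terms (required only when no '+' term is given) is an assumption.

```python
def parse_query(query: str):
    """Split a query into required ('+'), excluded ('-'), and optional terms."""
    required, excluded, optional = set(), set(), set()
    for term in query.split():
        if term.startswith("+"):
            required.add(term[1:].lower())
        elif term.startswith("-"):
            excluded.add(term[1:].lower())
        else:
            optional.add(term.lower())
    return required, excluded, optional


def matches(domains, query: str) -> bool:
    """Return True if a protein's domain list satisfies the query."""
    have = {d.lower() for d in domains}  # searches are case-insensitive
    required, excluded, optional = parse_query(query)
    if not required <= have:   # every '+' domain must be present
        return False
    if excluded & have:        # no '-' domain may be present
        return False
    # Assumption: with no '+' terms, at least one plain term must match (OR).
    if not required and optional:
        return bool(optional & have)
    return True


# Hypothetical proteins and their domain architectures.
proteins = {
    "P1": ["Response_reg"],
    "P2": ["Response_reg", "GGDEF"],
    "P3": ["GGDEF", "EAL"],
    "P4": ["HAMP"],
}


def select(query: str):
    return sorted(pid for pid, doms in proteins.items() if matches(doms, query))


print(select("Response_reg GGDEF"))         # ['P1', 'P2', 'P3']
print(select("+Response_reg +GGDEF"))       # ['P2']
print(select("+Response_reg -GGDEF"))       # ['P1']
print(select("-Response_reg +GGDEF +EAL"))  # ['P3']
```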

  • Agfam: Agfam domain library. Boolean searching works as for Pfam, with one major exception: Agfam domains have a primary name and possibly a subname corresponding to a particular class of domains (e.g. HK_CA:Che corresponds to the chemotaxis group of the HK_CA primary domain). Searches may be performed using either the primary domain name by itself (e.g. HK_CA) or in conjunction with a specific subname (e.g. HK_CA:Che or HK_CA:3).

    Examples (constrained to Pseudomonas aeruginosa PAO1):

    • "+HK_CA +RR" - find all proteins with an HK_CA AND an RR domain

    • "+HK_CA:Che" - find all proteins with the primary name 'HK_CA' and the subname 'Che'
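The primary name and subname distinction can be modeled as follows. This is a minimal sketch, not the MiST2 implementation. The functions `split_agfam` and `agfam_term_matches` are hypothetical, and the sketch assumes that a term given as a primary name alone matches that domain regardless of subname.

```python
def split_agfam(name: str):
    """Split 'HK_CA:Che' into ('hk_ca', 'che'); subname is None if absent."""
    primary, _, subname = name.partition(":")
    return primary.lower(), (subname.lower() or None)


def agfam_term_matches(term: str, domain: str) -> bool:
    """Check one search term against one Agfam domain annotation."""
    term_primary, term_sub = split_agfam(term)
    dom_primary, dom_sub = split_agfam(domain)
    if term_primary != dom_primary:
        return False
    # Assumption: a term without a subname matches any subname.
    return term_sub is None or term_sub == dom_sub


print(agfam_term_matches("HK_CA", "HK_CA:Che"))      # True
print(agfam_term_matches("HK_CA:Che", "HK_CA:Che"))  # True
print(agfam_term_matches("HK_CA:Che", "HK_CA:3"))    # False
```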

  • Pfam and/or Agfam: Find proteins with any combination of Pfam and/or Agfam domains. By default, every term is searched against both the Pfam and Agfam libraries. However, because the two libraries have distinct and potentially overlapping domain names, and because narrower searches run faster, you can constrain a search term to a specific domain library. To restrict a term to Pfam, prefix the query term with 'pfam:'. Similarly, to restrict a term to Agfam, prefix it with 'agfam:'. Any boolean operators ('+' and '-') should be placed before the search term and before any prefixes.

    Examples (constrained to Vibrio cholerae O395):

    • "-pfam:HATPase_c +agfam:HK_CA" - find all proteins with an Agfam HK_CA domain BUT NOT a Pfam HATPase_c domain

    • "+pfam:HATPase_c -agfam:HK_CA" - find all proteins with a Pfam HATPase_c domain BUT NOT an Agfam HK_CA domain

    • "pfam:HD agfam:RR" - find all proteins with a Pfam HD OR an Agfam RR domain

    • "+agfam:HK_CA:Che +pfam:CheW" - find all proteins with an Agfam HK_CA:Che AND a Pfam CheW domain

    • "+agfam:HK_CA +agfam:RR -pfam:HAMP +pfam:PAS" - find all proteins with an Agfam HK_CA AND an Agfam RR domain AND a Pfam PAS domain AND NOT a Pfam HAMP domain
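The ordering rule (operator first, then library prefix, then domain name) can be made concrete with a small parser. This Python sketch only illustrates the term syntax described above. The `Term` type and `parse_term` function are hypothetical names, not part of MiST2.

```python
from typing import NamedTuple, Optional


class Term(NamedTuple):
    operator: str           # "+", "-", or "" when no operator is given
    library: Optional[str]  # "pfam", "agfam", or None to search both
    name: str               # domain name, possibly with an Agfam subname


def parse_term(raw: str) -> Term:
    """Parse one search term such as '+agfam:HK_CA:Che'."""
    operator = ""
    if raw[:1] in ("+", "-"):
        operator, raw = raw[0], raw[1:]
    library = None
    head, sep, rest = raw.partition(":")
    # Only 'pfam:' and 'agfam:' count as library prefixes; any other
    # colon (e.g. in HK_CA:Che) is part of the domain name.
    if sep and head.lower() in ("pfam", "agfam"):
        library, raw = head.lower(), rest
    return Term(operator, library, raw)


query = "+agfam:HK_CA:Che -pfam:HAMP HK_CA:3"
for term in query.split():
    print(parse_term(term))
# Term(operator='+', library='agfam', name='HK_CA:Che')
# Term(operator='-', library='pfam', name='HAMP')
# Term(operator='', library=None, name='HK_CA:3')
```

Stripping the operator before looking for the library prefix mirrors the rule that '+' and '-' come before any prefix.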


Developed and maintained by Agile Genomics, LLC © 2017

Hosted at: UTK || ulrich.luke+sci@gmail.com

Please let us know of any errors, misannotations, or other issues/comments