ChemMine Tutorial
General Functionality
ChemMine is an integrated database that consists of a compound mining environment and a screening database. The main functions of the two database components are summarized here:
Compound Database: [Annotation Search] [Structure Search]
ChemMine's compound database provides access to over 6,200,000 compounds from a wide variety of bioactive, natural and screening compound sources from public and commercial providers. A detailed list of all available compound sets is available on the Compound Source Page. Their structures and functional annotations can be searched by chemical properties, substructure matches, structural similarities and biological activities.
Screening Database: [Browse Screens]
Searching ChemMine
1. Annotation Searches
The Annotation Search page provides fast full-text and field-specific searches of all annotation data associated with compounds. Using the Library selection menu, users can restrict their searches to specific compound libraries. The Field menu lets users run the following search types for query strings provided in the Search field:
This option allows powerful full-text searches against all annotation fields in the database. To test this search function, the term 'herbicide' is a good sample query. The search results can be seen under the "Search in annotation" tab.
Compound ID queries in single or batch mode are possible by providing as many compound IDs as required. Space-separated and line-separated formats are supported. If known, the proper compound set should be selected in the Library menu. [Compound ID Search]
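The batch format described above (IDs separated by spaces or line breaks) can be illustrated with a short sketch. The function name `parse_compound_ids`, the sample IDs, and the removal of duplicates are assumptions made for illustration; the tutorial does not describe how ChemMine processes the input internally.

```python
from typing import List


def parse_compound_ids(raw: str) -> List[str]:
    """Split a batch query on any whitespace (spaces, tabs, line breaks).

    Duplicate IDs are dropped while keeping first-seen order. This is an
    assumption for the sketch, not documented ChemMine behavior.
    """
    seen = set()
    ids = []
    for token in raw.split():
        if token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


# Hypothetical IDs, mixing space-separated and line-separated entries.
batch = "CMP-001 CMP-002\nCMP-003\nCMP-001"
print(parse_compound_ids(batch))
```

Because `str.split()` with no argument treats every run of whitespace as one separator, the same code handles both supported formats without a format switch.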
To search compounds based on their physicochemical properties, the Annotation field needs to be selected. Then, in the following "Physicochemical Property Constraints" section, select exact property values or ranges of property values as search parameters. For example, the query "LGP > 0.4", "LGP < 0.5" will return all compounds in a library with a logP value (octanol/water partition coefficient) between 0.4 and 0.5. It is also possible to combine different descriptors in the same query, but this slows down the search significantly. For instance, the query "LGP > 0.4", "N > 2" will return compounds that have a logP greater than 0.4 and more than 2 N atoms, at the cost of a much longer search time.
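The way several property constraints combine can be sketched as a filter in which every constraint must hold for a compound to match. This is only an illustration of the query semantics: the compound records, the helper names, and the constraint parser below are hypothetical and do not reflect ChemMine's implementation.

```python
import operator
from typing import Dict, List, Tuple

# Comparison operators accepted by this sketch (an assumption).
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_constraint(text: str) -> Tuple[str, str, float]:
    """Parse a constraint such as 'LGP > 0.4' into (descriptor, op, value)."""
    descriptor, op, value = text.split()
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op}")
    return descriptor, op, float(value)


def matches(props: Dict[str, float], constraints: List[str]) -> bool:
    """Return True only if the compound satisfies every constraint."""
    for text in constraints:
        descriptor, op, value = parse_constraint(text)
        if descriptor not in props or not OPS[op](props[descriptor], value):
            return False
    return True


def search(library: Dict[str, Dict[str, float]], constraints: List[str]) -> List[str]:
    """Return the IDs of all compounds in the library that match."""
    return [cid for cid, props in library.items() if matches(props, constraints)]


# Hypothetical library: LGP is logP, N is the number of nitrogen atoms.
library = {
    "cmp_a": {"LGP": 0.45, "N": 3},
    "cmp_b": {"LGP": 0.45, "N": 2},
    "cmp_c": {"LGP": 0.62, "N": 4},
}

print(search(library, ["LGP > 0.4", "LGP < 0.5"]))  # logP range query
print(search(library, ["LGP > 0.4", "N > 2"]))      # two descriptors combined
```

With this sample data the range query matches cmp_a and cmp_b, while the combined query matches cmp_a and cmp_c; cmp_b is excluded because "N > 2" is a strict comparison.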
Screening compounds often have plate and well locations. To search them, first select either the Plate identifier or the Well identifier field. Then, in the "Plate and/or Well Constraints" section, enter the plate and/or well data. For example, after selecting the Plate identifier field, enter "1" in the plate field and "A03" in the well field. This query will return the compound that matches the requested plate and well location.
In addition to selecting the search method, users can choose how the results of a query are presented by selecting one of three options from the view menu:
Standard View: Provides the matching query results with links to detailed compound annotation pages.
Extended View: Provides the compound structure image along with detailed physicochemical property data of the corresponding compound. It also links to detailed compound annotation pages.
Grid View: Displays the compound structure images. Clicking on a structure image opens the detailed compound annotation page.
2. Structure Searches [Structure Search]
Structure similarity and substructure searches are among the most important functions for compound queries in databases. Substructure searches retrieve all molecules in a database that contain a user-defined query substructure, irrespective of the structural environment. Multiple libraries can be searched simultaneously. The following search functions are available for exploring the chemical space in ChemMine efficiently. Query structures for searching can be provided in SMILES or SDF formats. Alternatively, they can be generated by drawing a query molecule with the available JME Molecular Editor from Peter Ertl.
Again, users can choose how the results of a query are presented by selecting one of three options from the view menu.
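To give a rough idea of how a structural similarity search can rank compounds, the sketch below scores each library compound against a query with the Tanimoto coefficient over sets of structural features. This tutorial does not state which similarity measure or descriptors ChemMine uses, so the measure, the integer feature sets, the compound names, and the 0.3 cutoff are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    union = a | b
    if not union:
        return 0.0  # convention for two empty feature sets
    return len(a & b) / len(union)


def similarity_search(
    query: Set[int],
    library: Dict[str, Set[int]],
    cutoff: float,
) -> List[Tuple[str, float]]:
    """Return (compound ID, score) pairs at or above the cutoff, best first."""
    hits = [(cid, tanimoto(query, feats)) for cid, feats in library.items()]
    hits = [hit for hit in hits if hit[1] >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical feature sets; each integer stands for one structural feature.
query = {1, 2, 3, 4}
library = {
    "cmp_x": {1, 2, 3, 4},
    "cmp_y": {1, 2, 5},
    "cmp_z": {7, 8},
}

for cid, score in similarity_search(query, library, cutoff=0.3):
    print(cid, round(score, 2))
```

In this toy data, cmp_x is identical to the query (score 1.0), cmp_y shares two of five distinct features (score 0.4), and cmp_z shares none and falls below the cutoff.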
Compound Annotation Pages
The query results in ChemMine are structured into three different levels: 'Standard View', 'Extended View', and 'Grid View'. From any of these three View pages, users can access the detailed 'Compound Annotation' page (Example: H2O Compound Annotation) by clicking on a specific match result. The initial three 'View' pages are for batch viewing and selection purposes, while the Compound Annotation page provides much more detailed information for individual compounds. This includes the following annotation fields and download options:
- Color images of the compound structures
- Download of the compound structures in SMILES, InChI, SDF and other formats.
- Query for compounds with similar structures by clicking on the box 'Search Similar Compound'.
- The physicochemical property descriptors from JOELib.
- The available screening data are provided at the bottom of the page. They can be viewed in full detail by clicking the Screen ID number. On that page, the "Data Summary" section provides confidence scores for the screening data as six integer values ranging from 0 to 5. The value 0 represents inactive compounds, and the values 1 through 3 are assigned to compounds that show activity in the primary screen, the secondary screen, and/or all follow-up experiments, respectively. The value 4 is assigned to compounds for which a target site has been identified, whereas the value 5 is assigned to compounds which are selective for the identified target site.
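The confidence score scale described in the last item can be written down as a small lookup, which may help when post-processing downloaded screening data. The enum member names and the `describe` helper are hypothetical, and the one-to-one reading of scores 1 to 3 (primary, secondary, follow-up) is an interpretation of the description above.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Screening confidence scores, following the 0 to 5 scale in the text."""
    INACTIVE = 0
    PRIMARY_ACTIVE = 1
    SECONDARY_ACTIVE = 2
    FOLLOW_UP_ACTIVE = 3
    TARGET_IDENTIFIED = 4
    TARGET_SELECTIVE = 5


DESCRIPTIONS = {
    Confidence.INACTIVE: "inactive compound",
    Confidence.PRIMARY_ACTIVE: "active in the primary screen",
    Confidence.SECONDARY_ACTIVE: "active in the secondary screen",
    Confidence.FOLLOW_UP_ACTIVE: "active in all follow-up experiments",
    Confidence.TARGET_IDENTIFIED: "target site identified",
    Confidence.TARGET_SELECTIVE: "selective for the identified target site",
}


def describe(score: int) -> str:
    """Translate a numeric score into its meaning.

    Raises ValueError for values outside the 0 to 5 range.
    """
    return DESCRIPTIONS[Confidence(score)]


for score in range(6):
    print(score, describe(score))
```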
3. Viewing and Upload of Screening Data
ChemMine's screening database is a versatile publication and management system for diverse compound bioactivity and screening data. It supports any type of annotation information, such as tables and images. The database provides users with several ways to access the screening data:
- Browse Screening Data
- Search Screening Data
- Upload Screening Data
To obtain an overview of the available screening data sets, users can browse and search the screens by clicking the titles of the screens on this overview table. The available search field on the same page allows full-text searching in all screening data sets.
All screening data can be searched in ChemMine using full-text searches on the Annotation Search page or structure similarity searches on the Structure Search page.
A detailed tutorial for uploading and managing screening data in ChemMine is available on the Screen Upload page. To upload screens, users are required to create a ChemMine account. This registration is necessary to communicate with users during the upload, approval and curation steps of new screening data sets.