Annotated Space Library
ChemDiv’s Annotated Space Library
ChemDiv introduces a new annotated space of 18,000 chemical compounds. It covers 38 validated targets across 900 drugs launched in the last 10+ years.
Compounds were designed and selected based on the data for these drugs and targets, together with CATS descriptors and 3D conformations generated for the drug molecules.
A neural network was trained to predict target activity and to guide compound selection.
Training set:
- A set of 900 drugs launched or in clinical trials over the past 10 years has been assembled (38 targets).
- For each structure, 300 3D conformations were generated.
- The 100 most diverse conformations were selected for each sample.
- CATS 3D descriptors were calculated for all conformations.
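The text does not say how the most diverse conformations were chosen. One common approach is a greedy max-min (farthest-point) pick over a pairwise distance matrix, sketched below. The function name `select_diverse`, the choice of starting index, and the use of random vectors in place of real conformer distances (for example RMSD) are all assumptions made for illustration.

```python
import numpy as np

def select_diverse(dist, k, start=0):
    """Greedy max-min pick of k indices from a square distance matrix.

    Hypothetical helper: `dist[i, j]` is any conformer-to-conformer
    distance (e.g. RMSD); the source does not name the measure used.
    """
    dist = np.asarray(dist, dtype=float)
    k = min(k, dist.shape[0])
    chosen = [start]
    # min_dist[i] = distance from item i to its nearest already-chosen item
    min_dist = dist[start].copy()
    while len(chosen) < k:
        min_dist[chosen] = -np.inf          # never pick the same item twice
        nxt = int(np.argmax(min_dist))      # farthest from the chosen set
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
    return chosen

# Stand-in data: 300 random 8-dimensional vectors playing the role of
# 300 conformations of one structure (assumption, not real conformers).
rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 8))
dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
picked = select_diverse(dist, k=100)
```

With 300 generated conformations and `k=100`, this mirrors the 300-to-100 reduction described above; the same routine with a smaller `k` would fit a reduction to about 10 conformations.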
Target set:
- Soft MCFs (medicinal chemistry filters) and drug-likeness filters were applied to the compounds available in the ChemDiv collection (>1.5 M compounds).
- For each remaining molecule (256K in total), a starting 3D conformation was generated in CORINA software.
- From each starting conformation, 100 conformations were generated, and about 10 maximally different ones were selected (about 2.5 M samples in total).
- CATS 3D descriptors were calculated for all conformations.
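As a rough illustration of what a CATS-style 3D descriptor encodes, the sketch below counts pairs of pharmacophore points by type pair and by binned Euclidean distance. It is a simplified stand-in, not the CATS implementation used here: each point carries a single type label, no scaling is applied, and the function name `cats3d_like`, the bin count, and the bin width are assumptions.

```python
from itertools import combinations_with_replacement
import numpy as np

# Five pharmacophore types: donor, acceptor, positive, negative, lipophilic.
TYPES = "DAPNL"
PAIRS = list(combinations_with_replacement(TYPES, 2))   # 15 unordered pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}

def cats3d_like(coords, labels, n_bins=20, bin_width=1.0):
    """Histogram of typed point pairs over distance bins (simplified sketch).

    coords: (n, 3) coordinates of pharmacophore points in one conformation.
    labels: n single-letter types from TYPES (one type per point, assumed).
    Returns a flat vector of length 15 * n_bins.
    """
    coords = np.asarray(coords, dtype=float)
    vec = np.zeros((len(PAIRS), n_bins))
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d = np.linalg.norm(coords[i] - coords[j])
            b = int(d // bin_width)
            if b >= n_bins:
                continue                    # beyond the last bin: ignored
            a, c = sorted((labels[i], labels[j]), key=TYPES.index)
            vec[PAIR_INDEX[(a, c)], b] += 1
    return vec.ravel()

# Toy conformation with three labelled points (made-up coordinates).
toy_coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 4.2, 0.0]]
toy_labels = ["D", "A", "L"]
toy_vec = cats3d_like(toy_coords, toy_labels)
```

Because the vector depends on interatomic distances, each conformation of a molecule yields its own descriptor, which is why the descriptors are computed per conformation rather than per molecule.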
Algorithm:
- A neural network was trained to predict target activity. The model predicts activity (0 or 1) independently for each target.
- Individual binarization thresholds were selected for each target by iterative optimization and used to make the final decision.
- Numerical metrics were calculated on a validation set comprising 10% of the training set. In addition, model quality was assessed on 10 compounds from an independent sample.
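The per-target thresholding and the 10% hold-out can be sketched as follows. This is a minimal illustration under stated assumptions: a plain grid scan stands in for the unspecified iterative optimization, F1 is an assumed selection metric, and the helper names `f1`, `fit_thresholds`, and `holdout_split` are hypothetical.

```python
import numpy as np

def f1(y_true, y_pred):
    """F1 score for binary arrays; returns 0.0 when undefined."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def fit_thresholds(probs, labels, grid=None):
    """Pick one cut-off per target (column) that maximizes F1.

    probs:  (n_samples, n_targets) predicted scores in [0, 1].
    labels: (n_samples, n_targets) true 0/1 activities.
    """
    probs, labels = np.asarray(probs), np.asarray(labels)
    grid = np.linspace(0.05, 0.95, 19) if grid is None else np.asarray(grid)
    thresholds = np.empty(probs.shape[1])
    for t in range(probs.shape[1]):
        scores = [f1(labels[:, t], (probs[:, t] >= c).astype(int)) for c in grid]
        thresholds[t] = grid[int(np.argmax(scores))]
    return thresholds

def holdout_split(n, frac=0.1, seed=0):
    """Shuffle n indices and return (train_idx, val_idx) with frac held out."""
    idx = np.random.default_rng(seed).permutation(n)
    n_val = max(1, int(round(n * frac)))
    return idx[n_val:], idx[:n_val]

# Toy scores for 4 samples and 2 targets (made-up numbers).
probs = np.array([[0.22, 0.91], [0.41, 0.83], [0.63, 0.31], [0.81, 0.12]])
labels = np.array([[0, 1], [0, 1], [1, 0], [1, 0]])
thresholds = fit_thresholds(probs, labels)
decisions = (probs >= thresholds).astype(int)   # final 0/1 call per target

train_idx, val_idx = holdout_split(900)         # 900 drugs, 10% held out
```

Keeping a separate threshold per target lets each of the independent outputs trade precision against recall on its own, which matters when the number of known actives differs widely between targets.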