Validated targets annotated space library
ChemDiv’s Validated targets annotated space library
ChemDiv introduces a new annotated space of 18,000 chemical compounds. It covers 38 validated targets across 900 drugs launched in the last 10+ years.
Compounds were designed and selected based on the corresponding data, CATS descriptors, and 3D conformations generated for drug molecules and these targets.
A neural network was trained to predict target activity and to select compounds.
Training set:
- A set of 900 drugs launched or in clinical trials over the past 10 years has been assembled (38 targets).
- For each structure, 300 3D conformations were generated.
- The 100 most diverse conformations were selected for each sample.
- CATS 3D descriptors were calculated for all conformations.
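The selection of the most diverse conformations can be illustrated with a greedy max-min picker. This is only a minimal sketch: the source does not say which diversity measure or selection algorithm was used, so the Euclidean distance between per-conformation feature vectors, the function name select_diverse, and the random demo data are all assumptions made for illustration.

```python
import math
import random


def select_diverse(vectors, k, start=0):
    """Greedy max-min picking (assumed method, not confirmed by the source).

    vectors: list of equal-length numeric tuples, one per conformation.
    Returns the indices of k conformations that are far apart from each other.
    """
    n = len(vectors)
    if k <= 0 or n == 0:
        return []
    if k >= n:
        return list(range(n))
    chosen = [start]
    # Distance from each conformation to its nearest already-chosen one.
    nearest = [math.dist(v, vectors[start]) for v in vectors]
    while len(chosen) < k:
        picked = set(chosen)
        candidates = [i for i in range(n) if i not in picked]
        # Take the candidate that is farthest from everything chosen so far.
        nxt = max(candidates, key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(d, math.dist(v, vectors[nxt]))
                   for d, v in zip(nearest, vectors)]
    return chosen


# Hypothetical demo: 300 random 8-dimensional vectors stand in for 300 conformations.
random.seed(0)
conformations = [tuple(random.random() for _ in range(8)) for _ in range(300)]
diverse_100 = select_diverse(conformations, 100)
```

The same routine would apply to a selection of about 10 conformations out of 100 by calling select_diverse(vectors, 10).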
Target set:
- Soft MCFs and drug-likeness filters were applied to compounds available in the ChemDiv collection (>1.5 M compounds).
- For each molecule, a starting 3D conformation was generated in CORINA software (256k molecules).
- From each starting conformation, 100 conformations were generated, and about 10 maximally different ones were selected (about 2.5 M samples in total).
- CATS 3D descriptors were calculated for all conformations.
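To give a feel for what a CATS 3D descriptor encodes, the sketch below builds a simplified CATS-style vector: it counts pairs of pharmacophore-typed atoms per type pair and per binned 3D distance. It is not ChemDiv's implementation. The five type labels, the single type per atom, the 20 bins of 1 Å, the raw (unscaled) counts, and the function name cats3d_like are assumptions chosen for illustration.

```python
import math
from itertools import combinations_with_replacement

# Assumed pharmacophore types: donor, acceptor, positive, negative, lipophilic.
TYPES = ("D", "A", "P", "N", "L")
PAIRS = list(combinations_with_replacement(TYPES, 2))  # 15 unordered type pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cats3d_like(atoms, n_bins=20, bin_width=1.0):
    """Simplified CATS-style 3D descriptor (illustrative assumption).

    atoms: list of (type_label, (x, y, z)) for one conformation.
    Returns a flat list of pair counts with length 15 * n_bins.
    """
    vec = [0] * (len(PAIRS) * n_bins)
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (type_i, pos_i), (type_j, pos_j) = atoms[i], atoms[j]
            dist_bin = int(math.dist(pos_i, pos_j) // bin_width)
            if dist_bin >= n_bins:
                continue  # pairs beyond the last bin are ignored
            # Order the two labels the same way PAIRS was built.
            pair = tuple(sorted((type_i, type_j), key=TYPES.index))
            vec[PAIR_INDEX[pair] * n_bins + dist_bin] += 1
    return vec


# Hypothetical three-atom conformation.
example = [("D", (0.0, 0.0, 0.0)), ("A", (2.5, 0.0, 0.0)), ("L", (0.0, 4.2, 0.0))]
descriptor = cats3d_like(example)
```

Computing one such vector per conformation yields one sample per conformation, which matches the way the lists above count samples.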
Algorithm:
- A neural network was trained to predict target activity. The model predicts activity (0 or 1) independently for each target.
- Using iterative optimization, individual binarization thresholds were selected to make the final decision for each target.
- Numerical metrics were calculated on a validation set of 10% of the training set. In addition, the quality of the model was assessed on 10 compounds from an independent sample.
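The per-target threshold selection can be sketched as follows. The source mentions only "iterative optimization" and does not name the metric or the optimizer, so this example swaps in a plain grid search that maximizes the F1 score on validation data. The function names, the candidate grid, and the toy scores are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels given as lists of 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def binarize(scores, threshold):
    """Turn model scores into 0/1 activity calls."""
    return [1 if s >= threshold else 0 for s in scores]


def best_threshold(y_true, scores, candidates=None):
    """Grid search (assumed stand-in for the unspecified iterative optimization)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        f1 = f1_score(y_true, binarize(scores, t))
        if f1 > best_f1:  # keep the first threshold that reaches the best F1
            best_t, best_f1 = t, f1
    return best_t, best_f1


def fit_thresholds(val_labels, val_scores):
    """val_labels, val_scores: dicts mapping target name -> list of values."""
    return {target: best_threshold(val_labels[target], val_scores[target])[0]
            for target in val_labels}


# Hypothetical validation data for two targets.
val_labels = {"target_a": [0, 0, 1, 1], "target_b": [1, 0, 1, 0]}
val_scores = {"target_a": [0.1, 0.4, 0.35, 0.8], "target_b": [0.9, 0.2, 0.7, 0.6]}
thresholds = fit_thresholds(val_labels, val_scores)
```

Each target gets its own cutoff, so a score that counts as active for one target may count as inactive for another.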