Abstract:
Drug repositioning, the discovery of new indications for existing drugs, is an effective strategy for reducing the time, cost, and risk of drug discovery and development. Many computational methods have been developed to identify new drug-disease associations for further validation and drug development. Meta-path based approaches, which derive network-based information from path patterns connecting drug nodes to disease nodes, have recently shown superior performance while requiring less data. However, existing meta-path based methods discard information about the intermediate nodes along paths, which are important indicators of the relationships between drugs and diseases. Given known (positive) and unknown (unlabeled) drug-disease associations, this research proposes a new meta-path based method for predicting drug-disease associations under a positive-unlabeled (PU) learning setting. Gene Ontology (GO) is used to connect drugs and diseases in a drug-GO-disease tripartite network. From this network, new meta-path based features of drug-disease pairs, called meta-path based functional profiles, are created to incorporate GO information. An ensemble model is trained on the functional profiles of both positive and unlabeled samples. The proposed method significantly outperforms existing methods, with a mean area under the precision-recall curve (AUPRC) of 0.944 and a mean area under the receiver operating characteristic curve (AUROC) of 0.930. Moreover, up to 38% of the new drug-disease associations predicted by the proposed method were found in a clinical trial database.
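
To illustrate why intermediate nodes matter, the following minimal sketch contrasts a plain path count over drug-GO-disease paths with a profile that records which GO terms lie on those paths. It is an illustration under stated assumptions, not the method's actual feature definition: the binary encoding, the function names (`path_count`, `functional_profile`), and the toy identifiers are all hypothetical, and the real functional profiles may be weighted or built differently.

```python
from typing import Dict, List, Set

# Hypothetical toy tripartite network: drug -> GO terms, disease -> GO terms.
# All identifiers below are placeholders, not real GO accessions.
go_terms: List[str] = ["GO:A", "GO:B", "GO:C", "GO:D"]
drug_go: Dict[str, Set[str]] = {
    "drug1": {"GO:A", "GO:B", "GO:C"},
    "drug2": {"GO:C", "GO:D"},
}
disease_go: Dict[str, Set[str]] = {
    "dis1": {"GO:B", "GO:C", "GO:D"},
}


def path_count(drug: str, disease: str) -> int:
    """Number of drug-GO-disease paths; the intermediate GO terms are discarded."""
    return len(drug_go[drug] & disease_go[disease])


def functional_profile(drug: str, disease: str) -> List[int]:
    """Binary vector over GO terms: 1 if the term lies on a drug-GO-disease path.

    Assumed encoding for illustration only.
    """
    shared = drug_go[drug] & disease_go[disease]
    return [1 if term in shared else 0 for term in go_terms]


# Both pairs have the same path count but different profiles,
# so the profile separates pairs that a count alone cannot.
count_1 = path_count("drug1", "dis1")
count_2 = path_count("drug2", "dis1")
profile_1 = functional_profile("drug1", "dis1")
profile_2 = functional_profile("drug2", "dis1")
```

In this toy network both drug-disease pairs are linked by two paths, yet the paths run through different GO terms, which is the kind of information a count-only meta-path feature loses.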