Abstract:
Deep learning has shown significant potential for reducing costs and accelerating medical development, and molecular property prediction has become a common task that draws on these capabilities. This thesis proposes a multimodal Graph Neural Network (GNN) model that uses the topology information extracted from molecular graphs by a baseline GNN to support accurate property prediction. The thesis improves the baseline CMPNN model by exploring several methods to address its potential gaps. These include adding a multimodal module, either a bidirectional LSTM that processes text sequences in SMILES format or a spectral graph convolution module. In addition, self-attention was integrated into the CMPNN model using the attention (alpha) coefficient method from GATConv. In experiments on eight MoleculeNet datasets (five classification and three regression tasks), the proposed multimodal GNN models outperformed the baseline on seven. These findings indicate the potential of this methodology across various areas of chemistry, particularly drug discovery.