A novel entropy-based mapping method for determining the protein-protein interactions in viral genomes by using coevolution analysis
Abstract
Protein-protein interactions play a vital role in DNA transcription, the immune system, and signal transmission between cells. Determining the interactions between proteins can give information about the functional structure of a cell and the functions of target organisms. Protein-protein interactions are determined by experimental approaches, yet there is still a huge gap in specifying all possible protein interactions in an organism. Furthermore, since these approaches use cloning, labeling, and affinity mass spectrometry, the analysis process is time-consuming and expensive. Analyzing protein interactions with computational approaches based on coevolution theory eliminates these limitations, since in the coevolution model interacting proteins show coevolutionary mutations and form similar phylogenetic trees. Current coevolution methods are based on multiple-sequence alignment, yet these methods produce many false positive interactions. Therefore, it is important to perform computational coevolution analysis. Coevolution analysis has been employed in conjunction with experimental approaches to explore new protein interactions. However, to predict protein interactions with computational coevolution analysis, protein sequences first need to be mapped to numerical values. The literature contains various protein mapping methods, grouped into several categories, and these methods are frequently used in studies that predict protein interactions. In this study, as an alternative to these methods, we propose a novel entropy-based protein mapping method and predict protein-protein interactions in viral genomes by using coevolution analysis. The study consists of five stages. In the first stage, the protein sequences of viral genomes were mapped using both the proposed numerical mapping method and state-of-the-art protein mapping methods. In the second stage, the Fourier transform was applied to each mapped protein sequence. In the third stage, a distance matrix was generated by finding the distances between the proteins belonging to the same virus genome. In the fourth stage, Pearson correlation values between the distances were calculated and coevolution analysis was performed. In the last stage, the proposed mapping method was compared with state-of-the-art protein mapping methods and the MirrorTree approach. Coevolution analysis was performed on two different virus genomes: Ebola virus and Influenza A virus. With the proposed method, a high degree of correlation was obtained between the proteins of the Ebola virus. For Ebola virus, the lowest correlation (0.75) was obtained for the NP-VP35 protein pair, and the highest correlation (0.99) was observed for the NP-VP24 and NP-VP40 protein pairs. For Influenza A, the lowest correlation with the proposed method (0.09) was obtained for the M1-PA(X) protein pair, and the highest (0.98) was calculated for the M1-M2 protein pair. The proposed method assigned high correlation values to protein pairs whose interactions have been experimentally proven. These results indicate that the proposed method can be effective in predicting protein interactions.
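The first four stages described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the entropy-based mapping, so `entropy_map` below is a hypothetical stand-in that assigns each residue its Shannon entropy term computed from its frequency within the sequence. The choice of a Euclidean distance between magnitude spectra, the zero-padding to a common length, and the MirrorTree-style reading in which each protein is represented by several homologous sequences are likewise assumptions. All function names and the toy sequences are made up for the example.

```python
import cmath
import math
from collections import Counter
from itertools import combinations


def entropy_map(seq):
    """Hypothetical mapping: residue -> -p*log2(p), p = its frequency in seq."""
    counts = Counter(seq)
    n = len(seq)
    term = {aa: -(c / n) * math.log2(c / n) for aa, c in counts.items()}
    return [term[aa] for aa in seq]


def magnitude_spectrum(signal, n_fft):
    """Magnitudes of the discrete Fourier transform of the zero-padded signal."""
    padded = list(signal) + [0.0] * (n_fft - len(signal))
    spectrum = []
    for k in range(n_fft // 2 + 1):  # non-redundant half for a real signal
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n_fft)
            for t, x in enumerate(padded)
        )
        spectrum.append(abs(coeff))
    return spectrum


def distance_vector(seqs):
    """Pairwise Euclidean distances between the spectra of one protein's sequences."""
    n_fft = max(len(s) for s in seqs)
    spectra = [magnitude_spectrum(entropy_map(s), n_fft) for s in seqs]
    return [math.dist(a, b) for a, b in combinations(spectra, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coevolution_score(seqs_a, seqs_b):
    """Correlate the distance vectors of two proteins (same strain order in both)."""
    return pearson(distance_vector(seqs_a), distance_vector(seqs_b))


# Toy sequences, invented for illustration; one entry per strain.
protein_a = ["MKKAAY", "MKKAVY", "MKAAAY"]
protein_b = ["GGSLLT", "GGSLTT", "GSSLLT"]
score = coevolution_score(protein_a, protein_b)
```

A score near 1 would be read, in this sketch, as the two proteins having similar distance patterns across strains. Note that a purely frequency-based mapping such as this stand-in cannot distinguish sequences with the same composition pattern, which is one reason the choice of mapping matters.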
Source
Biomedical Signal Processing and Control
Volume
65