TY - JOUR
T1 - The structure of the genetic code as an optimal graph clustering problem
AU - Błażej, Paweł
AU - Kowalski, Dariusz R.
AU - Mackiewicz, Dorota
AU - Wnetrzak, Małgorzata
AU - Aloqalaa, Daniyah A.
AU - Mackiewicz, Paweł
N1 - Funding Information:
This work was supported by the National Science Centre Poland (Narodowe Centrum Nauki, Polska) under Grant Miniatura no. 2017/01/X/NZ2/00608. Dariusz R. Kowalski acknowledges support from Networks Sciences & Technologies (NeST), University of Liverpool.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022/7
Y1 - 2022/7
N2 - The standard genetic code (SGC) is the set of rules by which genetic information is translated into proteins, from codons, i.e. triplets of nucleotides, to amino acids. The origin of the code and the main factor responsible for its present structure are still hotly debated. Various methodologies have been used to study the features of the code and assess the level of its potential optimality. Here, we introduced a new general approach to evaluate the quality of the genetic code structure. This methodology comes from graph theory and allows us to describe new properties of the genetic code in terms of conductance. This parameter measures the robustness of codon groups against potential changes in the translation of protein-coding sequences caused by single nucleotide substitutions. We described the genetic code as a partition of an undirected and unweighted graph, which makes the model general and universal. Using this approach, we showed that the structure of the genetic code is a solution to the graph clustering problem. We presented and discussed the structure of the codes that are optimal with respect to conductance. Although the standard genetic code is far from optimal with respect to conductance, its structure is characterised by many codon groups that reach the minimum conductance for their size. The SGC most likely represents a local minimum in terms of errors occurring in protein-coding sequences and their translation.
AB - The standard genetic code (SGC) is the set of rules by which genetic information is translated into proteins, from codons, i.e. triplets of nucleotides, to amino acids. The origin of the code and the main factor responsible for its present structure are still hotly debated. Various methodologies have been used to study the features of the code and assess the level of its potential optimality. Here, we introduced a new general approach to evaluate the quality of the genetic code structure. This methodology comes from graph theory and allows us to describe new properties of the genetic code in terms of conductance. This parameter measures the robustness of codon groups against potential changes in the translation of protein-coding sequences caused by single nucleotide substitutions. We described the genetic code as a partition of an undirected and unweighted graph, which makes the model general and universal. Using this approach, we showed that the structure of the genetic code is a solution to the graph clustering problem. We presented and discussed the structure of the codes that are optimal with respect to conductance. Although the standard genetic code is far from optimal with respect to conductance, its structure is characterised by many codon groups that reach the minimum conductance for their size. The SGC most likely represents a local minimum in terms of errors occurring in protein-coding sequences and their translation.
KW - Code degeneracy
KW - Graph theory
KW - Set conductance
KW - Standard genetic code
UR - http://www.scopus.com/inward/record.url?scp=85134205834&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134205834&partnerID=8YFLogxK
U2 - 10.1007/s00285-022-01778-4
DO - 10.1007/s00285-022-01778-4
M3 - Article
C2 - 35838803
AN - SCOPUS:85134205834
SN - 0303-6812
VL - 85
JO - Journal of Mathematical Biology
JF - Journal of Mathematical Biology
IS - 1
M1 - 9
ER -