It is generally accepted that many functional proteins do not have well-defined folded structures. These so-called intrinsically disordered proteins (IDPs) are encoded abundantly in the human genome and are involved in a variety of biological processes, including cell signalling, cell cycle control, molecular recognition, nucleic acid transcription and replication, as well as the development of neurodegenerative diseases and cancer. The study of IDPs is an emerging field of research, and general rules for describing their conformational behaviour and mechanisms are still missing. Thus, expanding the body of experimental data from different systems and developing new techniques to characterize their properties are both essential to improving our knowledge of this family of proteins.
We are interested in using state-of-the-art nuclear magnetic resonance (NMR) techniques and other biophysical methods, in combination with novel computational modelling, to explore at atomic resolution the structural propensity and dynamics of IDPs and the mechanisms by which they interact with other proteins or nucleic acids.
After a decade in the post-genome era, determining the functions of the proteins encoded in DNA sequences is still one of the major challenges. It is a widely accepted concept that the function of a protein is determined by its three-dimensional structure. Numerous protein structures, with their functional interpretations, deposited in the Protein Data Bank (PDB) over the last fifty years strongly support this idea. However, this structure-function paradigm has been reassessed extensively in recent years. Bioinformatics studies have shown that intrinsically disordered proteins are amply present in all kingdoms of life. It is estimated that approximately 50% of mammalian proteins contain long disordered regions (more than 30 residues), and approximately 25% of these proteins are expected to be fully disordered under physiological conditions. The lack of folded structure provides several advantages: a larger solvent-exposed surface increases the chance of interacting with binding partners via the so-called “fly-casting” mechanism, and conformational flexibility allows these proteins to act as scaffolds by interacting with different proteins. One of the most intriguing aspects of disordered proteins is that they often undergo structural transitions from disordered to folded forms upon binding to their physiological partners. This folding-upon-binding mechanism opens a new view of protein-protein and protein-DNA/RNA interactions. Despite the advantages of being unstructured, the disorder of some of these proteins also leads to disease-related aggregation or fibrillization. It is also estimated from bioinformatics studies that about 80% of cancer-associated proteins contain consecutive disordered regions. This new class of proteins is now mostly termed intrinsically disordered proteins (IDPs), or intrinsically disordered regions (IDRs) when the disorder is confined to part of an otherwise structured protein.
With these key studies elucidating the importance of IDPs, “protein disorder” has become an emerging research field. On the basis of a growing number of studies, it is now generally believed that IDPs play key roles in many physiological processes, including cell signalling, cell cycle control, molecular recognition, and nucleic acid transcription and replication, as well as in the development of neurodegenerative diseases, cardiovascular diseases, amyloidoses, and type II diabetes.
Physiological and biochemical results have drawn our attention to the importance of IDPs, but several aspects of the mechanisms of IDP function are still unknown: How are IDPs recognized by their partner proteins in the absence of a folded structure? Given their flexible nature, does any specific pre-recognition conformation exist? Can we derive a general rule for understanding the conformational behaviour of these proteins from the primary sequence? In other words, can we predict the functions and mechanisms of IDPs from primary sequence? Insights into the dynamics and conformational propensities of these proteins at the atomic level will be a critical step toward answering these questions. Conventional approaches for structure determination or characterization are less feasible because of the structural heterogeneity of IDPs. Novel biophysical methods and computational models are therefore essential for dealing with their rapidly interconverting nature.
Our group is interested in using nuclear magnetic resonance (NMR) spectroscopy, which gives specific information for almost all atoms with minimal interference, to characterize IDPs. In particular, two recently developed NMR techniques, residual dipolar couplings (RDCs) and paramagnetic relaxation enhancements (PREs), will be applied to the systems under study. RDCs are extremely sensitive to local conformational sampling, and PREs to transient long-range interactions, in unstructured proteins. Other biophysical methods such as small-angle X-ray scattering (SAXS), circular dichroism spectroscopy, and fluorescence spectroscopy will also be used as complementary methods. In addition, because of the heterogeneity of IDPs, statistical computational modelling will be used to characterize their structural propensities. We are using experimental data as constraints to obtain representative conformational ensembles of IDPs. We are also developing new methods that we hope will predict the function of IDPs solely on the basis of primary sequence. We hope the studies carried out in our group will improve our understanding of the structural dynamics, conformational behaviour, and related biological processes of IDPs, as well as the onset of their pathological aggregation or fibrillization.
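
The idea of using experimental data as constraints to obtain a representative conformational ensemble can be illustrated with a minimal sketch. This is not the group's actual method: it assumes a greedy selection scheme, a hypothetical pool of conformers whose observables (for example, back-calculated RDCs) are already available as plain lists of numbers, and a single uniform experimental uncertainty `sigma`. All function names and the toy data are invented for illustration.

```python
def chi_square(predicted, experimental, sigma=1.0):
    """Sum of squared, uncertainty-scaled deviations between two observable sets."""
    return sum(((p - e) / sigma) ** 2 for p, e in zip(predicted, experimental))


def ensemble_average(pool, members):
    """Average each observable over the conformers listed in `members`."""
    n_obs = len(pool[0])
    return [sum(pool[i][k] for i in members) / len(members) for k in range(n_obs)]


def select_ensemble(pool, experimental, size, sigma=1.0):
    """Greedily pick `size` conformers whose averaged observables fit the data.

    At each step, the conformer that gives the lowest chi-square for the
    growing ensemble average is added. This is a heuristic, so the result
    is not guaranteed to be the globally best sub-ensemble.
    """
    if size > len(pool):
        raise ValueError("size cannot exceed the number of conformers in the pool")
    members = []
    for _ in range(size):
        best_index, best_chi2 = None, float("inf")
        for i in range(len(pool)):
            if i in members:
                continue  # each conformer is used at most once
            trial = ensemble_average(pool, members + [i])
            chi2 = chi_square(trial, experimental, sigma)
            if chi2 < best_chi2:
                best_index, best_chi2 = i, chi2
        members.append(best_index)
    final_chi2 = chi_square(ensemble_average(pool, members), experimental, sigma)
    return members, final_chi2


# Toy data (hypothetical): four conformers, two observables each.
toy_pool = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [-3.0, 2.0]]
toy_experiment = [0.5, 0.5]
toy_members, toy_chi2 = select_ensemble(toy_pool, toy_experiment, size=2)
print(toy_members, toy_chi2)
```

No single conformer in the toy pool reproduces the data, but an average over two of them does, which is the reason for describing a heterogeneous protein by an ensemble rather than by one structure. Real applications would need a large conformer pool, observable-specific uncertainties, and cross-validation against data left out of the fit.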