Computational Structural Biology


  • iSARST

    iSARST, a server for efficient protein structural similarity searches, has long been the world's fastest system of its kind. It is a distributed-computing, batch-processing, integrated implementation of several structural comparison algorithms and two database search methods: SARST for common structural homologs and CPSARST for homologs with circular permutations. iSARST allows users to submit multiple PDB/SCOP entry IDs or an archive file containing many structures. After the target database is scanned with SARST/CPSARST, the ordering of hits is refined by distributed computation using accurate structure alignment algorithms such as FAST, TM-align and SAMO. In this way, iSARST achieves a high running speed while preserving the high precision of the refinement engines. iSARST provides the first batch-mode structural comparison web service for both co-linear homologs and circular permutants. It can serve as a rapid prediction system for hypothetical proteins and proteins of unknown function.
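
    The scan-then-refine strategy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the structures are toy strings, `coarse_score` and `refine_score` are hypothetical stand-ins for the fast database scan and the accurate alignment engines, and a local thread pool stands in for distributed computation. It is not the iSARST implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def coarse_score(query: str, target: str) -> float:
    """Cheap stand-in for the fast scan: Jaccard similarity of 3-letter words."""
    def words(s: str) -> set:
        return {s[i:i + 3] for i in range(len(s) - 2)}
    q, t = words(query), words(target)
    return len(q & t) / max(len(q | t), 1)


def refine_score(query: str, target: str) -> float:
    """Costlier stand-in for an accurate aligner (hypothetical, not FAST/TM-align)."""
    return SequenceMatcher(None, query, target).ratio()


def two_stage_search(query, database, shortlist_size=3, workers=4):
    """Rank the whole database cheaply, then re-rank only the shortlist."""
    # Stage 1: score every entry with the cheap measure and keep the best few.
    shortlist = sorted(
        database,
        key=lambda name: coarse_score(query, database[name]),
        reverse=True,
    )[:shortlist_size]
    # Stage 2: re-score the shortlist in parallel with the expensive measure.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda name: refine_score(query, database[name]),
                               shortlist))
    return sorted(zip(shortlist, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy "structural strings"; the entry names are made up.
    toy_database = {
        "entry_a": "HHHHEEEELLLL",
        "entry_b": "HHHHEEEELLLH",
        "entry_c": "LLLLLLLLLLLL",
        "entry_d": "EEEEHHHHLLLL",
    }
    for name, score in two_stage_search("HHHHEEEELLLL", toy_database):
        print(name, round(score, 3))
```

    The point of the design is that the expensive scorer runs only on the shortlist, so the total cost is dominated by the cheap scan while the final ordering comes from the more accurate method.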


  • CPSARST

    Circular permutation of a protein can be visualized as if the original amino and carboxyl termini were linked and new ones created elsewhere. It is well documented that circular permutants usually retain their native structures and biological functions. Here we report CPSARST (Circular Permutation Search Aided by Ramachandran Sequential Transformation), an efficient database search tool for circular permutants. In this post-genomic era, when the amount of protein structural data is increasing exponentially, it provides a new way to rapidly detect novel relationships among proteins.
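
    The "link the termini and cut elsewhere" picture corresponds to rotating a linear string. The sketch below is a toy illustration on exact strings: the function names are hypothetical, and the string-doubling test is a generic technique for detecting rotations, not a description of how CPSARST itself is implemented (a real search must work on structural data and tolerate mismatches).

```python
from typing import Optional


def circular_permutant(sequence: str, new_start: int) -> str:
    """Join the termini and reopen the chain so that position new_start
    (0-based) becomes the new first residue."""
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must be a valid index into the sequence")
    return sequence[new_start:] + sequence[:new_start]


def find_permutation_site(a: str, b: str) -> Optional[int]:
    """Return the index in `a` where `b` starts if `b` is an exact circular
    permutant of `a`; return None otherwise."""
    if len(a) != len(b):
        return None
    # Every rotation of `a` appears as a substring of `a + a`.
    index = (a + a).find(b)
    return index if index != -1 else None


if __name__ == "__main__":
    original = "ABCDEFG"
    permuted = circular_permutant(original, 3)
    print(permuted)
    print(find_permutation_site(original, permuted))
```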


  • DSSARST

    Three-dimensional domain swapping (DS), a structural phenomenon first clearly described in the mid-1990s, is a mechanism of protein oligomerization in which two or more protein chains exchange identical parts of their structures to form intertwined oligomers. Studying DS can help reveal the functional regulation, molecular evolution and structural dynamics of proteins, and can support applications such as creating artificial biopolymers and treating deposition diseases such as Alzheimer's disease and bovine spongiform encephalopathy. Despite the increasing interest in DS, related bioinformatics methods are scarce, and a comprehensive database for studying DS is still lacking. The domains of a DS protein are connected by a hinge loop. Hinge loops have been proposed to be the causes and/or the results of DS; however, precise methods for identifying them are not readily available. We have previously developed a rapid protein structural similarity search method, SARST, and an accurate method, ADiDoS, for detecting DS relationships between proteins. In this proposal, we aim to improve and combine them into the first database search system for DS-related proteins. This system, named DSSARST (3D Domain Swapping Search Aided by Ramachandran Sequential Transformation), will enable large-scale studies of this interesting phenomenon and facilitate its application in biotechnology.

  • CPred

    Circular permutation (CP) is a protein structural rearrangement through which nature allows structural homologs to have termini at different locations and thus varied activities, stabilities and functional properties. It can be applied in many fields of protein research and bioengineering. The application of CP is limited by its technical complexity, its high cost and the uncertain viability of the resulting protein variants. Not every position in a protein can be used to create a viable circular permutant, yet practical computational tools for evaluating the positional feasibility of CP before costly experiments are carried out are still lacking. We have previously designed a comprehensive method for predicting viable CP cleavage sites in proteins. In this work, we implement that method as an efficient and user-friendly web server named CPred (CP site predictor), which is intended to support fundamental research on CP and its biotechnological applications.

  • CPDB

    CPDB, or Circular Permutation DataBase, is the first large-scale structure database for circularly permuted proteins. CPDB is organized as a hierarchical categorization in which pairs of circular permutants are grouped into CP clusters, which are further grouped into folds and, in turn, classes. CPDB also includes a set of tools and resources for the identification, characterization, comparison and visualization of CP. In addition, several viable CP site prediction methods are implemented and assessed in CPDB. This database can be useful for studying protein folding and evolution, discovering novel protein structural and functional relationships, and facilitating the production of new circular permutants of biotechnological or industrial value.
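
    The hierarchy described above (class, then fold, then CP cluster, then permutant pairs) maps naturally onto a nested dictionary. The following is only a sketch of that data shape: the record layout, the function names and all labels such as "class_A" or "protein_1" are placeholders invented for illustration, not actual CPDB entries or its schema.

```python
from collections import defaultdict


def build_hierarchy(records):
    """Group permutant pairs as class -> fold -> CP cluster -> list of pairs.

    Each record is (class_name, fold_name, cluster_id, (protein_a, protein_b)).
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for class_name, fold_name, cluster_id, pair in records:
        tree[class_name][fold_name][cluster_id].append(pair)
    return tree


def count_pairs(tree) -> int:
    """Total number of permutant pairs stored anywhere in the hierarchy."""
    return sum(
        len(pairs)
        for folds in tree.values()
        for clusters in folds.values()
        for pairs in clusters.values()
    )


if __name__ == "__main__":
    # Placeholder records; none of these names come from CPDB.
    records = [
        ("class_A", "fold_1", "cluster_1", ("protein_1", "protein_2")),
        ("class_A", "fold_1", "cluster_1", ("protein_3", "protein_4")),
        ("class_A", "fold_2", "cluster_2", ("protein_5", "protein_6")),
    ]
    hierarchy = build_hierarchy(records)
    print(count_pairs(hierarchy))
    print(sorted(hierarchy["class_A"]))
```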