Predicting binding sites from a protein's sequence could have a high impact on life science research, provided the predictions are specific and accurate enough to help address relevant biological questions. In CAMEO we plan to continuously assess ligand binding site predictions to evaluate the current state of the art of prediction methods, identify possible bottlenecks, and further stimulate the development of new methods.
In previous CASP experiments, the very low number of challenging target structures with relevant ligands has been a major limitation of the assessment, as it did not allow significant conclusions to be drawn about the specific strengths and weaknesses of different prediction methods. Further, the current ligand binding site prediction format used in CASP has a number of limitations: all ligands are treated uniformly, independent of their chemical type, and all potential binding sites are treated uniformly, independent of their affinity for different ligands. Hence, in CAMEO we have modified the ligand binding site prediction format to allow more fine-grained predictions and a more detailed assessment.
This new format consists of three sections separated by the "|" symbol:
Format details:
r=<resname>; n=<resnum>; [c=<chainname>;] [a=<atomname>;] | I=<ion prob>; O=<org prob>; N=<nucl prob>; P=<pep prob> | [<compound ID1>=<compound prob>;] [<compound ID2>=<compound prob>;] ...
r=SER; n=198; | I=0.000; O=0.000; N=0.000; P=0.000; |
r=GLU; n=199; | I=1.000; O=0.000; N=0.000; P=0.000; |
r=GLY; n=200; | I=0.000; O=0.000; N=0.000; P=0.000; |
r=ALA; n=201; | I=0.513; O=0.000; N=0.000; P=0.000; |

r=GLU; n=170; a=N; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=GLU; n=170; a=CA; | I=0.047; O=0.412; N=0.000; P=0.000; | ANP=0.412; MN=0.047;
r=GLU; n=170; a=C; | I=0.337; O=0.668; N=0.000; P=0.000; | ANP=0.668; MN=0.337;
r=GLU; n=170; a=O; | I=0.372; O=1.000; N=0.000; P=0.000; | ANP=1.000; MN=0.372;
r=GLU; n=170; a=CB; | I=0.249; O=0.424; N=0.000; P=0.000; | ANP=0.424; MN=0.249;
r=GLU; n=170; a=CG; | I=0.000; O=0.077; N=0.000; P=0.000; | ANP=0.077; MN=0.000;
r=GLU; n=170; a=CD; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=GLU; n=170; a=OE1; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=GLU; n=170; a=OE2; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=ASN; n=171; a=N; | I=0.331; O=0.353; N=0.000; P=0.000; | ANP=0.353; MN=0.331;
r=ASN; n=171; a=CA; | I=0.401; O=0.307; N=0.000; P=0.000; | ANP=0.307; MN=0.401;
r=ASN; n=171; a=C; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=ASN; n=171; a=O; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=ASN; n=171; a=CB; | I=0.528; O=0.251; N=0.000; P=0.000; | ANP=0.251; MN=0.528;
r=ASN; n=171; a=CG; | I=0.987; O=0.584; N=0.000; P=0.000; | ANP=0.584; MN=0.987;
r=ASN; n=171; a=OD1; | I=1.000; O=0.939; N=0.000; P=0.000; | ANP=0.939; MN=1.000;
r=ASN; n=171; a=ND2; | I=0.859; O=0.637; N=0.000; P=0.000; | ANP=0.637; MN=0.859;
r=LEU; n=173; a=N; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=LEU; n=173; a=CA; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=LEU; n=173; a=C; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=LEU; n=173; a=O; | I=0.000; O=0.000; N=0.000; P=0.000; | ANP=0.000; MN=0.000;
r=LEU; n=173; a=CB; | I=0.000; O=0.175; N=0.000; P=0.000; | ANP=0.175; MN=0.000;
r=LEU; n=173; a=CG; | I=0.000; O=0.509; N=0.000; P=0.000; | ANP=0.509; MN=0.000;
r=LEU; n=173; a=CD1; | I=0.000; O=0.898; N=0.000; P=0.000; | ANP=0.898; MN=0.000;
r=LEU; n=173; a=CD2; | I=0.000; O=0.696; N=0.000; P=0.000; | ANP=0.696; MN=0.000;
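
To illustrate how a record in this format might be read programmatically, the following is a minimal Python sketch and not an official CAMEO parser. It assumes one record per string, residue numbers that are plain integers, and that every entry is a "key=value" pair terminated by ";". The names BindingRecord and parse_record are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BindingRecord:
    """One prediction record (hypothetical container, not part of CAMEO)."""
    resname: str
    resnum: int                      # assumption: integer residue numbers
    chain: Optional[str] = None      # optional "c=" entry
    atom: Optional[str] = None       # optional "a=" entry
    class_probs: Dict[str, float] = field(default_factory=dict)     # I, O, N, P
    compound_probs: Dict[str, float] = field(default_factory=dict)  # e.g. ANP, MN


def _parse_pairs(section: str) -> Dict[str, str]:
    """Split a 'key=value; key=value;' section into a dict of strings."""
    pairs = {}
    for item in section.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def parse_record(line: str) -> BindingRecord:
    """Parse one record consisting of up to three '|'-separated sections."""
    sections = line.split("|")
    if len(sections) < 2:
        raise ValueError("a record needs at least two '|'-separated sections")
    ident = _parse_pairs(sections[0])
    classes = _parse_pairs(sections[1])
    # The third section (per-compound probabilities) may be empty or absent.
    compounds = _parse_pairs(sections[2]) if len(sections) > 2 else {}
    return BindingRecord(
        resname=ident["r"],
        resnum=int(ident["n"]),
        chain=ident.get("c"),
        atom=ident.get("a"),
        class_probs={k: float(v) for k, v in classes.items()},
        compound_probs={k: float(v) for k, v in compounds.items()},
    )


if __name__ == "__main__":
    rec = parse_record(
        "r=GLU; n=170; a=CA; | I=0.047; O=0.412; N=0.000; P=0.000; "
        "| ANP=0.412; MN=0.047;"
    )
    print(rec.resname, rec.resnum, rec.atom, rec.compound_probs)
```

Called on a residue-level record such as "r=GLU; n=199; | I=1.000; O=0.000; N=0.000; P=0.000; |", the same function leaves atom and chain unset and returns an empty compound dictionary, so both levels of detail can share one code path.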