Construction of Comprehensive Protein-Analysis Database and its Applications

Genomic Sciences Center, RIKEN Yokohama Institute
○Ryogo Akasaka Machiko Hirafuji-Yamaguchi Akiko Urushibata Kazutaka Murayama Chie Takemoto-Hori Noboru Ohsawa Mutsuko Kukimoto-Niino Kanna Motoyama Mari Aoki Takaho Terada Mikako Shirouzu Shigeyuki Yokoyama

A comprehensive protein-analysis database is essential for high-throughput protein analysis, because the volume of measurement data grows so large that it becomes difficult to locate the data one needs. Such a database does more than gather data: it lets researchers find the data they need immediately and easily extract statistics, for instance on protein crystallization conditions.
We constructed a protein-analysis database for managing and examining the large amount of measurement data obtained for proteins. The results were recorded by the researchers who examined each protein with analytical systems such as mass spectrometry (MALDI-TOF and Q-TOF), dynamic/static light scattering (DLS/SLS), and electrophoresis.
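As an illustration only, a record layout for such a database might be sketched as follows. The table names, column names, and sample values are hypothetical and are not taken from the actual database; the sketch merely shows how measurements from several analytical methods could be stored per protein and retrieved together.

```python
import sqlite3

# Hypothetical schema: names and columns are assumptions for illustration,
# not the layout of the database described in this abstract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (
    protein_id TEXT PRIMARY KEY
);
CREATE TABLE measurement (
    id         INTEGER PRIMARY KEY,
    protein_id TEXT NOT NULL REFERENCES protein(protein_id),
    method     TEXT NOT NULL,   -- e.g. 'MALDI-TOF', 'Q-TOF', 'DLS', 'SLS', 'Native-PAGE'
    researcher TEXT NOT NULL,   -- who recorded the result
    result     TEXT NOT NULL    -- free-text or serialized measurement value
);
""")

# Made-up example records for a single protein.
conn.execute("INSERT INTO protein VALUES (?)", ("P1",))
conn.executemany(
    "INSERT INTO measurement (protein_id, method, researcher, result) "
    "VALUES (?, ?, ?, ?)",
    [
        ("P1", "DLS", "researcher_a", "polydispersity=0.12"),
        ("P1", "Native-PAGE", "researcher_b", "single band"),
    ],
)

# Retrieve every recorded measurement for one protein.
rows = conn.execute(
    "SELECT method, result FROM measurement WHERE protein_id = ? ORDER BY method",
    ("P1",),
).fetchall()
```

Keeping one row per measurement, rather than one column per method, is one possible design choice that lets new analytical methods be added without changing the schema.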
To investigate protein crystallization conditions, we considered DLS and Native-PAGE data, both of which have been used to evaluate protein samples in solution. In general, a sample whose DLS polydispersity is very high is difficult to crystallize. Likewise, samples showing contamination and/or aggregation, detected in electrophoresis as sub-bands, are not suitable for crystallization. DLS and Native-PAGE analyses can therefore be used to judge whether a sample is of sufficient quality for crystallization. Using these data, we examined the efficiency of crystallization. As a result, we found a close relationship between crystallization conditions and the analytical values (the DLS polydispersity and the presence of contamination/aggregation in the samples).
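The kind of analysis described above can be sketched as a grouping of samples by their quality indicators, followed by a crystallization success rate per group. This is a minimal sketch under stated assumptions: the polydispersity cutoff, the field names, and all sample records are hypothetical, since no numeric values are given here.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical cutoff separating "low" from "high" polydispersity;
# no numeric threshold is given in this abstract.
POLYDISPERSITY_CUTOFF = 0.3


@dataclass
class SampleRecord:
    protein_id: str
    polydispersity: float  # from DLS (illustrative scale)
    has_sub_bands: bool    # sub-bands seen in Native-PAGE
    crystallized: bool     # outcome of the crystallization trial


def success_rate_by_quality(records):
    """Map (high_polydispersity, has_sub_bands) to the fraction crystallized."""
    counts = defaultdict(lambda: [0, 0])  # [crystallized, total]
    for r in records:
        key = (r.polydispersity > POLYDISPERSITY_CUTOFF, r.has_sub_bands)
        counts[key][0] += int(r.crystallized)
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Made-up records, for illustration only.
records = [
    SampleRecord("P1", 0.10, False, True),
    SampleRecord("P2", 0.15, False, True),
    SampleRecord("P3", 0.20, True, False),
    SampleRecord("P4", 0.45, False, False),
    SampleRecord("P5", 0.50, True, False),
    SampleRecord("P6", 0.12, False, False),
]

rates = success_rate_by_quality(records)
for (high_pd, sub_bands), rate in sorted(rates.items()):
    print(f"high polydispersity={high_pd}, sub-bands={sub_bands}: {rate:.2f}")
```

Comparing the rate for the group with low polydispersity and no sub-bands against the other groups gives a simple first view of how sample quality relates to crystallization success.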
As a new application of this crystallization database, our co-workers have already applied it to the prediction of crystallization conditions and reported the results at CSJ2005 (F. Konishi et al.). We hope that the database will serve not only as a data platform but also as a new data-mining tool.