SysInformatics

Recall that the SPT Founder/Author trained as a cell and molecular biologist and spent his career in that practical enterprise, even serving as a Biology Dept. Chair for a period. So it is not surprising that, just as Biomimicry becomes Systems Mimicry and Bioallometry becomes Systems Allometry, BioInformatics becomes SysInformatics. Just a simple confession.

Biology, especially in the genomics age, now accumulates terabytes of gene sequence data per day. Thousands of entire genomes have been published. This has greatly stimulated progress in making such vast amounts of data manageable for human beings. It has also necessitated an explosion of interdisciplinary work between computer science and biology.

Like the explosion of data in bio-genomics (and one should not ignore the similar but less manageable volume of data arising from ecosystem studies), there is a comparable explosion in data on the systems-level view of all of the entities in the universe. Here is a slide estimating the amount of data in the SPT alone:

Bioinformatics enjoyed immediate success (recognition, funding, jobs, innovation) because it answered a pressing need. Biology, from molecular to physiological to genetic to neuroscience domains, erupted with vast quantities of data. Would the products of SP3 lead to significantly increased data? Only a gross estimate can be provided at this early date. We start with over 115 candidate systems processes in our SP3 study, each needing 26 categories of information. While some of these categories have only a dozen initial entries in them, others have thousands of items, which amounts to more than a quarter million items to keep track of and apply for the systems processes alone. We anticipate that computerized repositories will also include hundreds of Linkage Propositions (LPs). Each LP will require many dozens of independent citations from the peer-reviewed natural science literature. Combined with the SP information load, the entity density then exceeds a billion items. This exceeds human brain capacity and shows the need to manage and curate a vast database.

This emerging challenge is not new. It is part of the expected natural evolution of sciences that study complex systems. Astronomy, space science, and physics faced it long ago, with vast streams of data being produced by probes, telescopes, and sub-atomic particle accelerators. How do you make such immense data sets or knowledge bases manageable for humans? How do you find patterns within them? That is the goal of sysinformatics.
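The systems-processes part of the gross estimate above can be checked with simple arithmetic. The short Python sketch below is only an illustration. It takes the two counts given in the text (115 systems processes, 26 categories each) and works out how many entries each category would need to hold on average to reach a quarter million items. The variable names are hypothetical, and the resulting average is derived here, not a figure from the SP3 study. The sketch does not attempt the Linkage Proposition or billion-item figures, because the text gives no exact counts for them.

```python
import math

# Counts taken from the text.
SYSTEMS_PROCESSES = 115        # "over 115 candidate systems processes"
CATEGORIES_PER_PROCESS = 26    # "26 categories of information"
TARGET_ITEMS = 250_000         # "a quarter million items"

# Number of (process, category) slots that must be filled and curated.
category_slots = SYSTEMS_PROCESSES * CATEGORIES_PER_PROCESS

# Smallest whole-number average of entries per slot that reaches the target.
# This is a derived illustration, not a number reported by the study.
min_avg_entries = math.ceil(TARGET_ITEMS / category_slots)

print(f"category slots: {category_slots}")
print(f"average entries per slot needed: {min_avg_entries}")
```

Since the text says some categories hold only a dozen entries while others hold thousands, an average of this size across all slots is at least consistent with the stated range.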