’Omics, Bioinformatics, Computational Biology

Last updated: July 3, 2014

This section describes emerging technologies for understanding the behavior of cells, tissues, organs, and the whole organism at the molecular level, using methods such as genomics, proteomics, systems biology, and bioinformatics, together with the computational tools needed to analyze and make sense of the data. These technologies have the potential to facilitate the development of a predictive toxicology based on models built with existing in vivo data (animal and human), as well as new and existing in vitro and in silico data.

-Omics

Technologies that measure some characteristic of a large family of cellular molecules, such as genes, proteins, or small metabolites, have been named by appending the suffix “-omics,” as in “genomics.” Omics refers to the collective technologies used to explore the roles, relationships, and actions of the various types of molecules that make up the cells of an organism.

These technologies include:

  • Genomics, “the study of genes and their function” (Human Genome Project (HGP), 2003)
  • Proteomics, the study of proteins
  • Metabolomics (also called metabonomics), the study of the small molecules involved in cellular metabolism
  • Transcriptomics, the study of mRNA transcripts
  • Glycomics, the study of cellular carbohydrates
  • Lipidomics, the study of cellular lipids

Omics technologies provide the tools needed to look at the differences in DNA, RNA, proteins, and other cellular molecules between species and among individuals of a species. These types of molecular profiles can vary with cell or tissue exposure to chemicals or drugs and thus have potential use in toxicological assessments. Omics experiments can often be conducted in high-throughput assays that produce tremendous amounts of data on the functional and/or structural alterations within the cell. “These new methods have already facilitated significant advances in our understanding of the molecular responses to cell and tissue damage, and of perturbations in functional cellular systems” (Aardema & MacGregor, 2002).

The -omics technologies will continue to contribute to our understanding of toxicity mechanisms. Regulators are interested in these new technologies but are still sorting out how to incorporate the new information and technologies in regulatory decision making. For example, the US Food and Drug Administration’s Pharmacogenomic Data Submissions guidance document encourages the voluntary submission of genomics data but notes that the field of pharmacogenomics is still in its early developmental stages.

Bioinformatics

Bioinformatics is “the science of managing and analyzing biological data using advanced computing techniques” (HGP, 2003). Bioinformatics tools are computational tools that mine information from large databases of biological data. They are most commonly used to analyze large sets of genomics data, but bioinformatics tools are also being developed for other types of biological data, such as proteomics data.

The US National Center for Biotechnology Information (NCBI) serves as an integrated source of genomics information and bioinformatics tools for researchers. An important bioinformatics tool available at NCBI for proteomics and genomics is the Basic Local Alignment Search Tool (BLAST), which compares gene or protein sequences against databases that contain many archived sequences, in order to find regions of local similarity. The statistical significance of the sequence matches is then calculated, and the results can be used to infer functional and evolutionary relationships.
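
For readers who want to try this, the short Python sketch below submits a nucleotide query to NCBI’s web BLAST service using the Biopython wrapper; the query sequence is an invented placeholder, and real searches are subject to NCBI usage limits.

```python
# A minimal sketch of a BLAST query using Biopython (pip install biopython).
# The query sequence is an invented placeholder; real searches go over the
# network to NCBI and are rate-limited.
from Bio.Blast import NCBIWWW, NCBIXML

query = "AGCTAGCTAGGATCGATCGATCGTAGCTAGCTAGCATCGATCGATTAGC"

# Submit the query to the "nt" nucleotide database using the blastn program.
result_handle = NCBIWWW.qblast("blastn", "nt", query)

# Parse the XML results and report the top-scoring alignments with their
# E-values, the statistical significance measure mentioned above.
record = NCBIXML.read(result_handle)
for alignment in record.alignments[:5]:
    best_hsp = alignment.hsps[0]
    print(f"{alignment.title[:60]}  E-value: {best_hsp.expect:.2e}")
```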

Computational Biology

Bioinformatics and databases of biological information can be used to generate “maps” of cellular and physiological pathways and responses. This integrative approach is called computational biology. “Bioinformatics is used to abstract knowledge and principles from large-scale data, to present a complete representation of the cell and the organism, and to predict computationally systems of higher complexity, such as the interaction networks in cellular processes and the phenotypes of whole organisms” (Bayat, 2002).

Systems Biology is the integration of data from all levels of complexity (genomics, proteomics, metabolomics, and other molecular measurements) using “advanced computational methods to study how networks of interacting biological components determine the properties and activities of living systems” (HGP, 2003). The goal is to create overall computational models of the functioning of the cell, of multicellular systems, and ultimately of the organism. These in silico models will provide virtual test systems for evaluating the toxic responses of cells, tissues, and organisms.

Compounds will be tested in simulation studies before being applied to cells and tissues to obtain comparative results and validation of the system.
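
The sketch below is a toy illustration of what such a virtual test system might look like: a one-compartment model of a compound’s elimination coupled to an indirect-response model in which the compound suppresses synthesis of a protective metabolite. All species and rate constants are invented for illustration; a real systems biology model would be built and parameterized from experimental data.

```python
# A toy "virtual test system": one-compartment elimination of a compound
# coupled to an indirect-response model in which the compound suppresses
# synthesis of a protective metabolite. All names and rate constants are
# invented for illustration.
import numpy as np
from scipy.integrate import odeint

def model(y, t, ke, kin, kout, emax, ec50):
    compound, metabolite = y
    d_compound = -ke * compound                       # first-order elimination
    inhibition = emax * compound / (ec50 + compound)  # saturable suppression of synthesis
    d_metabolite = kin * (1 - inhibition) - kout * metabolite
    return [d_compound, d_metabolite]

t = np.linspace(0, 48, 200)   # hours
y0 = [10.0, 1.0]              # initial compound level; metabolite at baseline
sol = odeint(model, y0, t, args=(0.1, 0.5, 0.5, 0.9, 2.0))
print(f"Metabolite nadir: {sol[:, 1].min():.2f} (baseline 1.0)")
```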

Useful Concepts and Terms

Toxicity Pathways: “Today, cell biologists working in many fields are enhancing knowledge of cellular-response networks and elucidating the manner in which environmental agents perturb pathways to cause changes in cell behaviors. The NRC report defined toxicity pathways as biologic pathways that, when sufficiently perturbed, can lead to adverse health outcomes. Despite this new terminology, toxicity pathways are actually normal cellular-response pathways that can be targeted by environmental agents. A parallel exists in the field of carcinogenesis, in which genes that code for proteins involved in cell growth are designated as oncogenes or tumor suppression genes” (Andersen et al., 2008).

Adverse Outcome Pathway (AOP): The term AOP was developed as a framework for translating the mechanistic information derived from molecular, biochemical, and computational studies into endpoints that can be used to support chemical risk assessments. “An AOP is a conceptual construct that portrays existing knowledge concerning the linkage between a direct molecular initiating event and an adverse outcome at a biological level of organization relevant to risk assessment” (Ankley et al., 2010).

Genomics: The first of the -omics technologies to be developed, genomics has generated massive amounts of DNA sequence data, requiring vast amounts of computing capacity. Genomics has progressed beyond the sequencing of organisms (structural genomics) to identifying the function of the encoded genes (functional genomics).

The genome of each species is distinctive, but smaller genomic differences are also observed among the individuals of a species. It was originally thought that obtaining the sequence of the human genome would immediately tell us the identity of the human genes. The genome has proved to be much more complex.

When a gene is expressed, it results in the production of a messenger RNA and ultimately a particular protein. Gene expression is not fully understood, but it involves regulatory sequences within the DNA and the binding of specific regulatory proteins to these sequences. The expression and regulation of the regulatory proteins themselves add another level of control. Whether a particular gene is expressed in an organism can be influenced by various genetic and environmental factors.

The DNA sequences of a gene that code for a protein are called exons, and they are interspersed with DNA sequences called introns, which do not code for proteins. The intron sequences, previously thought to be nonsense material, are now known to also contain important information. Although the sequencing of the human genome was completed in 2003 (HGP, 2003), the identification of all of the genes within the human DNA sequence is not complete. Locating the beginnings and ends of genes within the DNA remains a challenge.
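
The Python sketch below conveys the flavor of this problem: a naive scan for open reading frames, that is, an ATG start codon followed in-frame by a stop codon. Real gene finders must also cope with introns, regulatory signals, and both DNA strands; the sequence here is invented.

```python
# A deliberately simplified illustration of gene finding: scan an invented DNA
# string for open reading frames (an ATG start codon followed in-frame by a
# stop codon). Real gene finders must also handle introns, regulatory signals,
# and both strands.
dna = "CCATGGCCATTGTAATGGGCCGCTGACCCATGAAATAGGG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

for frame in range(3):
    start = None
    codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
    for idx, codon in enumerate(codons):
        if codon == "ATG" and start is None:
            start = idx                      # remember the first in-frame start
        elif codon in STOP_CODONS and start is not None:
            begin, end = frame + 3 * start, frame + 3 * idx + 2
            print(f"Candidate ORF in frame {frame}: bases {begin}-{end}")
            start = None
```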

Gene annotation is “adding pertinent information such as gene coded for, amino acid sequence or other commentary to the database entry of raw sequence of DNA bases” (HGP, 2003). This involves describing different regions of the code, identifying which regions can be called genes, and identifying other features such as exons and introns, start and stop codons, and so on.
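
Annotation tools make these features programmatically accessible. Below is a minimal sketch using Biopython to read an annotated GenBank record; the file name is a placeholder for any record downloaded from NCBI.

```python
# A sketch of reading gene annotation with Biopython. The file name is a
# placeholder for any annotated GenBank record downloaded from NCBI.
from Bio import SeqIO

record = SeqIO.read("example_gene.gb", "genbank")
print(record.id, record.description)

# Each feature marks an annotated region of the sequence: gene boundaries,
# exons, coding sequences (CDS), and so on.
for feature in record.features:
    if feature.type in ("gene", "exon", "CDS"):
        name = feature.qualifiers.get("gene", ["?"])[0]
        print(f"{feature.type:5s} {name:10s} {feature.location}")
```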

For more background information on DNA, genes, and genome sequencing, take a look at the online book What’s a Genome?

Epigenetics: Epigenetics refers to mechanisms that persistently alter gene expression without actual changes to the gene/DNA sequence. DNA methylation is an example of an epigenetic mechanism. Scientists have shown that DNA methylation is an important component in a variety of chemical-induced toxicities, including carcinogenicity, and is a mechanism that should be assessed in the overall hazard assessment (Watson & Goodman, 2002; Moggs et al., 2004).
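
As a small numerical illustration, the per-site “beta value” commonly used to summarize DNA methylation is simply the methylated signal divided by the total signal; the read counts below are invented.

```python
# Illustration only: the "beta value" commonly used to summarize DNA
# methylation at a CpG site is methylated signal / total signal. All read
# counts below are invented.
import numpy as np

methylated = np.array([90, 15, 60, 5])     # methylated read counts per CpG site
unmethylated = np.array([10, 85, 40, 95])  # unmethylated read counts

beta = methylated / (methylated + unmethylated)
for i, b in enumerate(beta, start=1):
    print(f"CpG site {i}: {b:.0%} methylated")
```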

Proteomics: Proteins, the primary structural and functional molecules in the cell, are made up of a linear arrangement of amino acids. The linear polypeptide chains are folded into secondary and tertiary structures to form the functional protein. Unlike the cell’s genes, which are static, proteins are constantly changing to meet the needs of the cell.

Characterizing the identity, function, regulation, and interaction of all of the cellular proteins of an organism, the proteome, will be a major achievement. Studies comparing the proteomes of cells and tissues exposed to toxic materials with those of normal cells are being used to develop an understanding of the mechanisms of toxicity. As proteomics tools become more powerful and widely used, protein and proteome changes in response to exposures to toxic substances (fingerprints or response profiles) will be compiled into databases that can be used to classify exposure responses at various levels of organization of the organism, thus providing a predictive in silico toxicology tool.

Metabolomics: Metabolomics refers to the comprehensive evaluation of the metabolic state of a cell, organ or organism, in order to identify biochemical changes that are characteristic of specific disease states or toxic insults. Typical metabolomics experiments involve the identification and quantitation of large numbers of endogenous molecules in a biological sample (e.g., urine or blood) using chemical techniques such as chromatography and mass spectrometry. The output from these techniques is compared to computerized libraries of mass spectrometry tracings to facilitate identification of the compounds that are present. Environmental stresses such as exposure to chemicals or drugs alter the metabolic pathways in cells, and metabolite profiling can be used to assess toxic responses/exposures.
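
A minimal sketch of the library-matching step follows, assuming spectra have already been binned onto a shared m/z axis so that each can be treated as an intensity vector and compared by cosine similarity. All peak intensities and library entries are invented.

```python
# A minimal sketch of spectral library matching, assuming spectra have
# already been binned onto a shared m/z axis so each can be treated as an
# intensity vector. All peak intensities and library entries are invented.
import numpy as np

def cosine_score(a, b):
    """Cosine similarity between two intensity vectors (1.0 = identical shape)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

measured = np.array([0.0, 0.8, 0.1, 0.0, 0.5, 0.2])
library = {
    "citrate": np.array([0.0, 0.9, 0.1, 0.0, 0.4, 0.1]),
    "lactate": np.array([0.7, 0.0, 0.0, 0.6, 0.0, 0.3]),
}

best = max(library, key=lambda name: cosine_score(measured, library[name]))
print(f"Best match: {best} (score {cosine_score(measured, library[best]):.3f})")
```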

Biomarkers: Broadly defined, biomarkers are “characteristics [typically a biomolecule(s)] that can be objectively measured and evaluated as an indicator of normal biologic or pathogenic processes or pharmacological responses to a therapeutic intervention” (Cummins, 2007). Animal models are still commonly used to look for biomarkers relevant to human drug development, toxicity responses, and disease processes. To develop useful human biomarkers for toxicity, cell and tissue models that can express known biomarkers of toxicity need to be developed and validated against clinical samples. One challenge with toxicity biomarkers is that humans cannot be purposefully exposed to toxic materials to obtain clinical samples.

Relation of DNA (genes) to Proteins: Each gene is a linear stretch of DNA nucleotides that codes for the assembly of amino acids into a polypeptide chain (protein). DNA is transcribed into messenger RNA (mRNA) (transcription), which is then translated by the ribosomes into the amino acid chains that will make up the protein (translation).

Mutations are changes in DNA bases (substitutions, insertions, deletions, translocations) that may result in changes to the proteins that are synthesized, or may even prevent their synthesis. Chemicals that are mutagens can cause permanent, heritable changes in the DNA sequence.
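
Biopython’s Seq object can make both of these ideas concrete: the sketch below transcribes and translates a short invented coding sequence, then shows how a single-base substitution changes one amino acid in the resulting protein.

```python
# Sketch of the DNA -> mRNA -> protein flow using Biopython's Seq object, and
# of how a single-base substitution changes the protein. The coding sequence
# is invented.
from Bio.Seq import Seq

gene = Seq("ATGGCCATTGTAATGGGCCGCTGA")  # coding strand, ending in a stop codon
mrna = gene.transcribe()                # transcription: T is replaced by U
protein = gene.translate()              # translation via the standard genetic code
print(mrna)                             # AUGGCCAUUGUAAUGGGCCGCUGA
print(protein)                          # MAIVMGR* ("*" marks the stop codon)

# A point mutation in the second codon (GCC -> GTC) swaps alanine for valine.
mutant = Seq("ATGGTCATTGTAATGGGCCGCTGA")
print(mutant.translate())               # MVIVMGR*
```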

Regulation of Gene Expression: Some proteins are constitutively expressed (present all of the time), but cells can regulate the expression of proteins that are not needed all of the time or in large amounts. This provides cells with control mechanisms for turning metabolic reactions on and off. Cells use a variety of mechanisms to regulate gene expression, and thus which proteins are produced. Proteins can be controlled or regulated at the level of their synthesis (regulation of gene transcription), gene translation, various post-translational mechanisms and feedback inhibition, or the more recently discovered actions of RNAi and microRNAs.

Short interfering RNAs (siRNAs) are short double-stranded RNAs (dsRNAs) that can regulate gene expression. In eukaryotic cells, the enzyme Dicer produces siRNAs by cleaving longer dsRNAs. An siRNA can bind to its complementary messenger RNA (mRNA) and inhibit translation and/or induce the cell to destroy the mRNA. The phenomenon is called RNA interference (RNAi), and it can be used in the lab to inhibit virtually any gene in any kind of cell (Dove, 2007). “RNA interference has re-energized the field of functional genomics by enabling genome-scale loss-of-function screens in cultured cells” (Echeverri & Perrimon, 2006).
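
A minimal sketch of the base-pairing logic follows, with invented sequences: an siRNA guide strand targets any mRNA that contains its reverse complement.

```python
# An illustrative sketch of the base-pairing logic behind RNAi: an siRNA
# guide strand silences an mRNA containing its reverse complement. Both
# sequences are invented.
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")

def reverse_complement(rna: str) -> str:
    return rna.translate(RNA_COMPLEMENT)[::-1]

mrna = "AUGGCUAGCUAGGAUCCGAUCGUACGAUCGAUUAA"
guide = "AUCGGAUCCUAGCUAGC"  # siRNA guide strand (antisense to its target site)

site = reverse_complement(guide)
pos = mrna.find(site)
print(f"Guide pairs with the mRNA at position {pos}" if pos >= 0 else "No target site")
```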

MicroRNAs (miRNAs) are a recently discovered class of small non-coding RNAs. Cells use miRNAs to regulate the amount of protein synthesized from a gene through the mechanisms of translational inhibition and mRNA destabilization (Bushati & Cohen, 2007). Over 250 miRNAs have been discovered.

Microarrays: Genomics and proteomics research has been advanced through the development of experimental techniques that increase throughput, such as microarrays. Microarrays consist of DNA or protein fragments placed as small spots onto a slide, which are then used as “miniaturized chemical reaction areas” (HGP, 2003). The studies typically involve looking for changes in gene or protein expression patterns by cells or tissues under different conditions. Microarrays provide a platform for evaluating the changes in many (usually thousands of) genes or proteins simultaneously.
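
Before expression patterns from different arrays can be compared, the raw intensities are usually normalized. The sketch below shows one common approach, quantile normalization, on an invented genes-by-samples matrix; it ignores tie handling and other refinements found in production implementations.

```python
# A sketch of quantile normalization on an invented genes-by-samples matrix.
# It forces every sample (column) to share the same intensity distribution so
# that expression patterns can be compared across arrays; production
# implementations also handle ties and missing values.
import numpy as np

def quantile_normalize(x):
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # per-column ranks
    means = np.sort(x, axis=0).mean(axis=1)            # mean value at each rank
    return means[ranks]                                # substitute rank means

expression = np.array([[5.0, 4.0, 3.0],
                       [2.0, 1.0, 4.0],
                       [3.0, 4.5, 6.0],
                       [4.0, 2.0, 8.0]])
print(quantile_normalize(expression))
```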

High Throughput Screening (HTS) consists of assays developed to produce and analyze many individual data points or results in one experiment. Assays using DNA or other microarrays or multiwell plates of cells that are processed using robotic systems are examples of HTS assays. The US National Toxicology Program identified HTS as an essential tool for screening the thousands of chemicals currently in the US marketplace for potential human toxicity.
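
Assay quality in HTS is often summarized with the Z′-factor, which measures how well the positive and negative control wells separate; by convention, values above about 0.5 indicate an excellent assay. The control-well signals in the sketch below are invented.

```python
# A sketch of the Z'-factor, a standard HTS assay-quality statistic:
#   Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
# Values above roughly 0.5 are conventionally taken to indicate an excellent
# assay. The control-well signals below are invented.
import numpy as np

pos = np.array([95.0, 98.0, 92.0, 97.0, 94.0])  # positive-control wells
neg = np.array([12.0, 10.0, 15.0, 11.0, 13.0])  # negative-control wells

z_prime = 1 - 3 * (pos.std(ddof=1) + neg.std(ddof=1)) / abs(pos.mean() - neg.mean())
print(f"Z'-factor: {z_prime:.2f}")
```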

Toxicogenomics compares the genes expressed in organisms that have been exposed to a drug, chemical, or toxin to those of unexposed organisms (negative controls). The up- or down-regulation of certain genes or groups of genes may be linked to toxic responses occurring in the organism, and to particular organs or cell types in that organism. The goal of toxicogenomics is to identify patterns of gene expression related to specific chemicals or chemical classes so that these expression patterns can be used as endpoints for assessing toxicity. Thus far, toxicogenomics has been useful in refining animal experiments and in identifying mechanisms of toxicity in lab animals, where exposures can be controlled. Experiments evaluating gene expression in cell cultures exposed to toxicants have also been used, in limited applications, to predict in vivo toxicity.
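
A toy version of the core computation follows, with invented log2 expression values: per-gene fold change plus a Welch’s t-test between exposed and control samples. Real toxicogenomics analyses involve thousands of genes and must also correct for multiple testing.

```python
# A toy version of the core toxicogenomics computation: per-gene log2 fold
# change plus Welch's t-test between exposed and control samples. Gene names
# and expression values are invented; real analyses involve thousands of
# genes and must correct for multiple testing (e.g., Benjamini-Hochberg).
import numpy as np
from scipy import stats

genes = ["CYP1A1", "GSTP1", "ACTB"]
control = np.array([[5.1, 4.9, 5.0],   # log2 expression, 3 control samples
                    [7.0, 7.2, 6.9],
                    [9.0, 9.1, 8.9]])
exposed = np.array([[8.2, 8.0, 8.4],   # log2 expression, 3 exposed samples
                    [6.1, 5.9, 6.0],
                    [9.0, 9.2, 8.8]])

for i, gene in enumerate(genes):
    log2_fc = exposed[i].mean() - control[i].mean()   # up- or down-regulation
    t_stat, p_value = stats.ttest_ind(exposed[i], control[i], equal_var=False)
    print(f"{gene}: log2FC = {log2_fc:+.2f}, p = {p_value:.3g}")
```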

Pharmacogenetics looks at the differences in response to a particular drug that are due to variations in the genetic makeup of individuals. For example, human genetic variation has been implicated in the variability of responses (effectiveness and/or toxicity) seen with some chemotherapeutic drugs (Crews, 2006; Hahn et al., 2006).

Author(s)/Contributor(s):
Sherry L. Ward, PhD, MBA
AltTox Contributing Editor

AltTox Editorial Board reviewer(s):
George Daston, PhD
Procter & Gamble