Non-test Approaches: (Q)SARs, Read-Across

Last updated: November 3, 2014


(Q)SARs are models aimed at predicting the physicochemical and biological properties of molecules. A structure-activity relationship (SAR) is a (qualitative) association between a chemical substructure and the potential of a chemical containing the substructure to exhibit a certain biological effect (e.g., (eco)toxicological effect). A (quantitative) structure-activity relationship ((Q)SAR) is a statistically established correlation relating one or more quantitative parameters, derived from chemical structure or determined by experimental chemistry, to a quantitative measure of biological activity. Expert systems are built upon experimental toxicity results with rules derived from the data. The rules may be based on statistical inference and take the form of (Q)SARs (e.g., TOPKAT), on expert judgment and take the form of SARs describing reactive chemistry (e.g., Derek for Windows), or they may be a hybrid of the two (e.g., TIMES). The benefits of using (Q)SAR approaches include their relatively low cost, speed, and potential to minimize animal testing.

The Organisation for Economic Co-operation and Development (OECD) has described (Q)SAR as “a quantitative (mathematical) relationship between a numerical measure of chemical structure, and/or a physicochemical property, and an effect/activity [that] often take[s] the form of regression equations, and can make predictions of effects/activities that are either on a continuous scale or on a categorical scale…. In many cases, (Q)SARs are quantitative models of key mechanistic processes which result in the measured activity of the chemicals” (OECD, 2007).

Uptake of (Q)SARs has remained largely limited to those models developed for properties such as aquatic toxicity, physicochemical properties, or environmental fate. Such (Q)SAR models have proved particularly useful in prioritizing chemicals. More recently, under regulatory programs such as REACH, they have been used extensively to fill data gaps for hazard characterization. (Q)SARs for human health effects remain best used as part of a weight-of-evidence assessment rather than as standalone replacements for animal tests (Patlewicz et al., 2011). This background article provides a brief introduction to (Q)SAR models developed for the assessment of human health toxicity endpoints. In addition, many of the Toxicity Endpoints & Tests sections of AltTox discuss endpoint-specific (Q)SAR applications.

In general, the process of (Q)SAR development may be described by a series of steps. A set of chemicals with corresponding biological activity data is collected. The chemicals are characterized by numerical representations called descriptors, and statistical techniques are then typically applied to derive an algorithm that relates the relevant chemical information to biological activity. Access to good quality data is a critical requirement for (Q)SAR development. As noted by Schultz and Seward (2000), the development of useful (Q)SARs for ecotoxicity endpoints resulted from the availability of sufficient data for the construction and validation of the computational models. (Q)SAR models for large-scale screening of chemicals and pharmaceuticals for mutagenic potential have also been developed, aided by the underlying microbial mutagenicity data (Contrera et al., 2005).
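The development steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single descriptor (log P), the activity values, and the function names are hypothetical and do not come from any published model, and ordinary least squares stands in for whatever statistical technique a real (Q)SAR would use.

```python
from statistics import mean


def fit_linear_qsar(descriptor, activity):
    """Ordinary least squares fit of activity = slope * descriptor + intercept."""
    x_bar, y_bar = mean(descriptor), mean(activity)
    sxx = sum((x - x_bar) ** 2 for x in descriptor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(descriptor, activity))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(descriptor, activity, slope, intercept):
    """Coefficient of determination of the fitted line on the training set."""
    y_bar = mean(activity)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(descriptor, activity))
    ss_tot = sum((y - y_bar) ** 2 for y in activity)
    return 1 - ss_res / ss_tot


# Step 1: collect chemicals with activity data (hypothetical values).
# Step 2: characterize each chemical by a descriptor (here, log P only).
log_p = [1.0, 2.0, 3.0, 4.0, 5.0]
log_inv_c = [0.9, 2.1, 2.9, 4.2, 4.9]  # e.g. log(1/C) for some endpoint

# Step 3: derive the relationship between descriptor and activity.
slope, intercept = fit_linear_qsar(log_p, log_inv_c)
fit_quality = r_squared(log_p, log_inv_c, slope, intercept)

# Step 4: predict the activity of an untested chemical from its descriptor.
predicted = slope * 2.5 + intercept
```

A fit on the training set alone says little about predictive value; real models are also checked against chemicals that were not used to build them, which is why the availability of good quality data matters so much.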

However, when it comes to developing databases of human toxicity endpoints for (Q)SAR models, the amount and quality of the data needed for model building are often insufficient. Collecting additional whole animal toxicity data is not always feasible or practical. Mechanistic differences between the test system and the human species are an additional factor to consider. Human data would be most useful but is often not available. Efforts directed toward model building based on molecular toxicological endpoints are now being explored as a promising way of providing a sufficient amount of quantifiable and reliable data for developing human-predictive models. Predictive models based on (Q)SARs that use these types of validated surrogate endpoints will also have to take account of biokinetic and metabolism effects (Schultz & Seward, 2000).

Skin sensitization is one human health-effect endpoint where (Q)SAR models have shown particular promise, in part due to the availability of good quality data. Gerberick et al. (2005) compiled a database of quality in vivo mouse local lymph node assay (LLNA) data on 211 chemicals for the purpose of accelerating the development and validation of new skin sensitization approaches. This was followed up with a second compilation by some of the same authors (Kern et al., 2011). Many of the existing (Q)SAR models for skin sensitization fall into one of two main categories: either they are local in nature, usually specific to a chemical class or chemical reaction mechanism, or they are global in form, derived empirically using statistical methods. Some of these global (Q)SARs were recently characterized and shown to be of limited value in safety assessment (Patlewicz et al., 2007a; Roberts et al., 2007a). The strong mechanistic understanding of skin sensitization has facilitated the development of Relative Alkylation Index (RAI) models (Roberts & Williams, 1982). The RAI approach, which relies on reactivity and hydrophobicity as the key parameters, appears to be the most promising means of deriving robust and mechanistically interpretable models that can be applied in a risk assessment context. Such models are now referred to as Quantitative Mechanistic Models (QMMs). A strategy for using information from the RAI approach in the evaluation of skin sensitization potential has been described in more detail elsewhere (Aptula & Roberts, 2006; Patlewicz et al., 2007b; Roberts et al., 2007b).
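For orientation, the general form in which the RAI is commonly cited combines a dose term with the reactivity and hydrophobicity terms mentioned above. The expression below is given from general knowledge of the literature rather than from this article, so the exact notation and the meaning of each symbol should be checked against Roberts and Williams (1982):

```latex
\[
  \mathrm{RAI} = \log D + a \,\log k + b \,\log P
\]
```

Here D is taken to be the dose, k a rate constant (or relative rate constant) describing reactivity toward a model nucleophile, P the octanol/water partition coefficient used as the measure of hydrophobicity, and a and b constants fitted for the chemical class in question.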

With the evolution of the “adverse outcome pathways” (AOPs) conceptual framework (OECD, 2011), the nature of (Q)SARs will likely change. Instead of relating chemical descriptors to an ultimate adverse outcome such as skin sensitization, it is feasible that next-generation (Q)SARs will be developed to characterize molecular initiating events (MIEs) or other upstream key events. Indeed, the construct of AOPs has already prompted a shift in how expert systems undergo refinement. The hybrid expert system, TIMES, has used the AOP framework to develop new models to predict in vivo genotoxicity through an understanding of metabolism as well as the scope and performance of current in vitro genotoxicity assays (Mekenyan et al., 2012). TIMES has also exploited the concept of common MIEs to propose refinements in its genotoxicity models based on insights derived from skin sensitization data (Mekenyan et al., 2010). TIMES models have also been implemented in a Pipeline construct to mimic the AOP construct. Examples have been published for both skin sensitization (Patlewicz et al., 2014) and respiratory sensitization (Mekenyan et al., 2014).


“Read-across” is a “non-testing” approach that has gained much attention in recent years, in particular as a means of filling data gaps under REACH, which must address data gaps for thousands of chemicals under short deadlines. In fact, while the read-across concept has been used for many years in the HPV program under the OECD or EPA, REACH has provided the impetus to both update and harmonize the available technical guidance.

In applying read-across, endpoint information for one substance (source analogue) is used to predict the same endpoint for another substance (target), which is considered to be similar in some way (usually on the basis of structural similarity, though not exclusively so). In principle, read-across can be used to assess physicochemical properties, toxicity, environmental fate, and ecotoxicity. Moreover, for any of these endpoints, it may be performed in a qualitative or quantitative manner.
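The source-to-target logic can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the structural feature sets, the analogue names, the endpoint values, and the 0.5 similarity cutoff are hypothetical, and Tanimoto similarity on feature sets is just one simple way to express structural similarity. It is not the procedure of any particular tool or guidance document.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across(target_features, sources, min_similarity=0.5):
    """Return (source name, similarity, endpoint value) for the most similar
    source analogue, or None if no source reaches min_similarity."""
    best = None
    for name, (features, value) in sources.items():
        sim = tanimoto(target_features, features)
        if best is None or sim > best[1]:
            best = (name, sim, value)
    if best is None or best[1] < min_similarity:
        return None
    return best


# Hypothetical source analogues: feature set plus a qualitative endpoint result.
sources = {
    "analogue_A": ({"aldehyde", "aromatic_ring", "C7"}, "sensitizer"),
    "analogue_B": ({"ester", "alkyl_chain", "C6"}, "non-sensitizer"),
}

# Hypothetical target substance with no data for the endpoint.
target = {"aldehyde", "aromatic_ring", "C8"}
result = read_across(target, sources)

# A target sharing no features with any source yields no prediction.
no_result = read_across({"nitrile", "C3"}, sources)
```

The endpoint value here is qualitative, but the same function would pass through a numerical value for a quantitative read-across. A similarity score on its own is not a justification; in practice the similarity hypothesis has to be argued and documented for the endpoint in question.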

Read-across is performed to address specific data gaps as part of an analogue approach or category approach. A chemical category is a group of chemicals whose physicochemical and human health, and/or ecotoxicological properties, and/or environmental fate properties, are likely to be similar or follow a regular pattern, usually as a result of structural similarity.

The similarities may be based on the following:

  • a common functional group (e.g. aldehyde, epoxide, ester, specific metal ion)
  • common constituents or chemical classes, similar carbon range numbers
  • an incremental and constant change across the category (e.g. a chain-length category)
  • the likelihood of common precursors and/or breakdown products, via physical or biological processes, which result in structurally similar chemicals (e.g. the metabolic pathway approach of examining related chemicals such as acid/ester/salt)

An analogue approach is a limited category, usually two substances – one target and one source substance.
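Where a category shows a regular pattern, such as the chain-length example in the list above, a data gap for one member can be estimated from its neighbours. The sketch below assumes, purely for illustration, that the endpoint varies linearly between adjacent members; the chain lengths, endpoint values, and function name are hypothetical.

```python
def interpolate_in_category(known, target_chain_length):
    """Estimate an endpoint for a category member by linear interpolation
    between its nearest measured neighbours.

    known maps chain length (number of carbons) to a measured endpoint value.
    """
    if target_chain_length in known:
        return known[target_chain_length]
    lower = [n for n in known if n < target_chain_length]
    upper = [n for n in known if n > target_chain_length]
    if not lower or not upper:
        # Outside the range covered by the category: do not extrapolate.
        raise ValueError("target lies outside the range of measured members")
    lo, hi = max(lower), min(upper)
    frac = (target_chain_length - lo) / (hi - lo)
    return known[lo] + frac * (known[hi] - known[lo])


# Hypothetical category: measured values for C4, C6 and C10 members.
measured = {4: 1.0, 6: 2.0, 10: 4.0}

# Fill the data gap for the C8 member from its C6 and C10 neighbours.
estimate_c8 = interpolate_in_category(measured, 8)
```

Refusing to extrapolate beyond the measured members is a deliberate design choice in this sketch: a trend established within a category says nothing reliable about substances outside its boundaries.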

Technical Guidance
Guidance documents provide clear definitions of what is meant by analogue, category, and read-across approaches, highlight the interrelationship between these and (Q)SARs, and provide specific guidance on special types of categories. It is worth noting that the drafting group responsible for developing the OECD and REACH guidance was one and the same; the OECD guidance was merely published a few months earlier.

While the guidance was intended to provide more practical insights on how to develop categories and perform read-across, it failed to provide any illustrative case studies on what constituted sufficient and adequate justification, or how to document such a justification. A reporting format is provided as a template within the guidance itself, but no example is available to indicate what level of detail might be required under the respective regulatory programs.

Indeed, while under REACH there is a need to exploit read-across, there are still many hurdles in terms of characterizing what acceptable and credible read-across represents. The lack of practical examples has contributed to that lack of acceptance, as has the absence of a framework to characterize the scientific confidence associated with a read-across. Industry has made efforts to outline what gaps exist in its understanding of read-across and, more recently, what the guiding principles may be. An ECETOC Task Force on read-across, category, and (Q)SARs published Technical Report TR116 (2012) to document the current status of approaches. At the same time, ECHA considered how to systematically evaluate read-across assessments and started to develop a read-across assessment framework (RAAF) for human health endpoints. A workshop between ECHA and Cefic LRI (European Chemical Industry Council’s Long-Range Research Initiative) was held in October 2012 to share experiences and discuss some of the ongoing issues in utilizing read-across. Further information can be found in Patlewicz et al. (2013). Elements of the RAAF could be published in the future as part of ongoing practical and technical guidance.

The workshop organizing committee from Cefic LRI refocused its efforts to establish a read-across team to continue the dialogue on read-across acceptance and enhancement. The team is formulating scientific confidence principles for read-across development and evaluation using elements of the RAAF and a framework already published by Blackburn and Stuard (2014). Elements of this framework were presented at QSAR 2014 and the Ninth World Congress for Alternatives and Animal Use in the Life Sciences. Other complementary activities are exploring how Tox21 approaches, including AOPs, can help address uncertainty in read-across and thus promote acceptance. One of these is the EU SEURAT program, which is considering read-across case studies, and another is a Cefic LRI project called AIMT-4, which is exploring read-across enhancement by integrating different data types, from -omics to classical toxicity information. A CAAT read-across program is also being developed with various stakeholders from industry and academia to explore the practical opportunities for read-across, both to facilitate the 2018 REACH registrations and, beyond that, to support acceptance of read-across in other regions.

A number of tools and projects have been developed to help facilitate read-across approaches. The OECD, in particular, has made a significant effort to promote regulatory applications of non-testing approaches such as (Q)SARs and read-across. Under its (Q)SAR program, OECD initiated the development of the OECD QSAR Toolbox to make (Q)SAR technology readily accessible, transparent, and less demanding in terms of infrastructure costs. The QSAR Toolbox, currently at version 3.2, assists in the development, evaluation, justification, and documentation of read-across within analogue and category approaches. The Toolbox is currently best suited for read-across of “simpler” endpoints such as in vitro genotoxicity, skin irritation, skin sensitization, or acute aquatic toxicity. A different conceptual framework of “adverse outcome pathways” (AOPs) (OECD, 2011) will be exploited to assist in the development of read-across for more complex endpoints such as repeated dose toxicity or reproductive/developmental toxicity. For the time being, the AOP for skin sensitization (OECD, 2012a) forms the prototypical AOP within the Toolbox. The AOP framework was described in the revised OECD guidance on grouping of chemicals (2014b).