A Choral Strategy for (Q)SAR Models for Regulatory Purposes

By Emilio Benfenati, Istituto di Ricerche Farmacologiche “Mario Negri”

Published: December 6, 2007

About the Author(s)
Emilio Benfenati
Istituto di Ricerche Farmacologiche “Mario Negri”
Via La Masa 19
20156 Milan
E-mail: benfenati@marionegri.it
Better knowledge of the toxicological properties of chemicals is needed to protect human health and the environment in general, and to meet the challenges of legislation such as REACH in particular. Meeting this need requires unprecedented efforts in QSAR, which can be used for the specific purposes of chemical assessment.

Some features characterise the desirable properties of QSAR models for regulatory purposes and may not apply to QSAR models for other applications, such as models used by industry to develop a new product or drug. Considering the specific applications that these models address, we can list several aspects.

QSAR models for regulatory properties are numerous, and several targets are complex

QSAR scientists are addressing an ever wider series of endpoints for regulatory toxicology. For instance, REACH may ultimately stimulate QSAR research on all the many endpoints involved. For some of these endpoints, the existing data are definitely insufficient for meaningful QSAR modelling, and for many other endpoints there are problems in applying QSARs. However, in theory these shortcomings can be overcome, and QSAR models could be useful in addressing all endpoints for REACH.

No single QSAR strategy will be sufficient to address all toxicological endpoints. For example, some models call for binary outcomes, such as a chemical being carcinogenic or not, or skin sensitizer or not. In other cases the classification will depend on the dose and result in different toxicity levels, such as low, medium, and high. In other cases the effect or property is expressed as a continuous value, such as the toxic dose.

We should consider that the same endpoint might require a continuous value or a class, depending on the regulation or on the purpose of the evaluation. For instance, for classification and labelling purposes the class is suitable, while for risk assessment, when we have to evaluate an effect at a certain exposure concentration, a continuous value is the proper parameter for computing the risk.
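The distinction between a class and a continuous value can be illustrated with a minimal Python sketch. The class boundaries, the assessment factor and the function names below are hypothetical placeholders, not values taken from any regulation; the real cut-offs and factors are those defined by the applicable guidance.

```python
# Hypothetical class boundaries in mg/L (upper limit, label); a lower toxic
# concentration means a more toxic chemical. Real cut-offs come from the regulation.
CLASS_BOUNDS = [(1.0, "high"), (10.0, "medium"), (100.0, "low")]


def toxicity_class(lc50_mg_per_l: float) -> str:
    """Map a continuous toxic concentration to an ordinal toxicity class."""
    for upper_limit, label in CLASS_BOUNDS:
        if lc50_mg_per_l <= upper_limit:
            return label
    return "not classified"


def risk_quotient(exposure_mg_per_l: float, lc50_mg_per_l: float,
                  assessment_factor: float = 1000.0) -> float:
    """Ratio of the exposure to a no-effect concentration.

    The no-effect concentration is assumed here to be the toxic concentration
    divided by an assessment factor (a hypothetical default of 1000).
    """
    no_effect_concentration = lc50_mg_per_l / assessment_factor
    return exposure_mg_per_l / no_effect_concentration
```

With these placeholder settings, the class only needs the predicted value to fall on the right side of a boundary, while the risk quotient uses the predicted value itself, so its error propagates directly into the computed risk.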

Different algorithms are typically used to obtain predictions as classes or as continuous values. The way the chemical structure is described may also differ. For instance, the presence of certain residues in the molecule has been used to identify carcinogenic compounds, while continuous chemical descriptors are typically used to predict continuous properties.

A further aspect related to endpoint complexity is that a QSAR technique used for certain endpoints may not apply to others. For instance, docking software is often used in models for endocrine disruptors. These models have some advantages because they relate clearly to knowledge of the binding of a ligand to a receptor, such as the estrogen receptor. In this case the structure of the receptor is well known, and so is the mechanism, thus the predictions can be based on this knowledge. Of course, in many other cases such detailed knowledge of the receptor basis of the effect is not available, and this approach cannot be used.

The consequence of such a complex picture, with many endpoints and models based on different approaches, is that a single approach is not realistic. The models should be adapted to these issues. Different approaches can be used for the same target, and together they improve the reliability of the overall prediction. In some cases they have been combined into a unified system to improve overall results (1-3).
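As a simple illustration of combining models, the Python sketch below applies a plain majority vote to the class predictions of several models. This is a deliberately basic stand-in, not the rule-learning or neuro-fuzzy combinations described in the cited works, and the function names are hypothetical.

```python
from collections import Counter


def consensus_class(predictions):
    """Return the majority class among model predictions, or None on a tie."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions).most_common()
    top_label, top_count = counts[0]
    # A tie between the two most frequent classes is reported as "no consensus".
    if len(counts) > 1 and counts[1][1] == top_count:
        return None
    return top_label


def agreement(predictions):
    """Fraction of models that voted for the most frequent class."""
    top_count = Counter(predictions).most_common(1)[0][1]
    return top_count / len(predictions)
```

The agreement fraction gives a rough indication of how much the individual models support the combined prediction; a tie is reported explicitly rather than resolved silently.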

The models have to be robust and validated

This simple statement may appear obvious. QSAR models developed decades ago mainly addressed the discovery of some relationships between a given parameter and the effect of interest. For instance, it was satisfactory to find that there was a linear relationship between logP and aquatic toxicity. Furthermore, we can imagine a model aiming to understand the biochemistry of a given effect, in which all chemicals have known toxicity values, but we want to understand why. In this case the prediction is not for the property/effect, but for the process underlying the effect. Conversely, models to be used for regulatory purposes require stringent validation procedures on the prediction of the property/effect.

For regulatory purposes there has to be proof that this relationship also applies to the prediction of the property of new chemicals; thus this has to be specifically addressed. This stipulation puts emphasis on the statistical validation of the model, which is clearly addressed within the five OECD principles for QSAR (4). New statistical tools and evaluation procedures have been introduced, going beyond the simple measurement of fit on the training set. The importance of an external test set has been stressed in many cases, and it is judged to be desirable. A low total number of compounds limits external validation. Internal validation is in any case recommended, and a battery of tools has been described.
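The difference between internal and external validation can be sketched in Python for the simplest case, a one-descriptor linear model. The example computes a leave-one-out cross-validated coefficient (often called Q²) on the training set and a predictive coefficient on an external test set. The function names are hypothetical, and several definitions of the external coefficient exist; the one used here, referenced to the training-set mean, is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(xs, ys):
    """Internal validation: refit without each compound and predict it."""
    xs, ys = list(xs), list(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


def r2_external(slope, intercept, test_x, test_y, train_mean_y):
    """External validation: predict compounds never used to build the model."""
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(test_x, test_y))
    ss_tot = sum((y - train_mean_y) ** 2 for y in test_y)
    return 1.0 - ss_res / ss_tot
```

With few compounds, each one left out changes the fit noticeably and the external set becomes too small to be informative, which is the limitation noted above.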

Separate tools are used for models based on classifiers and on regression methods. In the case of classifiers the emphasis is on false positives and false negatives. However, we recently underlined the importance of paying attention to false negatives for regression models as well (3). Indeed, regulators pay much more attention to false negatives. Different strategies, such as addressing false positives, can be adopted by industry in the phase of new compound discovery. Even for models for regulatory purposes, attention can be given to false positives within a wider strategy, in which intelligent testing strategies are used, taking into account the sensitivity and selectivity of each individual element in the combined strategy.
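A minimal Python sketch of these measures follows. For classifiers it counts false positives and false negatives and derives sensitivity and specificity. For regression models it flags compounds whose toxicity is underestimated by more than a margin, which is one possible way to express a "false negative" on a continuous scale. The function names and the margin are assumptions, and the toxicity values are assumed to be on a scale where a larger number means a more toxic compound.

```python
def confusion_counts(observed, predicted):
    """Count outcomes for a binary classifier; True means toxic."""
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1
        elif obs and not pred:
            fn += 1  # toxic compound predicted as safe: the critical error for regulators
        elif not obs and pred:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """Fraction of toxic compounds that the model correctly identifies."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-toxic compounds that the model correctly identifies."""
    return tn / (tn + fp)


def underestimated(observed, predicted, margin):
    """Indices of compounds predicted less toxic than observed by more than margin."""
    return [i for i, (obs, pred) in enumerate(zip(observed, predicted))
            if obs - pred > margin]
```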

There is a further issue to be considered regarding model robustness. A model for regulatory purposes should adhere as closely as possible to the regulation. This fact has several consequences (3). For instance, the endpoints have to be those identified in the regulation, and the protocol used to obtain the experimental values has to be the one specified in the official technical guidance identified by the regulation. A further aspect is that the input values should be checked. For toxicity values, some databases are good, others less so. We compared different official databases on pesticides and found differences among reported values (3). The situation is worse for data taken from the literature. This aspect is not limited to the property values: we also found many mistakes in the chemical structures reported in journals. All these checks require time and effort, and are not typically done when a QSAR model is applied for academic research. Conversely, for a model to be proposed for regulatory purposes, effort should be dedicated to checking the quality of the data. Some researchers have clearly identified this problem and have dedicated activities to increasing the quality of the available toxicity data. This is the case of Ann Richard and the DSSTox database (5).

The models have to be transparent

This issue is also addressed in the OECD principles. Regulators cannot rely on something which cannot be verified. This refers to all model components. Thus, the data of the training set should be known. The data reporting a property, such as toxicity or binding to a receptor, are the basis of any QSAR model. If the data are unknown, the model itself can be criticised. Let’s imagine having two independent models, each based on unknown data. If the two models give conflicting results, it would be impossible to assess the relevance of the data, and thus both models would appear unreliable. In principle, confidential data would make it possible to introduce biased values, to produce results favourable to a certain industry. Thus, full transparency is the basis of the credibility of any model to be used for official purposes. Similar considerations apply to the other components of the model, such as the algorithm and the chemical descriptors. We note here that, unfortunately, the chemical descriptors obtained with different software are not always transparent, because the exact equation used to calculate them is not always given. Transparency should also cover the access rights and ownership of the different model components.

The models have to be reproducible

The model should give the same result when used by different users, in different locations. This fact may lead to a preference towards easier models. Indeed, some more complex models typically require manual optimisations, which are done by skilled operators. This introduces uncertainty in the result, which usually is not critical. In other cases some stochastic procedures are introduced, but they are done in the model development (for instance use of genetic algorithms for descriptor selection), so they may not affect the final result.

Related to reproducibility is the fact that the parameters of the model algorithm and the descriptors should be fixed. This requires a clear codification of the algorithms and descriptors. Algorithms, even apparently complex ones, can be codified as a series of coefficients and variables. In the development phase complex tools can be used, such as genetic algorithms to find the best descriptors and neural networks for model optimisation. Then, once the model is defined, it can be implemented as a relatively simple equation, as has been done for instance within the DEMETRA models, in which complex hybrid models resulted in much simpler equations with certain descriptors (3, 6). An important aspect is that descriptors should be fixed as well. This can be critical, because descriptors are typically calculated with software that is proprietary or otherwise out of the control of the QSAR modeller. The developers of descriptor software quite frequently change the algorithms between versions, and this may make results irreproducible.
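The idea of a model frozen as a series of coefficients and variables can be sketched in Python as follows. The class name, the descriptor names, the coefficients and the software label are all hypothetical; this is not the form of the DEMETRA equations, only an illustration of fixing the parameters together with the descriptor software version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedLinearModel:
    """A model reduced to fixed coefficients; nothing is re-optimised at prediction time."""
    intercept: float
    coefficients: dict        # descriptor name -> coefficient (hypothetical names)
    descriptor_software: str  # name and version used to calculate the descriptors

    def predict(self, descriptors: dict) -> float:
        missing = set(self.coefficients) - set(descriptors)
        if missing:
            raise KeyError(f"missing descriptors: {sorted(missing)}")
        return self.intercept + sum(
            coefficient * descriptors[name]
            for name, coefficient in self.coefficients.items()
        )
```

Because the coefficients and the descriptor software version are stored with the model, two users applying it to the same descriptor values obtain the same result.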

An ideal framework for the development of QSAR for regulatory purposes

We have seen that these models must have particular features, which slows progress on the development of QSAR models suitable for regulatory purposes. A huge choral effort has to be made to produce models suitable for this purpose. Scientists should avoid restricting access to parts of their QSAR models. Of course, contributions to each part of the model should be acknowledged. Industry can play a major role by making data accessible. Indeed, industry is the main source of data, and it should discuss internally a strategy on access to data which maintains the ownership of the data for sensitive values. Regulators should promote initiatives towards a larger dissemination and use of data and models.

On the basis of the above considerations, the QSAR community should increase its synergies to cope better with the task. As we have seen, no single approach can solve all problems. Furthermore, new tools are continuously being introduced. We should take advantage of different tools in this attempt to obtain validated models. The ideal situation would be to make all model components freely available. In this way it would be possible to take full advantage of each component and to adopt different strategies, for instance by combining different models. If the different models are proprietary, the powerful integration of the different components will be much more difficult.

Furthermore, a private company providing any part of the model can, for any reason, modify that part or cease to make it available, and this will cause a black-out of the full model.

Property data are crucial for further development of QSAR. We already mentioned some helpful activities to make data available to the widest community. The DSSTox, ECOTOX, and AMBIT databases are some examples. An interesting feature is the availability of toxicity and chemical data/structures together. XML has been identified as the preferable standard within the information technology community for this; see for instance the mentioned DSSTox database (5) and the EC projects CHEMOMENTUM (7), DEMETRA (6) and OpenMolGRID (8). The possibility to exchange data between different databases is also important.

The availability of data from different sources will provide a way to compare and integrate data. It is important to have access to multiple values for the same chemical, and also to know the uncertainty related to a given endpoint. The uncertainty typical of a given endpoint, using the same protocol, should be characterised, because it affects the uncertainty of the QSAR model (3). The final model cannot be more precise than its input data, and predicted values reported with a precision better than that of the experimental laboratory model are suspect.
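One way to characterise the experimental uncertainty from multiple values per chemical is a pooled standard deviation, sketched below in Python. The choice of a pooled estimate and the function names are assumptions made for illustration; the values are expected on a common scale (for instance log-transformed) and measured with the same protocol.

```python
import math
import statistics


def pooled_experimental_sd(values_by_chemical):
    """Pooled standard deviation over chemicals with at least two measured values."""
    weighted_variance = 0.0
    degrees_of_freedom = 0
    for values in values_by_chemical.values():
        if len(values) < 2:
            continue  # a single value carries no information on variability
        weighted_variance += (len(values) - 1) * statistics.variance(values)
        degrees_of_freedom += len(values) - 1
    if degrees_of_freedom == 0:
        raise ValueError("no chemical has replicate values")
    return math.sqrt(weighted_variance / degrees_of_freedom)


def suspiciously_precise(model_error, experimental_sd):
    """Flag a model whose reported error is smaller than the experimental variability."""
    return model_error < experimental_sd
```

A model error below the pooled experimental value does not prove a fault, but it is a signal that the reported precision deserves a closer look.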

We already mentioned that models should be fully transparent, including their data. It is expected that data will become more and more publicly available, and present in several sources, such as the IUCLID database, thanks to the REACH legislation, the OECD toolbox, etc. This process will attract further data. However, other data will not be publicly available, for obvious reasons, such as confidential data on chemicals under development. Companies holding confidential data may use them as test sets for QSAR models.

Ideally, the chemical structures of the substances used in QSAR models for regulatory purposes should be carefully checked. Manual checking is still the preferable way, but it poses problems for data sets of thousands of chemicals.

Several chemical formats are currently used in QSAR models. When a structure is converted from one format to another, it should be checked, because the conversion can introduce modifications in the chemical structure. When the chemical has chiral centers, care should be taken that the structure describes the correct stereochemistry of the chemical used in the laboratory experiment.

Ideally, chemical descriptors should come with a correct description, including the mathematical equation or procedure used to calculate them. We mentioned that changes between the software versions used to calculate descriptors might be a problem. This can be solved by producing freely available tools to calculate chemical descriptors and by keeping the old version available when a new one is released. When a new version of the software is used to calculate the chemical descriptors, the reproducibility of the results should be checked. More reproducible descriptors are preferable. However, this is not so easy to achieve, because even 2D descriptors may give different results, depending on tautomerism for instance, and the several programs available for logP produce different results (9).
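A reproducibility check between two software versions can be as simple as comparing the descriptor values calculated for the same chemical, as in the Python sketch below. The function name and the tolerances are hypothetical, and the descriptor values are assumed to have been exported from each version into a dictionary.

```python
import math


def descriptor_differences(old_values, new_values, rel_tol=1e-6, abs_tol=1e-9):
    """Return the descriptors that are missing or differ between two software versions."""
    changed = {}
    for name in sorted(set(old_values) | set(new_values)):
        old = old_values.get(name)
        new = new_values.get(name)
        if old is None or new is None:
            changed[name] = (old, new)  # descriptor added or removed between versions
        elif not math.isclose(old, new, rel_tol=rel_tol, abs_tol=abs_tol):
            changed[name] = (old, new)
    return changed
```

An empty result for a reference set of chemicals gives some assurance that a model built with the old version can still be applied with the new one; any reported descriptor needs to be examined before the model is reused.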

Ideally, all mathematical parameters of the model algorithm should be known. XML is the preferable format for the model algorithm as well. In the ideal situation, within a unified XML system, the user can import toxicity data from one source, get the structure from a second source, apply the model and obtain the results in a seamless way, even when referring to different sources. Some projects have addressed this goal (7, 8).
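The exchange of data between sources through XML can be sketched in Python with the standard library. The element names, attributes, identifiers and values below are invented placeholders, not the schema of DSSTox or of any of the cited projects; the sketch only shows toxicity values from one source being joined to structures from a second source through a shared identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents from two independent sources (made-up placeholder values).
TOXICITY_XML = """
<toxicityData source="hypothetical-source-A">
  <record id="chem-001" endpoint="LC50" unit="mg/L">12.5</record>
  <record id="chem-002" endpoint="LC50" unit="mg/L">0.8</record>
</toxicityData>
"""

STRUCTURE_XML = """
<structures source="hypothetical-source-B">
  <structure id="chem-001" smiles="CCO"/>
  <structure id="chem-003" smiles="c1ccccc1"/>
</structures>
"""


def merge_records(toxicity_xml, structure_xml):
    """Join toxicity records and structures that share the same identifier."""
    toxicity_root = ET.fromstring(toxicity_xml.strip())
    structure_root = ET.fromstring(structure_xml.strip())
    smiles_by_id = {s.get("id"): s.get("smiles") for s in structure_root.iter("structure")}
    merged = []
    for record in toxicity_root.iter("record"):
        chemical_id = record.get("id")
        if chemical_id in smiles_by_id:  # keep only chemicals present in both sources
            merged.append({
                "id": chemical_id,
                "smiles": smiles_by_id[chemical_id],
                "endpoint": record.get("endpoint"),
                "unit": record.get("unit"),
                "value": float(record.text),
            })
    return merged
```

Calling `merge_records(TOXICITY_XML, STRUCTURE_XML)` returns only the chemical found in both sources; the merged records could then be passed to a descriptor calculator and a model in the same workflow.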

We have described above what an ideal situation could be. Keeping in mind the goal of having tools widely available to improve the safety of industrial chemicals for regulatory purposes, several of the features discussed above address the OECD principles. There are some advantages to having model components freely available. The main advantages are:

  • Wider use of the QSAR models by regulators, industry, non-governmental organisations, and scientists. Free availability can of course make the tool applicable regardless of a chemical’s tonnage, improving the protection of society. Nowadays, due to the cost of the assessment, the REACH provisions are not applicable to a chemical on the market in amounts lower than one tonne, and only a minimum set of evaluations is done for chemicals produced between one and ten tonnes, which is the category with the highest number of chemicals. A cost for the models would represent a barrier.
  • Better possibility to integrate different tools, making optimal exploitation of any specific model component. Nowadays it may happen that a certain model is powerful because it employs more data, yet the algorithm itself may be poor. Or in another case the algorithm is powerful, but the chemical descriptors poor. In the ideal situation, scientists may combine and test different components, achieving better results.
  • The whole process will be more transparent, convincing more users. Commercial components typically are restricted.
  • The process will be more reproducible. This provides some reassurance that a crucial part of the model does not disappear from the marketplace.

Obviously, other solutions are present and forthcoming, including commercial ones. Commercial companies may offer a more user-friendly environment, assistance, and dedicated solutions for specific problems of industrial interest. For instance, a chemical company may wish to explore a large number of compounds, using some confidential data it has. In this case focussed models can be built for the specific target, using powerful commercial models.

There will likely be a further increase of the activities in the QSAR field. A wider acceptance of QSAR tools, resulting from a robust and powerful framework embracing QSAR models for regulatory purposes, will promote the use of these models for many other applications, with benefits for commercial groups developing QSAR models.
©2007 Emilio Benfenati

  1. Lee, Y., Buchanan, B.G., Mattison, D.M., Klopman, G. & Rosenkranz, H.S. (1995). Learning rules to predict rodent carcinogenicity of non-genotoxic chemicals. Mutation Research 328, 29-46.
  2. Benfenati, E., Mazzatorta, P., Neagu, D. & Gini, G. (2002). Combining Classifiers of Pesticides Toxicity through a Neuro-fuzzy Approach, in: F. Roli, J. Kittler, Eds. MCS2002, Multiple Classifier Systems, 293-303. Lecture Notes in Computer Science, vol. 2364, Springer, Berlin.
  3. Benfenati, E. (Editor). (2007). Quantitative structure-activity relationships (QSAR) for pesticide regulatory purposes. Elsevier, Amsterdam, The Netherlands. 510pp.
  4. OECD. (2004). The Report from the Expert Group on (Quantitative) Structure-Activity Relationships [(Q)SARs] on the Principles for the Validation of (Q)SARs. ENV/JM/TG(2004)27/REV. Organisation for Economic Cooperation and Development, Paris, France. 17pp.
  5. Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network
  6. DEMETRA
  7. Chemomentum project
  8. OpenMolGRID
  9. Benfenati, E., Gini, G., Piclin, N., Roncaglioni, A. & Varì, M.R. (2003). Predicting LOGP of pesticides using different software. Chemosphere 53, 1155-1164.