Replacement Alternative Methods and the Need for New Approaches to Risk Assessment in Relation to Chemicals and Chemical Products
Published: December 6, 2007
Michael Balls, CBE, MA, DPhil, FIBiol FRAME Russell & Burch House 96–98 North Sherwood Street Nottingham NG1 4EE UK E-mail: email@example.com
Robert Combes gained his first degree and a PhD in Genetics at Queen Mary College, University of London, then spent nearly 20 years as a teacher in the School of Biological Sciences of what is now the University of Portsmouth. He was Head of Mutagenicity and Cellular Toxicology at Inveresk International, Tranent, Scotland, from 1988 to 1992, when he became Head of Biological Sciences at the La Sainte Union College of Higher Education in the University of Southampton. In 1994, he moved to the Fund for the Replacement of Animals in Medical Experiments (FRAME), Nottingham, as Scientific Director, later becoming Director. He serves on many committees, working parties, task forces and editorial boards, and is the author or co-author of about 300 scientific research articles and reports. Professor Combes is an Honorary Professor in the University of Nottingham, has served as the President of the European Society for Toxicology in Vitro, and is a member of the ECVAM Scientific Advisory Committee, a Director of LHASA (UK) Ltd, and Associate Editor of ATLA (Alternatives to Laboratory Animals).
Robert D. Combes, BSc, PhD, FIBiol FRAME Russell & Burch House 96–98 North Sherwood Street Nottingham NG1 4EE UK E-mail: firstname.lastname@example.org
The outcome of discussions at a FRAME workshop on Possibilities for a New Approach to Chemicals Risk Assessment (ATLA 34, 621–649, 2006) is summarised, and the conclusions and recommendations of the participants are listed. Bearing in mind the mounting scientific, economic and ethical pressures to reduce the reliance of toxicity testing on laboratory animal procedures, ways in which non-animal hazard classification data could be used to great advantage in risk assessment are discussed, including the equivalent exposure approach to in vitro–in vivo extrapolation, the use of adjustment factors for in vitro data, and in vitro biokinetic modelling.
Keywords: adjustment factors, alternative methods, animal tests, biokinetic modelling, equivalent exposure approach, equivalent external dose concept, hazard prediction, in vitro, in silico, REACH system, replacement, risk assessment, threshold dose
Increasing attention is being focused on the development, validation and acceptance of non-animal methods which could progressively reduce, and eventually eliminate, the current reliance on traditional laboratory animal procedures in the efficacy and toxicity testing of chemicals and various types of chemical products. These “replacement alternative” methods include the use of lower organisms, the early embryonic stages of vertebrates, isolated in vitro sub-cellular, cell and tissue preparations, and cell, tissue and organ cultures, computer-based (in silico) models, and even the ethical use of human volunteers. It is important to distinguish between direct replacement, in which a method provides more or less the same information as would have been provided by an animal experiment, and indirect replacement, in which the information provided is different in kind, but can be employed for a purpose similar to that for which animal data would have been used, such as predicting potential hazard and evaluating the consequences of exposure to a chemical or product. The opportunities for direct replacement will be limited, and this could be seen as advantageous, since laboratory animal procedures are inherently incapable of providing information of unequivocal relevance to humans, because of the inescapable complications introduced by species differences. Thus, what are now seen as “replacements” will progressively become the methods of choice, especially as more comes to be known about the cellular and molecular mechanisms underlying the physiological, pharmacological and toxicological processes involved in the consequences to humans of exposure to potentially useful or hazardous chemicals and products. This, in turn, means that new approaches will be necessary, in terms both of hazard identification and characterisation and of risk assessment.
This has been known for some time, at least in some quarters, but there also seems to have been a widespread assumption that non-animal data could be used to substitute directly for animal data in the application of traditional approaches to hazard identification and risk assessment. In December 2005, FRAME organised a workshop to discuss these issues, with an emphasis on how maximum use could be made of advanced, non-animal approaches in improving the risk assessment process in terms of its science, relevance and efficiency. Our aim in summarising the workshop report in this article is to use the new forum provided by AltTox.org to encourage others to read the full report (1) and to contribute to what we consider to be a debate of the greatest importance.
The Principles of Risk Assessment
Hazard, exposure and risk
Hazard is the inherent property of an agent or situation to have the potential to cause adverse effects when an organism, system, population or sub-population is exposed to that agent or situation. It is therefore the outcome of the interaction between the agent or situation and the biological system. Risk is a statistical parameter, being the probability that an adverse effect in an organism, system or (sub)population is caused under specified circumstances by exposure to an agent. Risk assessment is the process whereby risks to a given target organism, system, population or sub-population are measured or estimated. Assessing risks to humans from exposure to chemicals currently involves the following main steps: 1) hazard identification (HI) involves determining the potential adverse effects of a chemical in animals, man or the environment; 2) hazard characterisation (HC) is the evaluation of the quantitative relationship between exposure (level, duration, frequency, route) and the nature, severity and incidence of toxic effects; 3) exposure assessment (EA) is the measurement, estimation or prediction of exposure to a substance in terms of magnitude, duration and frequency for the population of interest; and 4) risk characterisation (RC) is the integration of the outputs of hazard assessment and exposure assessment, in order to determine whether or not the potential adverse effects in the exposed population are likely to occur. Thus, tests for HI identify a potential for toxicity, whereas those for HC provide more-definite evidence about the circumstances under which the toxicity could be expressed in vivo. Tests for HC may provide information on relative potency, dose thresholds, and saturation levels, from qualitative and/or quantitative information. RC takes two forms. 
In one form, the RC uses the HC and EA to determine a risk estimate (usually in the form of a “margin of exposure”), which is then evaluated with respect to its acceptability in the risk evaluation stage of risk management. In the second form, the “uncertainty factors” approach, the RC takes a series of given assumptions from a standardised risk evaluation procedure, and determines a maximum acceptable exposure (e.g. the “reference dose”) allowed by the assumptions. The first form of RC process is capable of refinement, either by obtaining more HI data through toxicity testing or through the better characterisation of exposure. The second form is a standardised procedure that can only deal with hazard (toxicity) information, and makes no use of exposure information to determine whether the hazard (toxicity) information is actually necessary for the specific chemical and circumstance.
The threshold dose concept
The usual approach to chemical risk assessment commences with defining threshold doses for toxicity in the most sensitive animal test system that has been used for HI. The most commonly used measure of toxicity threshold is the No Observed Adverse Effect Level (NOAEL). This is the maximum tested dose level of the test substance that failed to induce changes which are considered to be adverse. Other parameters can sometimes be used, including expressions incorporating concentration and duration of exposure as the measure of the dose, the Lowest Observed Adverse Effect Level (LOAEL), and the Benchmark Dose (BMD). It is important to note that the threshold dose concept is not considered to apply to stochastic effects, such as carcinogenesis. Thus, for risk assessment, chemicals are separated into two discrete groups — those considered to have a threshold dose and those considered not to have a threshold dose. Irritancy, acute toxicity, organ-specific toxicity, system-specific toxicity, reproductive toxicity, and non-genotoxic carcinogenicity are considered to involve threshold effects, whereas genotoxicity, mutagenicity and genotoxic carcinogenicity are well-established examples of non-threshold effects. A threshold dose is a level of exposure below which no toxic effect is expected to occur. In the case of non-threshold effects, it is considered that any level of exposure is associated with an increased risk, albeit a small one. For the risk assessment of non-threshold effects, the complete dose–response data are fitted with a mathematical function that is used to extrapolate the risk at exposure levels relevant to humans, which are often orders of magnitude lower than the exposure levels used in animal toxicity testing.
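For non-threshold effects, the low-dose extrapolation described above can be sketched in a few lines. The numbers below are purely illustrative, and a simple linear-through-origin fit stands in for the more sophisticated dose–response functions used in practice:

```python
# Illustrative sketch (hypothetical numbers): linearised low-dose
# extrapolation for a non-threshold effect, where any exposure is
# assumed to carry some increased risk.

def fit_slope(doses, excess_risks):
    """Least-squares slope through the origin for excess risk vs dose."""
    num = sum(d * r for d, r in zip(doses, excess_risks))
    den = sum(d * d for d in doses)
    return num / den

# High-dose animal data: dose (mg/kg bw/day) vs excess tumour incidence
doses = [10.0, 50.0, 100.0]
excess_risks = [0.02, 0.11, 0.21]

slope = fit_slope(doses, excess_risks)   # risk per mg/kg bw/day

# Extrapolate to a human-relevant exposure orders of magnitude lower
human_dose = 0.001                       # mg/kg bw/day
predicted_risk = slope * human_dose
print(f"slope = {slope:.5f} per mg/kg bw/day")
print(f"predicted excess risk at {human_dose} mg/kg bw/day ~= {predicted_risk:.2e}")
```

The point of the sketch is only that the fitted function, not an observed data point, supplies the risk estimate at realistic exposure levels.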
Adjusting the threshold dose
Three major assumptions are made when undertaking risk assessment based on animal data: 1) the mechanism of toxicity in the test species is relevant to that in humans; 2) the human population covered by the evaluation is at least as susceptible to the same exposure as the most sensitive animal test species; and 3) the variation in the (heterogeneous) human population covered by the assessment is likely to be greater than that in the homogeneous test population (especially if it was a group of animals from an inbred strain of a particular species). These assumptions have led to a practice in which the threshold dose is adjusted by using factors that attempt to quantify the amount of uncertainty in extrapolating the test data to the target species. The correction of the NOAEL is achieved by using a parameter known variously as an application factor, an adjustment factor, a safety factor, or an uncertainty factor. Although “uncertainty factor” is perhaps the most appropriate phrase, as it more accurately reflects the fact that the adjustment takes account of uncertainties in extrapolation, “adjustment factor” was used in the workshop report. The uncertainties of concern are due to errors in extrapolating quantitative data from animal tests to the target species, and include, where humans are the target species: 1) species differences; 2) the dosing of small, relatively homogeneous groups compared with the exposure of larger, more heterogeneous populations; and 3) high, short-term dosing compared with low dose, lifetime exposure. Since the 1950s, an estimate of safety from exposure to chemicals has been obtained by dividing the NOAEL by an overall default value of 100 (14, 15). This represents a post hoc rationalisation of a pre-existing pragmatic set of numbers (as is summarised in Appendix 2 of the workshop report), rather than a truly scientific approach to the problem.
It should be noted that, in the case of environmental risk assessment, it is not unusual to adjust Predicted No Effect Concentration (PNEC) values by 1000-fold factors, reflecting the even greater uncertainty in interpreting such data. The reference dose (RfD) has been defined as an estimate of the daily exposure dose that is likely to lack deleterious effects, even if continued exposure were to occur over a lifetime. It is the result of the application of adjustment factors to the NOAEL (RfD = the NOAEL, LOAEL or BMD, divided by the product of the adjustment factors). Inevitably, arriving at an RfD is subject to human judgement and societal considerations. The exposure of the target organism (e.g. a human consumer or worker) is assessed from a range of information. Although, ideally, exposure levels in specific situations should be measured experimentally, models often have to be used, which can be developed by using information on: 1) the annual production level of a product; 2) the amount of the ingredient in the product; 3) the bioavailability of the ingredient; 4) the pattern of use of the product; 5) any containment measures (e.g. packaging); and 6) the availability of protective clothing. In addition, both normal/intended routes of exposure and accidental/foreseeable misuse are considered. The exposure level in the target organism (usually the maximum likely) is compared with the NOAEL in the test animal, with or without correction by applying adjustment factors. If the comparison is conducted without adjustment, the result is the “margin of exposure”. Where the expected target organism exposure is below the RfD, exposure to the substance under the conditions applying might be considered to pose no risk, whereas when it is above the RfD, the risk might be considered “tolerable” or “unacceptable”, according to the specific circumstances. The tolerability of the risk depends on a societal judgement concerning risk and benefit, and the circumstances surrounding the exposure.
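The reference dose and margin of exposure calculations described above can be sketched as follows. The NOAEL and exposure values are illustrative only, and the conventional decomposition of the 100-fold default into 10 × 10 (inter-species × inter-individual variability) is used as the default factor set:

```python
# Sketch of the two risk-characterisation quantities described above,
# with hypothetical numbers. The 100-fold default combines a 10x factor
# for inter-species differences and a 10x factor for inter-individual
# variability.

def reference_dose(noael, factors=(10, 10)):
    """RfD = NOAEL divided by the product of the adjustment factors."""
    product = 1
    for f in factors:
        product *= f
    return noael / product

def margin_of_exposure(noael, exposure):
    """Unadjusted comparison of the NOAEL with the expected exposure."""
    return noael / exposure

noael = 50.0      # mg/kg bw/day, from the most sensitive animal study
exposure = 0.02   # mg/kg bw/day, estimated maximum likely human exposure

rfd = reference_dose(noael)                 # 50 / 100 = 0.5 mg/kg bw/day
moe = margin_of_exposure(noael, exposure)   # 50 / 0.02 = 2500

print(f"RfD = {rfd} mg/kg bw/day; exposure is {'below' if exposure < rfd else 'above'} the RfD")
print(f"Margin of exposure = {moe:.0f}")
```

Whether a margin of exposure of this size is "sufficient" remains, as the text notes, a societal and expert judgement rather than an output of the arithmetic.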
In practice, other information is also used in these risk evaluations, including: 1) the quantitative difference between the exposure and the NOAEL; 2) the mechanism(s) of toxicity; 3) the nature of the most sensitive toxicity endpoint used to obtain the NOAEL; 4) the route of administration used in the animal test; and 5) the possibility of limiting exposure (control options). It is the ratio of the NOAEL to the expected exposure that constitutes the margin of exposure. The final step of risk characterisation is to consider whether this margin is sufficiently large to consider the situation to be acceptable. The RfD approach includes a risk evaluation based on the application of adjustment factors. This preliminary risk evaluation is assumed, probably incorrectly, to be universally applicable. The procedures rely on adjustment factors that make use of pragmatic assumptions concerning default values. Many of these assumptions are probably conservative. The procedures also include assumptions concerning the universality of the extrapolations, despite known population differences in toxicokinetics and toxicodynamics. Any risk assessment can only address known risks, and is limited by the information obtainable. There will always be a residual risk due to the limitations of the information available. This residual risk consists of two components: a) known unknowns; and b) unknown unknowns. An example of the former is the limitation of the test in accurately characterising the toxicity under investigation or the uncertainties in extrapolation. Unknown unknowns would include novel forms of toxicity not hitherto encountered. This is the reason why studies monitoring human populations and/or post-marketing surveillance are essential.
The Need for a New Paradigm for Risk Assessment
The criteria for good risk assessment
The theory behind risk assessment, and the way it is implemented in practice, are poorly understood by toxicologists, industry and the public at large. Exposure and dose–response information need to be as accurate as possible, if meaningful risk assessments are to be obtained. It is important to minimise errors at any stage, so as to be able to reduce the probability of either exposure to a substance at levels that will actually represent a high risk, or the unjustified banning of a substance that is, in reality, of low risk in the exposure scenario in which it is used. The latter error is no problem to the regulator, but the former is a regulator’s nightmare, and encourages the development of “conservative” attitudes.
Recently, the publication of the European Commission’s REACH proposals, which require the chemical production and downstream user industries to assess the risks posed by some 30,000 existing chemicals in different exposure situations, has produced a huge challenge for industry, toxicologists and animal welfare advocates alike, to find new ways to undertake risk assessments. Existing regulations for biocides, and the High Production Volume testing programme in the USA, are also providing substantial challenges to the processes for assessing the risks resulting from exposure to chemicals. The problems with the traditional approaches to safety testing have also been allowed to become acute in the case of the development of new pharmaceuticals, as was indicated in a recent report by the US Food & Drug Administration (FDA). The FDA document stresses an urgent need for new efficacy and safety testing strategies, in order to keep pace with the huge increase in the development of new chemical entities of pharmacological interest and importance. Thus, the problems referred to in the FRAME workshop report are by no means confined to the products of the chemical industry.
As outlined above, many problems are associated with obtaining appropriate data for the risk assessment of substances with respect to the target species. In traditional approaches to risk analysis, quantitative hazard data (derived from dose–response studies in animals) are used, with adjustment factors, to identify threshold dose levels for safety and to lay down maximum daily exposure limits. Extrapolation is further confounded by the fact that the routes of exposure used in tests might be very different to those that occur when humans come into contact with potentially toxic materials, which necessitates route-to-route extrapolation. Furthermore, the duration and levels of exposure, as well as the numbers of repeat exposures, can differ greatly between experimental systems and actual exposures in real-life situations. This particularly applies when attempts are made to model lifetime effects in humans with chronic toxicity studies in rodents. In addition, despite the increasing use of limit doses, in vivo toxicity tests can involve very high dose levels that are unrealistic, being far greater than could be experienced under anticipated and realistic human exposure conditions. The meaningful interpretation of toxicity seen at such high doses is highly problematical. It is most important to realise that a NOAEL value depends on two factors: 1) a judgement about the statistical significance and biological importance of any “effect” seen; and 2) the study design (especially the spacing between doses, the variability among the animals on test, and the numbers of parameters measured). These factors determine the accuracy of the dose–response data obtained. Thus, decisions concerning the NOAEL are based on subjectivity, even though they might be taken by experts. 
Furthermore, it should be noted that it is traditional, in the absence of metabolic information on which to judge relevance, to use data from the most sensitive toxicity test (with respect to endpoint and/or species used). This, of course, discourages intelligent testing and the use of integrated testing schemes, but instead, encourages “tick-box” testing, due to the need to undertake a range of in vivo tests. In reality, there is no objective biological justification for adopting an overall factor of 100 as the default adjustment factor. Many problems are associated with the use of such factors, since the level of adjustment actually necessary will depend on numerous variables, including the test method, the species, the route of application, the test substance, the way the animals are handled, fed, and housed, and the relationship between the endpoints measured and the toxicity being predicted in humans. Also, their use takes no account of the dose–response relationship and assumes the presence of a threshold dose. In addition, the factors are often altered crudely and in a subjective way, to take account of special circumstances, such as: 1) when there are additional uncertainties in the data (equivocal results or data gaps, such as the lack of a chronic study); 2) when specific chemicals are tested; and 3) when the critical endpoint is particularly serious (such as teratogenicity or developmental neurotoxicity).
Too much focus on hazard identification
It is time that the merit of the traditional risk assessment paradigm was seriously questioned, since it relies primarily on the use of often imprecise hazard data from animal tests, and on incomplete exposure information, together with arbitrary adjustment factors. The time-scale and the need for assessing the risks of exposure of humans and the environment to existing chemicals are such that the conventional approach to risk assessment needs to be supplemented with, and eventually replaced by, new methods based on rapid, sensitive and relevant non-animal approaches. However, up to now, data from most of the alternatives that are available either for in-house compound prioritisation or for regulatory use, are considered to be suitable mainly for hazard identification, but are rarely, if ever, used quantitatively for hazard classification. These assumptions, and this tradition, need to be re-examined, so that the possibilities for using in vitro toxicity test data to their maximum potential for hazard classification can be explored. For example, this could include the identification of adjustment factors that, by analogy with animal testing, could be used to extrapolate from in vitro tissue culture to the target species (e.g. the human being). Such extrapolations might not be any less precise than those made, for example, from small groups of rats to human populations, particularly when accurate and consistent concentration–response data can be derived from human cells in culture.
The Potential for Developing New Approaches to Risk Assessment
Requirements of tests for RA
Many in vitro tests have been designed and are appropriate for hazard identification, including many of the methods that have been recently validated for regulatory use. Ideally, for a test to be suitable for hazard identification and hazard classification, it should: 1) be mechanistically-based with a biologically plausible relationship between the endpoint being predicted and the phenomenon being modelled; 2) have been validated with reference to the target species in question; 3) have a relevant endpoint that occurs in the target species of interest; and 4) have a prediction model that is readily related to toxicity in the target species. For a test to also be useful for risk assessment, it should ideally have a number of additional properties, including the facility to: 1) model routes of exposure relevant to those experienced by the target species (although it is possible to extrapolate data between routes by using PBPK modelling); 2) produce quantitative data from concentration–response information relevant to target species exposure (e.g. involving human receptors and metabolising enzymes); 3) provide dose–response curves from which potency estimations can be derived; and 4) identify threshold dose levels and plateau effects that can be extrapolated to target species exposure scenarios, and that can be used to clearly define NOAEL values. It will probably be necessary to use a battery of such tests, in order to cover the variety of mechanisms involved in any toxicity endpoint. There is a widely held belief among toxicologists that in vitro tests cannot provide quantitative data which are useful for risk assessment purposes. Many of them consider that the quantitative data from non-animal methods cannot be used to define threshold doses and toxic potency in vivo, due to perceived difficulties in predicting in vivo effects from those detected with cells in tissue culture. 
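As an illustration of requirements 3) and 4) above, quantitative descriptors such as a no-observed-effect concentration and a mid-effect concentration can be read directly off an in vitro concentration–response curve. The data, the adversity cut-off and the units below are all hypothetical:

```python
# Hypothetical sketch: deriving quantitative descriptors (a no-observed-
# effect concentration, NOEC, and an interpolated EC50) from in vitro
# concentration-response data of the kind such tests would supply.
import math

# Concentration (uM) vs fractional effect (invented human-cell data)
conc = [0.1, 1.0, 10.0, 100.0, 1000.0]
effect = [0.00, 0.02, 0.18, 0.55, 0.90]

def noec(conc, effect, adverse=0.10):
    """Highest tested concentration whose effect stays below `adverse`."""
    below = [c for c, e in zip(conc, effect) if e < adverse]
    return max(below) if below else None

def ec50(conc, effect):
    """Log-linear interpolation of the concentration giving 50% effect."""
    points = list(zip(conc, effect))
    for (c1, e1), (c2, e2) in zip(points, points[1:]):
        if e1 < 0.5 <= e2:
            frac = (0.5 - e1) / (e2 - e1)
            return 10 ** (math.log10(c1) + frac * (math.log10(c2) - math.log10(c1)))
    return None

print(f"NOEC  = {noec(conc, effect)} uM")
print(f"EC50 ~= {ec50(conc, effect):.0f} uM")
```

In practice a fitted curve (e.g. a Hill function) and replicate data would be used, but the principle is the same: the in vitro system yields potency and threshold descriptors, not merely a positive/negative call.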
For example, several toxicologists and a number of scientific committees, responding to calls for the increased use of non-animal methods to satisfy the demands of the REACH legislation, have insisted that risk assessments can only be undertaken with animal data. The FRAME workshop participants did not concur with this opinion.
Identifying mechanisms of toxicity
Knowledge about mechanisms of toxicity is important for making informed assessments with regard to particular hazards. For example, such information can be useful for identifying whether any thresholds for toxicity are likely to exist (e.g. receptor-based versus reactive molecular interactions, such as genotoxic, as opposed to non-genotoxic, carcinogenicity), and to decide whether toxicity is reversible (e.g. inherently so, or via the activity of repair enzymes or immune responses). In vitro assays are particularly useful for investigating mechanisms of toxicity, including: 1) the existence of any synergistic or antagonistic effects between the various components of a product; 2) the contribution of any impurities to toxicity; 3) the contribution of any metabolites to toxicity, and the ability of specific enzymes to activate or detoxify a substance; 4) structure/activity relationships; 5) species/organ specificity; 6) receptor-binding affinity (agonistic/antagonistic activity); and 7) dose–response relationships (potency, threshold dose comparisons). Points 1) and 2) are particularly relevant for assessing the toxicities of products, and information about metabolism can be crucial for deciding whether toxicity will be manifested in humans.
Advantages of using in vitro data
There are several good reasons why non-animal data could become more widely used as an input for quantitative risk assessment in the future, including: 1) the increased use of human cells of various types; 2) the improved maintenance of differentiated cells in culture for long periods; 3) the use of genetically-engineered cells with useful and well-defined characteristics; 4) the development of tissue engineering approaches and complex organotypic cell systems; 5) the development of co-cultures of different cell types; and 6) techniques for long-term culture, repeat dosing, and the assessment of recovery. A wide range of different cell types can be cultured, including those from a wide variety of tissues and from several different species. This is very useful, because it enables some measures of target organ and species-specific toxicity to be studied in culture. If human cells can be used, this can minimise problems with inter-species extrapolation.
Other characteristics of in vitro data that need to be considered
In vitro systems lack fully-formed epithelial barriers, as well as a circulatory system for delivering toxic substances to target cells and for eliminating waste products, including test materials and metabolites. However, it is possible to develop tissue culture models capable of exhibiting satisfactory barrier functions, including the blood–brain barrier, and the blood–testis barrier. The lack of metabolism can to some extent be overcome by adding exogenous metabolising fractions or by using metabolism-competent cells. By their very nature, in vitro tissue cultures also lack intact immune, endocrine and nervous systems. While this can be a great advantage when investigating mechanisms of toxicity, it compromises the modelling of whole body responses to potentially toxic substances, and the detection of complex phenomena or processes, such as behavioural effects. The use of tissue cultures is also constrained by the difficulty of obtaining certain cells and tissues, and of maintaining them in a fully-differentiated state. This means that the use of many important types of human cells and tissues is often constrained by their low availability, as well as by safety and ethical issues. However, there have been substantial advances in tissue culture methodology, and a number of these problems are gradually being overcome, so the culture of many highly differentiated cells of important target organs for toxicity is now feasible. Also, the development of human tissue banks is increasing the availability of human cells and tissues.
The use of data provided by the new technologies
Several relatively new technologies, including genomics, transcriptomics, proteomics and metabonomics, are increasingly being used in toxicology. Genomics is the study of the complete gene complement of an organism, cell or organelle, whereas transcriptomics refers to the study of the full complement of activated genes in a specific tissue at a particular time. Proteomics is the study of the entire protein complement expressed by a genome, while metabonomics involves measuring low molecular weight metabolites in a cell at a particular time, and under specific environmental conditions. The use of this collection of techniques is based on the principle that toxicity can affect patterns of gene expression in cells and in tissues. After exposure to a test substance, patterns of gene expression at either the transcriptional (genomic) level or the translational (proteomic) level are compared with those detected after exposure to known toxicants. Differential transcription is measured by using microarrays of oligonucleotides, which detect cDNA and cRNA copies of new transcripts extracted from cells. Specific hybridisation to the oligonucleotides is visualised by scanning under laser light, resulting in several thousand measurements of gene expression. Highly specific panels of oligonucleotides are used to enable tissue-dependent temporal and spatial patterns of gene expression to be identified. Proteomics involves the analysis of total cellular protein. The methods for detecting and quantifying proteins have recently been greatly improved, and the use of this approach is likely to increase markedly. However, it is probable that proteomics will have only medium throughput applications. 
Also, data interpretation is complicated by: 1) molecular sieving; 2) protein retardation; 3) the folding of protein complexes; 4) post-translational modification; 5) inherent analytical detection limits; 6) pI (isoelectric point) limits; and 7) the current high levels of intra-laboratory and inter-laboratory variability. In addition, there is a lack of internal standards and validated analytical tools. Metabolic profiling is based on the premise that a toxic effect is related to a change in the small molecule (e.g. metabolite) contents of cell and body fluids. This technique has the advantage that it can be non-invasive, and can be used with body fluids from volunteers, in, for example, the identification of useful biomarkers of exposure and effect. A key problem in deploying these technologies for use in risk assessment is the complexity of data analysis and interpretation. This is because huge amounts of information are generated and the methodology is highly sensitive to small changes in environmental conditions. This is a particular issue with the analysis of global patterns of gene expression, and has prompted the development of new disciplines to alleviate the problem, such as bioinformatics, which involves the use of on-line access to complex databases. The validation of these techniques for their regulatory applicability will need careful consideration. In the short term, it is likely that toxicogenomic approaches will provide more-useful data when they are focused on specific areas of toxicity, for which the mechanisms of action of toxic chemicals are well-characterised, and for which there are microarrays that target the likely genes involved. A good example of this is sensitisation testing. Nevertheless, the longer-term prospects of obtaining useful information from these technologies are very good, particularly from the combined use of microarrays and global protein analysis.
Moreover, the detection of the induction of specific patterns of gene expression could facilitate the classification of agents with different mechanisms of action. How these technologies will be used as a basis for risk assessment will emerge in the future, as more experience with them is gained.
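As a purely illustrative sketch of the comparison step described above (scoring a test compound's expression profile against reference profiles obtained with known toxicants), the following Python fragment ranks hypothetical reference classes by the Pearson correlation of log fold changes. All of the profiles and class names here are invented for illustration; real toxicogenomic classification uses far larger profiles and more sophisticated statistics.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def classify(test_profile, references):
    """Return the reference toxicant class whose expression profile
    (log fold changes per gene) correlates best with the test compound."""
    return max(references, key=lambda name: pearson(test_profile, references[name]))

# Hypothetical four-gene log fold-change profiles for two toxicant classes.
refs = {
    "genotoxin": [2.0, 1.5, -0.5, 0.1],
    "oxidative_stressor": [-0.2, 0.3, 1.8, 2.2],
}
print(classify([1.7, 1.2, -0.3, 0.0], refs))  # genotoxin
```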
Extrapolating from In Vitro to In Vivo
The principal ways of enhancing the relevance of in vitro test data are to use: 1) cells from the target organism (e.g. human cells or fish cells); 2) generic cells in culture to determine the susceptibility of fundamental cellular processes to test chemicals; 3) cells from target tissues; 4) metabolising systems from target tissues; and 5) test substance concentrations adjusted for levels predicted to arise at target sites in vivo. For example, the GM-CFU assay, which has been validated and recently endorsed for predicting acute neutropenia, is based on this approach. This in vitro assay measures the toxicity of chemicals to granulocyte/macrophage differentiation in tissue culture. It enables the maximum tolerated dose (MTD) in humans for haematotoxicity (e.g. of anti-cancer agents) to be determined, by employing an algorithm based on the relative cytotoxicities of the test chemical to human and mouse GM cells, and on the MTD in whole mice. Biokinetic modelling and the prediction of metabolism can play crucial roles in providing data to inform the selection of potential target tissues and relevant test substance concentrations, by relating the concentration of test compound applied in the in vitro system to the internal dose in the in vivo system being modelled. For example, in vitro test protocols for mutagenicity testing have sometimes been improved, in order to reduce the incidence of false positive data and to increase their specificity. Such improvements include the use of altered biotransformation systems (e.g. enzyme fractions from relevant target organs, other than liver S9, for phase 2 biotransformation, or whole cell biotransformation systems, to improve the balance between activation and detoxification).
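The interspecies scaling step of the GM-CFU algorithm can be sketched in outline as follows. This is a simplified reading of the published approach, in which the mouse MTD is adjusted by the ratio of the in vitro sensitivities of human and mouse GM progenitor cells; the numerical values used here are hypothetical.

```python
def predict_human_mtd(mouse_mtd, ic90_human, ic90_mouse):
    """Scale the mouse MTD by the ratio of in vitro sensitivities.

    ic90_human and ic90_mouse are the concentrations inhibiting
    GM colony formation by 90% in human and mouse cultures; they
    must share the same units, so the ratio is dimensionless.
    """
    return mouse_mtd * (ic90_human / ic90_mouse)

# Hypothetical values: human progenitors twice as sensitive as mouse,
# so the predicted human MTD is half the mouse MTD.
human_mtd = predict_human_mtd(mouse_mtd=100.0, ic90_human=5.0, ic90_mouse=10.0)
print(human_mtd)  # 50.0
```

The point of the sketch is the structure of the extrapolation: a single in vivo anchor (the mouse MTD) is corrected by a purely in vitro measure of relative species sensitivity.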
A further important consideration is the difference in the inherent behaviour and responses of cells when they are outside the body in tissue culture, as opposed to when they are in their normal in vivo locations in target tissues. There could be many reasons for these differences, including the fact that, when cultured in vitro, the cells will variously have altered internal and external structures, cytoskeletons, membrane transporters, attachments, and communications with other cells and substrata. In addition, there will often be an absence of some cell types, or changes in the relative proportions of the different cell types, compared with the situation in vivo. The use of organotypic or reconstructed tissue systems can reduce the extent of these problems, but it is not easy to imagine a general approach to estimating the differences or similarities between an in vitro model and the situation in vivo, so this issue has to be approached on a case-by-case basis.
Adjustment factors for in vitro data
Since adjustment factors are applied to the interpretation of animal test data when they are to be used in risk assessment, the use of such factors should, in principle, be no less acceptable when interpreting in vitro data for risk assessment purposes, particularly where such data are considered to be relevant for hazard classification. Ideally, new risk assessment paradigms should aim to reduce the uncertainties inherent in the process. However, the term “adjustment factor” should not be applied in the same way to in vivo and in vitro data, since different parameters should be applied (a tentative suggestion as to how this could be achieved is made in Appendix 3 of the workshop report). Reducing the amount of adjustment required for in vitro data is possible. For example, differences in metabolism could be minimised by the careful selection of metabolising systems to be used in vitro, or via the testing of known or suspected principal metabolites. Differences in dose can be reduced by using in vitro and in vivo biokinetic modelling to determine internal and bioavailable concentrations/doses, respectively. However, minimising the intrinsic differences in the responsiveness of cells in vitro and in vivo, and determining the correct nature of the critical toxic events, are probably two of the most difficult issues to be tackled. The most straightforward way to address this problem is to use human target cells or tissues in organotypic co-culture systems, but: 1) this is not always possible; and 2) some fundamental and unavoidable differences will always remain between responses in vivo and in vitro. It might be possible to ascertain the nature and extent of some of these differences by using toxicogenomic information. This could be obtained by applying recently-developed techniques such as cytomics and telomics, which link genomics, proteomics and metabonomics with the dynamics of cell and tissue function and with specific mechanisms of toxicity.
In vivo biokinetic modelling, and in particular, physiologically-based pharmacokinetic (PBPK) modelling, has the potential for improving the predictive value of in vivo toxicity data and reducing the uncertainty of risk assessments based on such information. The aim of using PBPK modelling in risk assessment is to provide a target organ dose (internal dose) instead of the dose level applied to the animals (the external dose). PBPK modelling involves making good estimates of the internal organ and tissue concentrations of externally-applied doses in whole animals, by mathematical modelling of the animal body in ways which are specific for each species and route of administration/exposure. A PBPK model is an independent structural model, comprising the tissues and organs of the body, each connected via, and perfused by, the blood circulatory system. The generation of such a model requires species-specific anatomical, physiological, biochemical and physicochemical information, including, for example: 1) respiration rate; 2) cardiac output; 3) organ weights and blood perfusion rates; 4) rates of resorption, metabolism and elimination of xenobiotics; and 5) the distribution of the test chemical between the blood, organs and tissues. Traditionally, PBPK modelling has not been widely used, in view of its perceived mathematical intricacy and the complexity of the data required for a variety of parameters, which render it both demanding and time-consuming. However, these problems are being addressed by the Health & Safety Laboratory (HSL) in the UK, via the development of: 1) an electronic database containing all the anatomical, physiological, biochemical and physicochemical parameters required to build a PBPK model; 2) a PBPK model equation generator (MEG); and 3) access to advanced statistical and sensitivity analyses of PBPK model output.
In the HSL system, the anatomical, physiological, biochemical and physicochemical data required to build PBPK models are entered into a specifically designed electronic database. The different values for the various parameters are stored along with their source and an indication of their quality. The aim of the database is to make the selection of all the parameters for a model both easy and rapid. The output from the database is in a form which feeds into the MEG, which interacts with the user in a non-mathematical way. The mathematical equations are generated automatically, to enable them to be accessed and used with standard commercial simulation software packages. Currently, the MEG is a stand-alone code generator that eliminates the need to formulate and code a set of equations. The user is engaged in a dialogue relating to the details of the physiology of the system to be modelled, and to the biochemistry and physico-chemistry of the compound of interest. On the basis of this information, a script is produced. This greatly reduces development time (from days to minutes), and obviates the need for any mathematical expertise. The third component of the HSL initiative, the Sensitivity and Advanced Statistical Analysis of PBPK Model Output, is a piece of software for the rapid and quantitative assessment of the robustness of the models and the uncertainties associated with using them. It involves applying the extended Fourier Amplitude Sensitivity Test (FAST), a global sensitivity analysis technique which incorporates parameter interactions in non-linear models. Since many existing PBPK models can be generated in minutes, it is likely that the wider availability of the above software will transform the use of PBPK modelling, resulting in a much greater use of the approach. 
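To make the structure of such models concrete, the following is a deliberately minimal, perfusion-limited PBPK sketch in Python: one metabolising liver compartment plus a lumped rest-of-body compartment, connected via the blood and integrated with a simple Euler scheme. All parameter values are invented for illustration; a real model would draw species-specific values from a curated database of the kind described above, and would use a robust ODE solver.

```python
# Toy rat-like parameter set; every value is illustrative only.
PARAMS = {
    "Q_c": 5.6,       # cardiac output, L/h
    "Q_liv": 1.1,     # liver blood flow, L/h
    "V_blood": 0.02,  # blood volume, L
    "V_liv": 0.01,    # liver volume, L
    "V_rest": 0.20,   # rest-of-body volume, L
    "P_liv": 4.0,     # liver:blood partition coefficient
    "P_rest": 1.5,    # rest-of-body:blood partition coefficient
    "CL_int": 0.5,    # hepatic intrinsic clearance, L/h
}

def simulate_iv_bolus(dose_mg, t_end_h=8.0, dt=0.001, p=PARAMS):
    """Euler integration of the amounts (mg) in blood, liver and the
    rest of the body after an intravenous bolus dose."""
    a_blood, a_liv, a_rest = dose_mg, 0.0, 0.0
    q_rest = p["Q_c"] - p["Q_liv"]
    cmax_liver = 0.0
    for _ in range(int(t_end_h / dt)):
        c_blood = a_blood / p["V_blood"]
        c_liv = a_liv / p["V_liv"]
        c_rest = a_rest / p["V_rest"]
        # Venous blood leaves each tissue at C_tissue / P_tissue;
        # metabolism removes chemical from the liver at CL_int * C_venous.
        d_liv = (p["Q_liv"] * (c_blood - c_liv / p["P_liv"])
                 - p["CL_int"] * c_liv / p["P_liv"])
        d_rest = q_rest * (c_blood - c_rest / p["P_rest"])
        d_blood = (p["Q_liv"] * (c_liv / p["P_liv"] - c_blood)
                   + q_rest * (c_rest / p["P_rest"] - c_blood))
        a_blood += d_blood * dt
        a_liv += d_liv * dt
        a_rest += d_rest * dt
        cmax_liver = max(cmax_liver, a_liv / p["V_liv"])
    return {"remaining_mg": a_blood + a_liv + a_rest,
            "cmax_liver_mg_per_l": cmax_liver}
```

Even this toy model exhibits the features that matter for risk assessment: it predicts a peak internal (liver) concentration from an external dose, and changing the route, species or exposure scenario only changes the parameter set and input terms, not the model structure.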
In summary, PBPK models have several advantages and uses relevant to risk assessment, including: 1) their ability to explain, as well as to describe, data, due to their mechanistic basis; 2) their ability to permit the prediction of tissue concentrations of chemicals for the determination of tissue dosimetry and the identification of potential target organs, which is crucial to risk assessment; and 3) their ability to permit extrapolation between different routes of administration, different species, high to low dosage, and different exposure scenarios. By analogy with the determination of the internal target organ dose in animal studies, it is important to assess the bioavailability of a test substance in an in vitro assay system, a process that could be called in vitro biokinetics. Gülden et al. have applied such considerations to improving the interpretation of quantitative data from in vitro toxicity assays. They have also used in vitro distribution modelling to convert the toxic concentrations of substances tested in tissue culture (the nominal toxic concentrations) to free available concentrations at the target site within the test system (free toxic concentrations; as is illustrated in Appendix 4 of the workshop report). Free toxic concentrations, as opposed to nominal toxic concentrations, are considered to be the appropriate, medium composition-independent measures of toxicity in vitro. Risk assessment is based on using tests that involve external exposure levels (doses) and external dose–response relationships (the no observed adverse effect level [NOAEL], the lowest observed adverse effect level [LOAEL], and the benchmark dose [BMD]). Target cell or target tissue concentrations cannot be used directly for quantitative hazard classification. A principal way of converting toxic concentrations determined in vitro into internal and external doses in vivo — the “equivalent exposure” concept — has been proposed (as is discussed in Appendix 5 of the workshop report).
The basic idea is that nominal concentrations of a chemical in vitro, and the internal or external doses of this chemical in vivo, are considered to be equivalent, if they result in the same free concentration of that chemical. This approach takes into account differences in the biokinetics of a chemical in vitro and in vivo (Figure 1). In vitro distribution modelling can be used to convert nominal toxic concentrations determined in vitro into free toxic concentrations, and in vivo distribution modelling can be applied to convert these free toxic concentrations into equivalent tissue concentrations. Biokinetic modelling (such as PBPK modelling), taking into account resorption, biotransformation, excretion and distribution, can then be applied to convert equivalent tissue concentrations into equivalent external doses. The equivalent exposure approach has been applied to the development of a prediction model for extrapolating equivalent serum concentrations from in vitro active concentrations (as is described in Appendix 5 of the report). Gülden et al. have also shown that the low sensitivity of in vitro cytotoxicity assays compared with in vivo fish acute toxicity studies for the same chemicals can, at least in part, be attributed to differences in the availabilities of chemicals in vivo and in vitro. PBPK modelling has also been used to extrapolate between effects seen in vitro and those predicted to occur in vivo, when combined with information on levels of cytotoxicity to target organ cells in culture, and this has been successfully used as a part of integrated testing strategies.
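The equivalent exposure idea can be sketched with a simple linear partitioning model: a nominal in vitro concentration is first converted to a free concentration, and the free concentration is then converted back to an equivalent total concentration in another medium, such as serum. The partition constants and medium compositions below are hypothetical, and are not values from Gülden et al.

```python
def free_concentration(c_nominal, protein_g_l, k_protein, lipid_g_l, k_lipid):
    """Free concentration in a medium, assuming binding to protein and
    partitioning into lipid that are both linear in sorbent content
    (i.e. no saturation of binding sites)."""
    return c_nominal / (1.0 + k_protein * protein_g_l + k_lipid * lipid_g_l)

def equivalent_total_concentration(c_free, protein_g_l, k_protein,
                                   lipid_g_l, k_lipid):
    """Invert the partition model: the total concentration in another
    medium (e.g. serum) that yields the same free concentration."""
    return c_free * (1.0 + k_protein * protein_g_l + k_lipid * lipid_g_l)

# Hypothetical constants: culture medium with 10% serum versus whole serum.
c_free = free_concentration(100.0, protein_g_l=7.0, k_protein=0.2,
                            lipid_g_l=0.1, k_lipid=1.0)   # 40.0
c_serum = equivalent_total_concentration(c_free, protein_g_l=70.0,
                                         k_protein=0.2, lipid_g_l=1.0,
                                         k_lipid=1.0)     # 640.0
```

The sketch shows why nominal in vitro concentrations can seriously understate equivalent in vivo exposures: the same free concentration corresponds to a much higher total concentration in protein- and lipid-rich serum than in dilute culture medium.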
Risk Assessment Based on In Vitro Data
By analogy with the way in which animal data are used for risk assessment, the first step in using in vitro data is to obtain a no observable adverse effect concentration (e.g. in mg/mL), and to examine the nature of the effect of increasing concentration on toxicity in a suitable in vitro test system, by determining a concentration–effect relationship. Ideally, the test system should be selected according to: 1) the target species; and 2) the potential major target organ(s)/ tissue(s). The nature of the potential target organ(s) might also dictate the type of biotransformation system to be used, if appropriate (e.g. liver or lung subcellular enzyme fractions). The identity of potential target organs could be inferred from the results of biokinetic modelling, as well as from all the available knowledge on the target organs for related molecules. The in vitro data could then be adjusted, by using in vitro biokinetics to determine the available concentration at the in vitro cellular target site for the endpoint being measured (the free effective concentration), according to the nature of the in vitro test system (including medium composition, cell type, cell number, and the physicochemical properties of the test substance), as is explained in Appendix 5 of the workshop report, to provide what could be termed the no observable free effective concentration in vitro. This concentration could be used for risk assessment in two main ways: 1) by applying adjustment factors (see Appendix 4 of the workshop report); and 2) by applying the equivalent exposure concept (as discussed earlier, and as summarised in Figure 1).
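By analogy with the safety factors applied to animal NOAELs, the application of adjustment factors to an in vitro no-effect concentration could take the following simple form. The factor names and values here are hypothetical placeholders, and are not those proposed in Appendix 3 of the workshop report.

```python
def adjusted_safe_concentration(noec_free, factors):
    """Divide a no observable free effective concentration by the
    product of all applicable adjustment factors."""
    product = 1.0
    for value in factors.values():
        product *= value
    return noec_free / product

# Hypothetical in vitro-specific factors; a real scheme would derive
# these case by case, as the workshop report recommends.
factors = {
    "inter_individual_variability": 10.0,
    "in_vitro_to_in_vivo_responsiveness": 3.0,
}
print(adjusted_safe_concentration(6.0, factors))  # 0.2
```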
Applying in vitro data in practice
Regulatory testing is undertaken for new chemical substances, biocides, high production volume (HPV) chemicals, and food contact materials. The design of a testing programme should take account of: 1) financial costs; 2) the timeframe for when the results are needed; 3) the regulatory certainty of the result required; 4) animal welfare considerations; and 5) local animal protection legislation. Point 5) refers to any requirement to use the least stringent animal procedures, and to use non-animal alternatives wherever feasible. There are two different ways in which data from in vitro tests are currently used: 1) when a negative in vitro result is verified in animals; and 2) when a positive result is used to define a potential for toxicity that is confirmed by further in vivo testing. The latter use occurs routinely in the case of genotoxicity testing, where in vitro assays have been developed to be hypersensitive (for example, due to the use of DNA repair mutants, cells with enhanced permeability to xenobiotics, and the incorporation of exogenous enzyme fractions that preferentially activate chemicals). Usually, the main way in which a positive in vitro result is perceived as obviating the need for using animals, is when the result leads to the discarding of a chemical, or when it can be used for classifying and labelling the chemical as being toxic for the endpoint in question. Currently, the use of non-animal approaches for regulatory testing is limited to those tests that have been accepted by the regulatory agencies as: 1) definitive (for which both negative and positive results are acceptable); 2) for screening (for which only positive results are usually considered definitive); and 3) for providing supporting data (for which neither positive nor negative results on their own are considered sufficient).
However, in certain cases, the regulatory authorities are beginning to accept a positive in vitro result as sufficient, when it is considered unlikely that there would be a good reason for a subsequent animal test to be negative. A good example of this is when positive in vitro genotoxicity data obviate the need to conduct a carcinogenicity bioassay, due to the high sensitivity of in vitro genotoxicity tests for predicting rodent carcinogenicity. Two other commonly used in vitro methods for regulatory toxicity testing are the Transcutaneous Electrical Resistance (TER) assay for corrosivity testing, and the 3T3 Neutral Red Uptake (NRU) phototoxicity test. Lastly, in silico prediction methods have only been accepted by regulatory bodies for providing supporting data, particularly for prioritising chemicals for further testing. An example of this application is the use of the knowledge-based expert system, DEREK, for predicting skin sensitisation and mutagenicity. In summary, while in vitro genotoxicity studies are widely accepted for classification and labelling purposes, in vivo data have traditionally been required for this purpose for other toxicity endpoints. However, there is an increasing trend toward labelling substances as hazardous on the basis of positive results from in vitro toxicity studies alone. [N.B. The workshop report contains some examples of case studies on the use of in vitro data for quantitative risk assessment. It also contains a detailed table on in vitro tests of relevance to regulatory toxicity testing.]
The FRAME workshop participants considered that the existing paradigm for risk assessment could be substantially improved by using non-animal data, particularly for meeting the new regulatory demands resulting from new developments such as the EU REACH legislation and the requirements of the 7th Amendment to the EU Cosmetics Directive. This is because: 1) an accurate quantitative estimation of safety is often unnecessary or inappropriate for assessing risk; 2) in vitro data can sometimes be sufficient for classification and labelling; 3) by applying both in vitro and in vivo biokinetic modelling and other approaches, it should be possible to greatly increase the relevance of in vitro toxicity data (the fact that this practice is rare is no reason why it should not be followed in the future); and 4) it is, in principle, possible to develop ways of using quantitative information (concentration–response data) from in vitro tests to estimate equivalent internal and external doses in animals or humans, and/or to use adjustment factors specific to the correction of in vitro data. These ideas are summarised in Figures 1 and 2. Existing, and some possible future, schemes for risk assessment are presented in Figure 3. Currently, the regulatory agencies use in vivo data either alone or in conjunction with mechanistic information from in vitro tests. However, it is suggested that the use of in vitro data for risk assessment purposes, in conjunction with adjustment factors, can be justified, and that it would help in an overall risk assessment including the use of animal toxicity data. Thus, instead of the reliance of regulatory agencies primarily on in vivo data (pathway  in Figure 3), in vitro information could be used in a parallel way for risk assessment (pathway  in Figure 3, as originally proposed by Mark Chamberlain), or even in its own right (pathway  in Figure 3), according to the proposals suggested in the workshop report.
In some cases, and particularly where in vivo data are lacking (as with many existing chemicals to be re-assessed under the REACH system), the possibility of using in vitro data alone for risk assessment should be seriously considered, as a means of easing the likely testing and regulatory burden imposed by the new legislation. The workshop participants accepted that gaining regulatory acceptance for the use of non-animal approaches alone for risk assessment will necessarily take time. However, this process would be greatly facilitated by: 1) the development of scientifically rigorous alternative approaches and strategies for their application to risk assessment; and 2) the involvement of regulators willing to accept and use them as a result of changes in the way the regulatory process is conducted and managed. The workshop participants also agreed a number of other conclusions:
- All risk assessments are iterative and should be revisited in the light of new test information, or as new insights into how to interpret the information become available.
- Currently, much regulatory emphasis is placed on the results of testing in animals. What testing is appropriate depends on the intended use of the test substance, and should be directed toward identifying the information which is critical to allowing a relevant risk assessment to be undertaken. It could also be conducted in stages, with more-detailed regulatory testing only being undertaken if previous evaluations in the series indicated that this was necessary. Current testing strategies for chemicals are based on generalities, and are targeted at obtaining regulatory clearance as quickly as possible. This is likely to lead to an emphasis on undertaking a full set of animal-based regulatory tests. Furthermore, as testing strategies tend to be agreed at high levels, this leads to a “lowest common denominator” requirement, with the result that more testing, rather than more-targeted and/or more-intelligent testing, is the usual outcome.
- Much more attention could be paid to exposure-driven risk assessment through the selection of different testing requirements for specific end-uses. In particular, there should be more-detailed consideration of possible exposure scenarios, as well as of levels and durations of exposures, possibly in combination with the investigation of physicochemical properties, structure–activity relationships, in vitro testing, and biomonitoring. The resulting information could be used to determine whether an in vivo regulatory test was critical for risk assessment.
- Both ethical and scientific concerns should lead to the greater use of non-animal testing at the expense of animal testing. Information from non-animal tests should be used in ways that optimise its relevance for predicting hazard and risk to humans, independently of any animal testing that might be conducted.
- All the tests and testing schemes used for risk assessment need to be independently validated, to show that they are relevant and reliable for their specific purposes.
- By using in vitro methods, it is already possible to identify several key mechanisms of toxicity, such as DNA damage, cell membrane and organelle membrane damage, and effects on intermediary metabolism, as well as those causing more-specific toxicity, such as phototoxicity, corrosivity, and damage to barriers. However, such information is mainly used for hazard identification, rather than hazard classification, for both non-regulatory and regulatory purposes. In addition, such information can also be sufficient for classification and labelling, as can positive data from tests such as the BCOP, IRE and HET-CAM assays, for identifying severe eye irritants.
- In order to be able to use hazard identification from non-animal methods, it is necessary to take account of factors that affect the biokinetics of the applied doses to the test system, both in the whole organism and in in vitro tissue culture models. The use of differentiated human cells from target organs, where necessary and when available, might lead to the production of more-relevant data.
- Predictions of both likely and intended exposure, and of hazard, are essential prerequisites for obtaining meaningful risk assessments for existing chemicals. For new chemicals, less information on exposure would be likely to be available, so more-general estimates of risk might therefore have to be made.
- An overall testing scheme involving PBPK modelling and a battery of non-animal methods could, and should, be devised to improve the use of non-animal methods in risk assessment.
- In such a scheme, the concentrations of test substance to be used in the in vitro tests should be defined on the basis of a tissue equivalent dose (taking into account the biokinetics of the test substance which affect its bioavailability at the target site). It should also be possible to develop adjustment factors (analogous to the adjustment factors used with in vivo hazard data) to enable the extrapolation of quantitative information (on dose–response) from an in vitro test system to the human situation. However, such factors should only be used on a case-by-case basis, to account for the existence of residual risk, including that due to biological variation.
- The ways in which non-animal data could be used in risk assessment would depend, for example, on how much pre-existing data, particularly on exposure, were available, to enable a decision to be made as to whether to undertake biokinetic modelling or non-animal testing first. Also, the scheme might need to be used iteratively.
- The potential value of the “omics” technologies is very large, but the methods are still very much at the developmental phase, as are new informatics approaches for processing and optimising the use of the large amounts of data they produce.
- Cost and complexity should not be used to argue against the development and use of the “omics” technologies, since establishing the facilities and infrastructure for conventional animal toxicity testing is also expensive, as well as logistically problematical.
The workshop report concluded with a number of recommendations:
- The use of in vitro biokinetics should be actively promoted, as it has great potential for facilitating the application of in vitro toxicity data for risk assessment, by enabling the tissue equivalent toxic dose in the in vitro test system to be calculated.
- A strategy should be devised for hazard classification that involves the combined use of biokinetic modelling and in vitro biokinetics, together with in vitro approaches to predicting no-effect dose levels (equivalent to NOAELs), eventually without the need for animal data.
- Non-animal methods, such as (Q)SAR modelling and read-across, should be used to facilitate hazard classification and risk assessment.
- There should be an initiative to compile an inventory of case studies detailing how expert opinion has been used to make risk assessments on the basis of non-animal and animal data, with respect to regulatory decisions made, and the outcome of any further safety testing undertaken.
- There is an urgent need at all levels to communicate, especially to members of the general public, clear and simple information about the differences and interrelationships between safety, hazard and risk associated with exposure to chemicals.
- It is recommended that maximum use is made of advances in obtaining clinical data (for example, from microdosing studies with volunteers) as a means of identifying useful biomarkers of exposure and effect, and to facilitate the detection of reliable early signs of toxicity in vitro, so that non-animal methods for hazard identification could be made more relevant.
- Regulatory and legislative bodies need to consider, now, how best to manage the necessary change in the way the risk assessment of chemicals should be undertaken, such that this can proceed through a phase involving the intelligently combined use of in vitro and in vivo data, followed by a progressive and scientifically-based switch to a greater reliance on information from non-animal methods and testing strategies alone.
©2007 Michael Balls & Robert Combes