Several in silico models have been developed and implemented for evaluating mammalian acute toxicity from acute oral toxicity data expressed as the median lethal dose (LD50). We compared five software programs (TOPKAT, ACD/ToxSuite, TerraQSAR, ADMET Predictor and T.E.S.T.) on a dataset of 7417 chemicals. We assessed the models’ performance both on the quantitative predictions and, for classification, against the toxicity thresholds defined in the Classification, Labelling and Packaging (CLP) regulation. ACD/ToxSuite gave the best results, with an r² of 0.79 and an accuracy of 0.66. However, its performance dropped for molecules not present in its training set, and the other models behaved similarly. Taking the applicability domain (AD) information into account improved the models’ performance, but not enough for molecules outside the models’ training sets. We also examined chemical classes and found that all models performed well for certain classes (e.g. hydrazones and sulphides), while other classes were consistently poorly predicted (e.g. aromatic secondary amides).
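The two performance measures mentioned above can be illustrated with a minimal sketch. It assumes that the classification uses the standard CLP acute oral toxicity categories (upper bounds of 5, 50, 300 and 2000 mg/kg body weight) and that r² is the squared Pearson correlation between experimental and predicted values; the abstract states neither choice explicitly. The function names and the example values are hypothetical and are not taken from the study.

```python
from bisect import bisect_left

# Assumption: CLP acute oral toxicity upper bounds in mg/kg body weight
# for categories 1 to 4; anything above 2000 is treated as "not classified".
CLP_ORAL_BOUNDS = [5.0, 50.0, 300.0, 2000.0]


def clp_category(ld50_mg_per_kg):
    """Return the CLP acute oral category (1-4), or 5 for 'not classified'."""
    return bisect_left(CLP_ORAL_BOUNDS, ld50_mg_per_kg) + 1


def squared_pearson(observed, predicted):
    """Squared Pearson correlation (assumed here to be the reported r2)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sxy = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    syy = sum((p - mean_pred) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)


def classification_accuracy(observed, predicted):
    """Fraction of chemicals whose predicted CLP category matches the observed one."""
    hits = sum(clp_category(o) == clp_category(p) for o, p in zip(observed, predicted))
    return hits / len(observed)


if __name__ == "__main__":
    # Hypothetical LD50 values in mg/kg, for illustration only.
    experimental = [3.0, 40.0, 250.0, 1500.0]
    predicted = [4.0, 60.0, 200.0, 2500.0]
    print(squared_pearson(experimental, predicted))
    print(classification_accuracy(experimental, predicted))
```

A prediction can be numerically close to the experimental value and still fall in the wrong category when the two values straddle a threshold, which is one reason a model can show a high r² alongside a more modest classification accuracy.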