Multiple-choice questions (MCQs) are among the most popular tools for evaluating learning and knowledge in higher education. Several indices already exist to measure the reliability and validity of these questions, for instance the difficulty of a particular question (item) or its ability to discriminate between lower and higher levels of knowledge. In this work, two new indices are constructed: (i) the no-answer index, which measures the relationship between the number of errors and the number of unanswered responses; and (ii) the homogeneity index, which measures the homogeneity of the wrong responses (distractors). Both indices are based on the lack-of-fit statistic, whose distribution is approximated by a chi-square distribution when the number of errors is large. An algorithm combining several traditional indices with the new ones has been developed to continuously refine a database of MCQs. The final objective of this work is to classify the MCQs in a large item database in order to build an automated-supervised system for generating tests with specific characteristics, such as greater or lesser difficulty or capacity to discriminate knowledge of the topic.
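
As an illustration of the kind of lack-of-fit computation such an index could rest on, the following minimal sketch computes a Pearson chi-square statistic for the distractor counts of a single item. It is not the definition used in this work: the function name `distractor_homogeneity_statistic` is hypothetical, and the sketch assumes that "homogeneous" means every distractor is chosen equally often among the wrong answers.

```python
def distractor_homogeneity_statistic(distractor_counts):
    """Pearson lack-of-fit statistic for the wrong answers of one item.

    Assumption (not taken from the source): under homogeneity, the errors
    are spread uniformly over the distractors.

    Returns a tuple (statistic, degrees_of_freedom).
    """
    total_errors = sum(distractor_counts)
    n_distractors = len(distractor_counts)
    if n_distractors < 2 or total_errors == 0:
        raise ValueError("need at least two distractors and at least one error")

    # Expected count per distractor if errors were spread evenly.
    expected = total_errors / n_distractors

    # Sum of (observed - expected)^2 / expected over all distractors.
    statistic = sum(
        (observed - expected) ** 2 / expected for observed in distractor_counts
    )
    return statistic, n_distractors - 1


# Hypothetical item with three distractors chosen 20, 5 and 5 times.
stat, dof = distractor_homogeneity_statistic([20, 5, 5])
```

With a large number of errors, the statistic can be compared with a chi-square critical value for the returned degrees of freedom (5.991 at the 5% level for 2 degrees of freedom); a larger value would suggest that the distractors are not attracting errors evenly.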