QUALITIES OF A GOOD TEST

 

INTRODUCTION

Anulekha M - 2113312005005


The ultimate goal of education in any form is to ensure the effective transfer of knowledge to the students. An evaluation is then required to verify how much of that knowledge has been retained, can be recalled, and has been integrated into practice. A good test or evaluation carries this responsibility.

Furthermore, a good test provides feedback for the students, for the teachers and educators, and for the management and governments that make policies and draft education plans.

For a student, the result of a test should indicate their strong and weak areas so that they can maintain their strengths and improve on their weaknesses. It also points them towards their fields of interest, which they may go on to pursue as a career.

For the teachers and educators, the results of a test should indicate the effectiveness of their teaching styles, methods, aids and materials, and the students' ability to understand and reproduce what is taught. This helps them constantly improve their craft, which in turn helps the students learn better.

For the management and governments, the results of the tests conducted provide data that they can analyze to modify the education system so that it is tailor-made for students to gain the maximum benefit from learning.

As seen, the importance of a test cannot be overstated. It is one of the most vital parts of education and learning. Certain qualities define a good test:

  • Objectivity

  • Reliability

  • Validity

  • Administrability


There are also multiple tools of evaluation, or media through which a test is conducted in education. Detailed information about each of them is given below.



OBJECTIVITY

Anulekha M - 2113312005005


Objectivity is a quality that eliminates the influence of personal biases, feelings, emotions and false facts. It is often linked to observation as part of the scientific method. It is thus intimately related to the aim of testability and reproducibility.

Gronlund and Linn state that “Objectivity of a test refers to the degree to which equally competent scorers obtain the same results. So a test is considered objective when it makes for the elimination of the scorer’s personal opinion and biased judgement. In this context there are two aspects of objectivity which should be kept in mind while constructing a test.”

These two aspects are:

  • Objectivity in evaluation

  • Objectivity in interpretation

Objectivity in evaluation refers to the absolute nature of the rightness or wrongness of the answers. Any two evaluators scoring the same test should arrive at the same score, without any margin of error. This removes the influence of the evaluators' personal judgement and makes the identity of the test taker irrelevant. Only the answer to the given question matters in scoring the test.
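The idea can be sketched in code: if scoring depends only on a fixed answer key, any evaluator applying the same procedure must arrive at the same score. The answer key and student responses below are purely hypothetical examples.

```python
# Objective scoring: the score depends only on a fixed answer key,
# never on who evaluates or who is being evaluated.
# The key and responses below are hypothetical.

ANSWER_KEY = {1: "b", 2: "d", 3: "a", 4: "c"}

def score(responses):
    """Return the number of answers matching the fixed key."""
    return sum(1 for q, ans in responses.items() if ANSWER_KEY.get(q) == ans)

student = {1: "b", 2: "d", 3: "c", 4: "c"}

# Two "evaluators" applying the same procedure must agree exactly.
evaluator_1 = score(student)
evaluator_2 = score(student)
assert evaluator_1 == evaluator_2
```

Because the procedure consults only the key, the evaluator's opinions and the test taker's identity play no role in the result.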

Objectivity in interpretation refers to the sameness in how any two students understand a question in a test. It means that the question calls for one singular right answer. Therefore questions that have multiple possible right answers and dual meanings are usually avoided. Objectivity in interpretation removes vagueness and ambiguity. 

Having seen the properties of objectivity, one can understand that multiple-choice questions, and mathematical, scientific and grammatical questions, can easily maintain objectivity, as they rest on hard facts and single right answers. For analytical and critical questions in essay form, however, objectivity cannot be achieved by prescribing a fixed model text that every student must memorize and reproduce. Instead, criteria can be drawn up covering writing style, the key facts that must appear in the essay, how arguments should be constructed, and the academic language to be used. If these criteria are communicated to both students and evaluators, interpretive questions can be evaluated objectively on that basis.

This is how and why a test has to be objective to be good.








RELIABILITY

Aarthi T - 2113312005001


Reliability means consistency, dependability, and trust. The results of a reliable test should be dependable: they should remain stable and consistent, and should not differ when the test is used on different days. A reliable test yields similar results with a similar group of students taking the same test under identical conditions.

Reliability thus concerns three things: the test itself, the way it is marked, and the way it is administered. The three aspects of reliability are named equivalence, stability, and internal consistency (homogeneity).

The first aspect, equivalence, refers to the amount of agreement between two or more tests administered at nearly the same point in time. Equivalence is measured by administering two parallel forms of the same test to the same group, either at the same time or after some delay.

The second aspect, stability, is said to occur when similar scores are obtained on repeated testing of the same group of respondents; in other words, the scores are consistent from one time to the next. Stability is assessed by administering the same test to the same individuals under the same conditions after some period of time.

The third aspect, internal consistency (or homogeneity), concerns the extent to which the items on the test are measuring the same thing.

Reliability may be threatened by various factors:

  • Unclear marking criteria. A subjective test such as an oral interview, or a test involving the production of a written text, may be unreliable if there are no clearly defined marking criteria which all markers adhere to. Otherwise, different markers might award or deduct marks for different reasons, threatening mark-remark reliability: if the same oral interview were marked by two different markers, the results might be quite different because one concentrated on accuracy while the other gave more credit for fluency.

  • The conditions under which the test is taken. For example, if there is suddenly a lot of background noise during a listening test, the candidates are unlikely to achieve the same results as they would in a silent room. Test-retest reliability is therefore threatened.

  • Confusing rubrics. If learners do not fully understand the instructions, they will go "off track" in the task and obtain a lower score than they otherwise would. This again threatens test-retest reliability: if a learner recognizes the problem halfway through the test, on a retake they would be prepared and would do better.

  • The length of the test. Longer tests produce more reliable results than very brief quizzes; in general, the more items on a test, the more reliable it is considered to be.

  • The administration of the test, including the classroom setting (lighting, seating arrangements, acoustics, lack of intrusive noise, etc.) and how the teacher manages the test administration.

  • The affective state of the students. Test anxiety can affect students' test results.
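Internal consistency is commonly quantified with Cronbach's alpha, a standard coefficient for this purpose (not named in the text above). A minimal sketch, using hypothetical item scores for four students on a three-item test:

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-item score columns.

    item_scores: one list per test item, each holding every student's
    score on that item (students in the same order throughout).
    """
    k = len(item_scores)
    item_variances = [statistics.pvariance(item) for item in item_scores]
    totals = [sum(per_student) for per_student in zip(*item_scores)]
    total_variance = statistics.pvariance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical scores of four students on three items of one test.
items = [
    [4, 3, 5, 2],   # item 1
    [5, 3, 4, 2],   # item 2
    [4, 2, 5, 1],   # item 3
]
alpha = cronbach_alpha(items)
# Values near 1 suggest the items are measuring the same thing.
```

A low alpha would signal that the items pull in different directions, i.e. the test is not homogeneous.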










VALIDITY 

Anushiya Mary Y - 2113312005006


The term validity refers to whether or not the test measures what it claims to measure. On a test with high validity, the items will be closely linked to the test’s intended focus. Unless a test is valid it serves no useful function.


One of the most important types of validity for teachers is content validity which means that the test assesses the course content and the outcomes using formats familiar to the students.


Content validity is the extent to which the selection of tasks in a test is representative of the larger set of tasks of which the test is assumed to be a sample. A test needs to be a representative sample of the teaching contents as defined and covered in the curriculum.


Like reliability, there are also some factors that affect the validity of test scores.


The first important characteristic of a good test is validity: the test must really measure what it has been designed to measure. Validity is often assessed by exploring how the test scores correspond to some criterion, that is, some behaviour, personal accomplishment or characteristic that reflects the attribute the test is designed to gauge. Assessing the validity of any test requires careful selection of an appropriate criterion measure, and reasonable people may disagree as to which criterion measure is best. This is equally true of intelligence tests: reasonable people may disagree as to whether the best criterion measure of intelligence is school grades, teacher ratings or some other measure. If we are to check the validity of a test, we must settle on one or more criterion measures of the attribute that the test is designed to test. Once the criterion measures have been identified, people's scores on those measures can be compared to their scores on the test, and the degree of correspondence can be examined for what it tells us about the validity of the test.
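The "degree of correspondence" described above is typically a correlation coefficient. A minimal sketch, assuming hypothetical test scores and teacher ratings as the chosen criterion measure:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: five students' test scores, and teacher
# ratings (1-10) serving as the criterion measure.
test_scores     = [72, 85, 60, 90, 78]
teacher_ratings = [7, 9, 6, 9, 8]

r = pearson_r(test_scores, teacher_ratings)
# A coefficient close to 1 indicates the test scores correspond
# closely to the criterion, supporting the test's validity.
```

A coefficient near zero would suggest the test is not measuring the attribute the criterion reflects.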


FACTORS IN THE TEST:

Unclear directions to students to respond to the test.

The difficulty of reading vocabulary and sentence structure.

Too easy or too difficult test items.

Ambiguous statements in the test items.

Inappropriate test items for measuring a particular outcome.

Inadequate time provided to take the test.

The length of the test is too short.

Test items not arranged in order of difficulty.


FACTORS IN THE TEST ADMINISTRATION AND SCORING:

Unfair aid to individual students who ask for help.

Cheating by students during testing.

Unreliable scoring of essay type answers.

Insufficient time to complete the test.

Adverse physical and psychological condition at the time of testing.


 FACTORS RELATED TO STUDENTS:

Test anxiety of the students.

Physical and Psychological state of the student.


In short, a valid test measures what its composer intends it to measure. It tests what it ought to test.


Example: The test which measures control of grammar should have no difficult lexical items.




ADMINISTRABILITY

Abitha - 2113312005002


Administering a test

Once the items, directions, and answer key have been written, the teacher should consider in advance the manner in which the test will be presented. Factors such as duplication, visual aids, and use of the blackboard should be considered beforehand to ensure clarity of presentation and to avoid technical difficulties.


Establish Classroom Policy

Because discipline is a major factor in test administration, the teacher must establish a classroom policy concerning such matters as tardiness, absences, make-ups, leaving the room, and cheating. The teacher must also advise students of procedural rules such as:


° What to do if they have any questions.


° What to do when they are finished taking the test.


° What to do if they run out of paper, need a new pen, etc.


° What to do if they run out of time.


Time required for administration

Appropriate time must be provided to take the test; if the time is reduced, the reliability of the test is also reduced. A safe procedure is to allocate as much time as the test requires to yield reliable and valid results. Between 20 and 60 minutes is a fairly good duration for each individual score yielded by a published test.

The teacher should always be aware of the effect of testing conditions on testing outcomes. Physical shortcomings should be alleviated wherever possible. If some students cannot see the blackboard, they should be allowed to move to a better location. If students are cramped into benches, more benches should be brought in and the students spread out. If this is not possible, two separate versions of the test can be written and distributed to students on an alternating basis.


Similarly, psychological conditions can inhibit optimal performance. Factors such as motivation, test anxiety, temporary states (everyone has a bad day once in a while), and long-term changes can profoundly affect test takers and therefore their performance on the test. It is therefore the teacher's responsibility to establish an official, yet not oppressive, atmosphere in the testing room to maximize student performance.












TOOLS OF EVALUATION 

Aananda bairavi - 2113312005003


There are many tools or instruments used in the evaluation process. Some of these tools are briefly discussed here:


Questionnaire

The most commonly used method of evaluation is the questionnaire, in which an individual attempts answers in writing on paper. It is generally self-administered: the person goes through the questionnaire and responds as per the instructions. It is considered the most cost-effective tool of evaluation in terms of administration. While developing a questionnaire, the teacher should ensure that it is simple, concise, and clearly stated. Evaluation done with the help of a questionnaire is quantitative.


Interview 

The interview is the second most important technique used for evaluation, in which the students participating in the evaluation are interviewed. An interview can yield information both quantitatively and qualitatively, and can be conducted in a group or individually. It is a time-consuming process; therefore it should be arranged at the convenience of the interviewer and interviewee. It can also be used to evaluate a programme at the time of a student's exit, in what is called an exit interview. The interview should be held in a quiet room, and the information obtained should be kept confidential. An interview guide, an objective guideline to be followed by the interviewer, can be created.


Observations

Observation is the direct visualization of the activity performed by the student. It is very useful in assessing students' performance and finding out how many skills they have attained. Observations need to be recorded simultaneously; if recording is delayed, some important points of the observation may be missed. There is scope for subjectivity in observation, which can be overcome by developing objective criteria. Students should also be aware of the criteria, so that they can prepare themselves accordingly and their anxiety levels remain controlled. The teacher should also prepare in order to ensure a fair assessment.


Rating Scale 

A rating scale is another tool of assessment in which the performance of the student is measured on a continuum. A rating scale lends objectivity to the assessment. Later, grades can be given to students based on their performance on the rating scale.


Checklist 

A checklist is a two-dimensional tool used to assess the behavior of the student for its presence or absence. The teacher can evaluate the performance of the student with a detailed checklist of items and well-defined, well-developed criteria. The checklist is an important tool for evaluating students' performance in the clinical area. The steps required to complete a procedure can be listed in sequential order, which helps the teacher check whether each required action is carried out. It is an important tool used in both summative and formative assessment.


Attitude Scale

An attitude scale measures the feelings of the students at the time of answering the questions; the Likert scale is the most popular. An attitude scale contains a group of statements (usually 10-15) that reflect opinion on a particular issue. The participant (student) is asked the degree to which he or she agrees or disagrees with each statement. Usually, a five-point Likert scale is used to assess the attitude of the student. To avoid any kind of bias, an equal number of positively and negatively framed statements is included.
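Because the scale mixes positively and negatively framed statements, the negative ones must be reverse-coded before totalling. A minimal sketch, with hypothetical ratings:

```python
# Scoring a five-point Likert attitude scale.  Negatively framed
# statements are reverse-coded so a high total always means a
# favourable attitude.  The ratings below are hypothetical.

POINTS = 5  # 1 = strongly disagree ... 5 = strongly agree

def score_response(rating, positively_framed):
    """Reverse-code ratings given to negatively framed statements."""
    return rating if positively_framed else POINTS + 1 - rating

# (rating given by the student, whether the statement was positive)
responses = [(5, True), (4, True), (2, False), (1, False)]

total = sum(score_response(r, pos) for r, pos in responses)
# 5 + 4 + (6 - 2) + (6 - 1) = 18 out of a possible 20
```

Without reverse-coding, agreement with negative statements would cancel out agreement with positive ones and the total would be meaningless.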


Semantic Differential 

Another scale used to measure the attitude of the student is the semantic differential. This tool contains bipolar adjective pairs like good-bad, rich-poor, positive-negative, active-passive, etc. The number of intervals between the two adjectives is usually odd, like five or seven, so that the middle position represents a neutral attitude.


Self-Report or Diary

A self-report or diary is a narrative record maintained by the student, reflecting his or her critical thoughts after careful observation. It can be a one-time assignment or a regular assignment. A regular assignment is maintained in a spiral book, which can be evaluated on a daily, weekly, monthly or semester basis. A self-report or diary helps in improving an existing programme or constructing a new one based on the reports submitted by the students.


Anecdotal Notes

An anecdotal record is a note maintained by the teacher on the performance or behavior of a student during clinical experience. It proves to be a very valuable tool for both formative and summative evaluation of the student's performance. It is written soon after the occurrence of the event. As an assessment done on a continuous basis, it allows the student to be judged fairly. It is the duty of the teacher to give feedback to the student.
