System and method for interactive scoring of standardized test responses

Education and demonstration – Question or problem eliciting response


Details

C434S323000, C434S350000

Reexamination Certificate

active

06234806


FIELD OF INVENTION
The present invention relates to computer based test scoring systems. More particularly, the present invention relates to a system and method for interactively scoring standardized test responses.
BACKGROUND OF THE INVENTION
For many years, standardized tests have been administered to examinees for various reasons, such as educational testing or evaluating particular skills. For instance, academic skills tests (e.g., SATs, LSATs, GMATs) are typically administered to a large number of students. Results of these tests are used by colleges, universities, and other educational institutions as a factor in determining whether an examinee should be admitted to study at that institution. Other standardized testing is carried out to determine whether an individual has attained a specified level of knowledge, or mastery, of a given subject. Such testing is referred to as mastery testing (e.g., achievement tests offered to students in a variety of subjects, the results of which are used for college entrance decisions).
FIG. 1 depicts a sample question and sample directions which might be given on a standardized test. The stem 4, the stimulus 5, the responses 6, and the directions 7 for responding to the stem 4 are collectively referred to as an item. The stem 4 refers to a test question or statement to which an examinee (i.e., the individual to whom the standardized test is being administered) is to respond. The stimulus 5 is the text and/or graphical information (e.g., a map, scale, graph, or reading passage) to which a stem 4 may refer. Often the same stimulus 5 is used with more than one stem 4. Some items do not have a stimulus 5. Items having a common stimulus 5 are defined as a set. Items sharing common directions 7 are defined as a group. Thus, questions 8-14 in FIG. 1 are part of the same group.
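To make the terminology above concrete, the following sketch shows one way an item and its parts could be represented in software. It is purely illustrative and not part of the patent; the class and function names (Stimulus, Item, same_set, same_group) are assumptions.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Stimulus:
    # Text and/or graphical information (e.g., a map, graph, or reading passage).
    content: str

@dataclass
class Item:
    stem: str                            # question or statement the examinee responds to
    responses: List[str]                 # the answer choices offered for the stem
    directions: str                      # instructions for responding to the stem
    stimulus: Optional[Stimulus] = None  # may be shared with other items, or absent

def same_set(a: Item, b: Item) -> bool:
    # Items having a common stimulus are defined as a set.
    return a.stimulus is not None and a.stimulus is b.stimulus

def same_group(a: Item, b: Item) -> bool:
    # Items sharing common directions are defined as a group.
    return a.directions == b.directions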
A typical standardized answer sheet for a multiple choice exam is shown in FIG. 2. The examinee is required to select one of the responses according to the directions provided with each item and fill in the appropriate circle on the answer sheet. For instance, the correct answer to the question 13 stated by stem 4 is choice (B) of the responses 6. Thus, the examinee's correct response to question 13 is to fill in the circle 8 corresponding to choice (B) as shown in FIG. 2.
Standardized tests with answer sheets as shown in FIG. 2 can be scored by automated scoring systems quickly, efficiently, and accurately. Since an examinee's response to each item is represented on an answer sheet simply as a filled-in circle, a computer can be easily programmed to scan the answer sheet and to determine the examinee's response to each item. Further, since there is one, and only one, correct response to each item, the correct responses can be stored in a computer database and the computer can be programmed to compare the examinee's response against the correct response for each item, determine the examinee's score for each item, and, after all items have been scored, determine the examinee's overall score for the test.
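The scoring procedure just described reduces to a per-item lookup against a stored answer key. A minimal sketch of that logic follows, assuming the scanned responses and the key are available as simple mappings; the function and variable names are illustrative only, not taken from the patent.

def score_multiple_choice(responses, answer_key):
    # responses:  item number -> choice the examinee filled in (e.g., "B")
    # answer_key: item number -> the single correct choice for that item
    item_scores = {item: int(responses.get(item) == correct)
                   for item, correct in answer_key.items()}
    return item_scores, sum(item_scores.values())

# Example: the examinee filled in circle (B) for question 13, matching the key.
item_scores, total = score_multiple_choice({13: "B"}, {13: "B"})
print(item_scores, total)   # {13: 1} 1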
In recent years, the demand for more sophisticated test items has forced test administrators to move away from standardized tests with strictly multiple choice responses and paper answer sheets. Architectural skills, for instance, cannot be examined adequately using a strictly multiple choice testing format. Test administrators have determined that examining such skills adequately requires standardized tests that ask the examinee to draft a representative architectural drawing in response to a test question. Such a response might, for example, be developed on a computer-aided design (CAD) facility.
Such tests have frustrated the ability of computers to efficiently and accurately score examinees' responses. While an architectural drawing, for example, may contain some objective elements, its overall value as a response to a particular test question is measured to some degree subjectively. Thus, a computer can no longer simply scan in an examinee's responses and compare them to known responses in a database.
Initially, these tests were scored by human test evaluators who viewed the examinee's responses as a whole and scored the responses on a mostly subjective basis. This approach is time consuming and subjective: two examinees could submit exactly the same response to a particular item and still receive different scores depending on which test evaluator scored the response. A particular test evaluator might even assess different scores at different times for the same response.
Recently, computer systems have been developed that evaluate the examinee's responses more quickly, efficiently, and objectively. These systems use scoring engines programmed to identify certain features expected to be contained in a correct response. The various features are weighted according to their relative importance in the response. For example, one element of a model response to a particular item in an architectural aptitude test might be a vertical beam from four to six feet in length. The scoring engine for that item will determine whether the beam is in the examinee's response at all (one feature) and, if it is, whether it is vertical (a second feature) and whether it is between four and six feet in length (a third feature). If the beam is not in the response at all, the scoring engine might be programmed to give the examinee no credit at all for the response to that item. A feature such as this which is so critical to the response that the absence of the feature would be deemed a fatal error in the response is referred to as a fatal feature. If, for example, the beam is present and vertical, but is less than four feet long, the scoring engine might be programmed to give the examinee full credit for the existence of the beam, full credit for the fact that the beam is vertical, but no credit for the fact that the beam is less than four feet long. Since the length of the beam is deemed not to be critical to the response in this example, the examinee still receives partial credit for the response to the item. Such a feature is referred to as a non-fatal feature. Thus, the scoring engine determines the existence of all of the features expected in the response for a given item, assesses a score for each feature present, and then adds up the weighted feature scores to determine the item score. When all the items for a particular test for a given examinee have been scored accordingly, the system assesses an overall test score.
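The weighted-feature rubric described above can be sketched as follows. This is an illustrative reconstruction rather than the patent's actual scoring engine; the Feature class, the particular weights, and the check functions are assumptions, while the fatal/non-fatal beam example follows the text.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Feature:
    name: str
    check: Callable[[dict], bool]  # returns True if the feature is present in the response
    weight: float
    fatal: bool = False            # a missing fatal feature zeroes the item score

def score_item(response: dict, features: List[Feature]) -> float:
    # Determine the existence of each expected feature, credit its weight if present,
    # and give no credit at all for the item if a fatal feature is absent.
    score = 0.0
    for f in features:
        present = f.check(response)
        if f.fatal and not present:
            return 0.0
        if present:
            score += f.weight
    return score

# Beam example: existence is fatal; orientation and length are non-fatal.
beam_features = [
    Feature("beam exists",   lambda r: "beam" in r,                                  2.0, fatal=True),
    Feature("beam vertical", lambda r: r.get("beam", {}).get("vertical", False),     1.0),
    Feature("beam 4-6 ft",   lambda r: 4 <= r.get("beam", {}).get("length", 0) <= 6, 1.0),
]

# A vertical beam only three feet long: credit for existence and orientation,
# none for length, so the examinee still receives partial credit for the item.
print(score_item({"beam": {"vertical": True, "length": 3}}, beam_features))  # 3.0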
Separately, a human test evaluator can score an examinee's response(s) to a particular item, or to a group of items, or to a whole test. Once the computer has finished scoring the test, a test evaluator may then compare the computer generated score to the score assessed by the test evaluator. If the test evaluator disagrees with the computer generated score for a particular item, the test evaluator is forced to change the score for that test manually.
Thus, one problem with the current computer-based scoring systems is that these systems are batch systems and provide no mechanism for a test evaluator to change the computer generated score online (i.e., to interact with the computer to change the score of a particular item as soon as the computer has scored that item rather than having to wait for the computer to score all the items of a test).
Additionally, a test evaluator might determine that the scoring rubric for an item is flawed and that the scoring engine that applies the flawed rubric needs to be changed. Scoring engines are currently changed in one of two ways, depending on the complexity of the change required. If the test evaluator wishes to change only one or more criteria (e.g., the beam in the above architectural test example should be from five to six feet long instead of from four to six feet long), then a change can be effected by changing the criterion in a file called by the scoring engine.
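To illustrate the simpler of the two kinds of change, where a criterion lives in a file the scoring engine reads rather than in the engine's own code, a hedged sketch follows; the file name, format, and keys are assumptions made for illustration only.

import json

# criteria.json (hypothetical) might externalize the rubric parameters, e.g.:
#   {"beam_min_length": 4, "beam_max_length": 6}
# Changing the acceptable beam length from four-to-six feet to five-to-six feet
# then means editing this file, not reprogramming the scoring engine.

def beam_length_ok(length, path="criteria.json"):
    with open(path) as f:
        criteria = json.load(f)
    return criteria["beam_min_length"] <= length <= criteria["beam_max_length"]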
