Abstract

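To provide basic data for developing an automated essay scoring program applicable to Korea, this study introduces and compares four automated essay scoring programs developed and used abroad: C-rater, SEAR, E-rater, and IntelliMetric. The four programs share three features: first, they use analytic scoring; second, they use natural language processing to recognize students' responses; and third, they verify the validity of scoring by comparing the scores assigned by human raters with those assigned by the computer. However, the four programs raise problems concerning the basis for setting scoring criteria, the level and number of responses, the expertise of the raters who score the responses, the verification of scoring validity, limits on essay length, and the difficulty of scoring creative writing. Developing an automated essay scoring program requires natural language processing for Korean and cooperation among educational evaluation scholars, Korean language scholars, and engineers.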


Since the beginning of the 21st century, the paradigm of educational evaluation has undergone remarkable changes driven by constructivism. These changes have led to a growing demand for essay-type questions, one kind of supply-type question, because they suit constructivist evaluation: students can approach the questions freely and compose their answers freely. Essay-type questions also help assess a student's ability to analyze, criticize, organize, synthesize, create, and solve problems. On the other hand, they raise problems with the consistency and reliability of scoring and with the production and assessment of test instruments. This study compared four automated essay scoring programs developed in other countries (C-rater, SEAR, E-rater, and IntelliMetric) and analyzed their commonalities, differences, problems, and limitations. The programs have three things in common: first, analytic scoring is adopted; second, scoring is based on the way people score essays; and third, results from human scorers are compared with those from the programs for verification. The four programs also share problems concerning the establishment of valid criteria, the level and number of training essays, the expertise of those who score the training essays, the methods of verifying the automated scoring, restrictions on essay length, and the low scores given to creative and remarkable essays. The analysis and comparison led to the conclusion that C-rater, SEAR, E-rater, and IntelliMetric can be applied to Korean schools. Natural language processing (NLP) for Korean is helpful for developing automated Korean essay scoring systems. In addition, experts in education, the Korean language, and engineering should work together to develop such programs.