Bakeoff 2013: Chinese Spelling Check

Data sets and the evaluation tool are publicly released.

  Please download the package at


  • Shih-Hung Wu, Chaoyang University of Technology
  • Chao-Lin Liu, National Chengchi University 
  • Lung-Hao Lee, National Taiwan University

    Participants must complete the following steps to register:

    1. Download the registration form.

    2. Fill in the form and sign it.

    3. Send the scanned file to Lung-Hao Lee (



   Spelling check, the automatic detection and correction of human writing errors, is a common task in every written language. However, spelling check in Chinese is very different from that in English or other alphabetic languages. There are no delimiters between words, and each word is very short: usually one to three characters. Error detection is therefore a hard problem: it must be done within a context, such as a sentence or a long phrase with a certain meaning, and cannot be done within a single word. Once an error is identified, correcting it is feasible because most errors involve phonologically or visually similar characters [1]. Several previous studies have addressed the spelling check problem, but until now there has been no commonly available data set for Chinese spelling check. The goal of this task is to provide a common evaluation data set so that application developers can compare their error detection and correction rates.

    In this bake-off, the evaluation includes two sub-tasks: error detection and error correction. The errors are collected from students’ written essays. Since there are fewer than two errors per essay [2], the distribution of incorrect characters in the first sub-task will match the real-world error distribution. The first sub-task evaluates error detection; some input sentences may contain no errors, so that the false-alarm rate of a system can be measured [3]. The second sub-task evaluates error correction; each sentence includes at least one error. Together, these two sub-tasks cover the complete function of a spelling checker. Participants may submit results for either sub-task or for both.


Task Description

The goal of this task is to evaluate the ability of a system to perform Chinese spelling check. The task is divided into two sub-tasks:
Sub-Task 1: Error Detection
     For the error detection sub-task, the input consists of complete Chinese sentences with or without spelling errors, and the system should return the locations of the incorrect characters. Each character or punctuation mark counts as one position. The error detection problem is a yes/no question plus the locations of the errors. If the input contains no spelling errors, the system should return: NID, 0. If the input contains at least one spelling error, the output format is: NID, error_location [, error_location]*
        Example 1  
          Input : (NID=99999) 在我的人生中沒有風災大浪,但我看過許多勇敢的人,不怕折的奮鬥,這種精神值得我們學習。
           Output :  99999, 27
        Example 2  
          Input : (NID=88888) 擁有六百一十年歷史的禮門,象著南韓人的精神,在一夕之,被火燒得精光。
           Output :  88888, 16, 29
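    The output format described above can be sketched in a few lines of Python. This is only an illustration, not the official evaluation tool: the function names are hypothetical, and the NID 77777 used in the usage note is made up.

```python
def format_detection(nid, locations):
    """Build a Sub-Task 1 result line.

    Returns 'NID, 0' when no error is reported, otherwise
    'NID, loc[, loc]*' with 1-based character positions.
    """
    if not locations:
        return f"{nid}, 0"
    return ", ".join([nid] + [str(loc) for loc in sorted(locations)])


def parse_detection(line):
    """Parse a result line back into (nid, set_of_locations)."""
    fields = [field.strip() for field in line.split(",")]
    nid = fields[0]
    locations = {int(field) for field in fields[1:]}
    locations.discard(0)  # a lone 0 means "no spelling error"
    return nid, locations
```

    For instance, format_detection("88888", [29, 16]) gives "88888, 16, 29", and parse_detection("77777, 0") gives an empty location set.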
    Performance Metrics
  • False-Alarm Rate = # of sentences with false positive error detection results / # of testing sentences without errors
  • Detection Accuracy = # of sentences with correctly detected results / # of all testing sentences
  • Detection Precision = # of sentences with correctly detected errors / # of sentences the system reports as containing errors
  • Detection Recall = # of sentences with correctly detected errors / # of testing sentences with errors
  • Detection F-Score = ( 2 * Detection Precision * Detection Recall ) / ( Detection Precision + Detection Recall )
  • Error Location Accuracy = # of sentences with correct location detection / # of all testing sentences
  • Error Location Precision = # of sentences with correct error locations / # of sentences the system reports as containing errors
  • Error Location Recall = # of sentences with correct error locations / # of testing sentences with errors
  • Error Location F-Score = ( 2 * Error Location Precision * Error Location Recall ) / ( Error Location Precision + Error Location Recall )
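    The metrics above are all sentence-level ratios, so they can be computed from two mappings of NID to reported locations. The sketch below is one possible reading, not the released evaluation tool. It assumes that a sentence counts as "correctly located" only when the reported location set equals the gold set exactly, and that error-free sentences left unflagged count as correct for both accuracy figures. The function name and the dictionary layout are hypothetical.

```python
def detection_metrics(gold, system):
    """gold, system: dict mapping NID -> set of 1-based error locations.

    An empty set means "no error". NIDs missing from `system` are
    treated as "no error reported" (an assumption of this sketch).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    def f_score(precision, recall):
        return ratio(2 * precision * recall, precision + recall)

    reported = {nid: system.get(nid, set()) for nid in gold}
    with_errors = [nid for nid in gold if gold[nid]]
    without_errors = [nid for nid in gold if not gold[nid]]
    flagged = [nid for nid in gold if reported[nid]]

    false_alarms = sum(1 for nid in without_errors if reported[nid])
    true_negatives = len(without_errors) - false_alarms
    # Sentence-level hit: the gold sentence has an error and the system flags it.
    detected = sum(1 for nid in with_errors if reported[nid])
    # Location-level hit: the reported locations match the gold locations exactly.
    located = sum(1 for nid in with_errors if reported[nid] == gold[nid])

    d_precision = ratio(detected, len(flagged))
    d_recall = ratio(detected, len(with_errors))
    l_precision = ratio(located, len(flagged))
    l_recall = ratio(located, len(with_errors))
    return {
        "false_alarm_rate": ratio(false_alarms, len(without_errors)),
        "detection_accuracy": ratio(detected + true_negatives, len(gold)),
        "detection_precision": d_precision,
        "detection_recall": d_recall,
        "detection_f_score": f_score(d_precision, d_recall),
        "location_accuracy": ratio(located + true_negatives, len(gold)),
        "location_precision": l_precision,
        "location_recall": l_recall,
        "location_f_score": f_score(l_precision, l_recall),
    }
```

    Under these assumptions, a system that flags every sentence reaches full detection recall but a false-alarm rate of 1, which is why the two figures are reported together.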
Sub-Task 2: Error Correction
   For the error correction sub-task, the input texts are complete Chinese sentences with spelling errors. The system should return the locations of the incorrect characters and must also give the correct characters. The error correction problem is a follow-up to error detection for sentences with errors. Since the input contains at least one spelling error, the output format is: NID[, error_location, correction]+.  
        Example 1  
          Input : (NID=99999) 在我的人生中沒有風災大浪,但我看過許多勇敢的人,不怕折的奮鬥,這種精神值得我們學習。
           Output :  99999, 27, 挫
        Example 2  
          Input : (NID=88888) 擁有六百一十年歷史的禮門,象著南韓人的精神,在一夕之,被火燒得精光。
           Output :  88888, 16, 徵, 29, 間
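    A minimal Python sketch of the format NID[, error_location, correction]+ follows. The function names are hypothetical, and the sketch assumes that fields are separated by ASCII commas and that a correction is never itself a comma.

```python
def format_correction(nid, corrections):
    """Build a Sub-Task 2 result line.

    corrections: dict mapping a 1-based location to the corrected character.
    """
    fields = [nid]
    for location in sorted(corrections):
        fields += [str(location), corrections[location]]
    return ", ".join(fields)


def parse_correction(line):
    """Parse a result line into (nid, {location: corrected_character})."""
    fields = [field.strip() for field in line.split(",")]
    nid, rest = fields[0], fields[1:]
    if not rest or len(rest) % 2:
        raise ValueError("expected one or more (location, correction) pairs")
    return nid, {int(loc): char for loc, char in zip(rest[0::2], rest[1::2])}
```

    A line with an unpaired field, such as a location without a correction, raises ValueError instead of being silently accepted.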
     Performance Metrics
  • Location Accuracy = # of sentences with correctly detected error locations / # of all testing sentences
  • Correction Accuracy = # of sentences with correctly corrected errors / # of all testing sentences
  • Correction Precision = # of sentences with correctly corrected errors / # of sentences for which the system returns corrections
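    The three correction metrics can be sketched in the same way. This is an illustration under stated assumptions, not the released evaluation tool: a sentence counts as correctly located when the reported locations equal the gold locations exactly, and as correctly corrected when every location and every corrected character match. The function name and data layout are hypothetical.

```python
def correction_metrics(gold, system):
    """gold, system: dict mapping NID -> {1-based location: corrected character}.

    Every gold sentence is assumed to contain at least one error.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    reported = {nid: system.get(nid, {}) for nid in gold}
    # Sentences for which the system returned at least one correction.
    answered = sum(1 for nid in gold if reported[nid])
    # Same locations as the gold standard, whatever characters were proposed.
    located = sum(1 for nid in gold if set(reported[nid]) == set(gold[nid]))
    # Same locations and same corrected characters.
    corrected = sum(1 for nid in gold if reported[nid] == gold[nid])

    return {
        "location_accuracy": ratio(located, len(gold)),
        "correction_accuracy": ratio(corrected, len(gold)),
        "correction_precision": ratio(corrected, answered),
    }
```

    Correction Precision differs from Correction Accuracy only in its denominator, so a system that answers few sentences but answers them well scores high on precision and low on accuracy.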

Data Sets

    Test data: we provide one test set for each sub-task. Each set contains 1000 Chinese texts selected from students’ essays and covering various common errors. The evaluation is an open test: participants may employ any linguistic and computational resources for detection and correction.


   We provide the Sample Set and Similar Character Set (abbrev. Bakeoff 2013 CSC Datasets) for this evaluation.  

    (1). Sample  set: the samples will be selected from students’ essays. The data will be released in XML format.

            Example :  

                    <DOC Nid="00018">
                    <p>有些人會拿這次的教訓來勉勵自己,好讓自己在打混摸魚時警悌,使自己比以前更好、更進步。 </p>
                        <MISTAKE wrong_position=28>
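
    The wrong_position attribute in the example is not quoted, so a strict XML parser would reject the record. A tolerant reader based on regular expressions is one way around this. The Python sketch below is only an illustration: it covers just the fields visible in the example (Nid, the <p> text, and wrong_position), and the function and field names are hypothetical.

```python
import re

# One record: the DOC header, the sentence in <p>...</p>, then everything
# up to the next <DOC ...> header or the end of the input.
RECORD = re.compile(
    r'<DOC\s+Nid="(?P<nid>[^"]+)">\s*'
    r'<p>(?P<text>.*?)</p>'
    r'(?P<rest>.*?)(?=<DOC\s|\Z)',
    re.DOTALL,
)
# Accept the attribute value with or without quotation marks.
POSITION = re.compile(r'<MISTAKE\s+wrong_position="?(\d+)"?')


def read_samples(raw):
    """Return one dict per record with the NID, text, and error positions."""
    samples = []
    for match in RECORD.finditer(raw):
        text = match.group("text").strip()
        positions = [int(p) for p in POSITION.findall(match.group("rest"))]
        samples.append({
            "nid": match.group("nid"),
            "text": text,
            "positions": positions,
            # Positions are 1-based, so subtract one to index the string.
            "flagged": [text[p - 1] for p in positions],
        })
    return samples
```

    Applied to the record above, the reader reports position 28 and the flagged character 悌, which matches the counting rule of one position per character or punctuation mark.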






    (2). Similar Character Set:  the set of Chinese characters with similar shapes or pronunciations is useful for this task. 


                    Similar Shape: 可, 何呵坷奇河柯苛阿倚寄崎荷蚵軻

                    Similar  Pronunciation: 右, 幼鼬誘宥柚祐有侑莠又囿佑釉

         Please cite the following paper when using this data set: Chao-Lin Liu, Min-Hua Lai, Kan-Wen Tien, Yi-Hsuan Chuang, Shih-Hung Wu, and Chia-Ying Lee. Visually and phonologically similar characters in incorrect Chinese words: Analyses, identification, and applications. ACM Transactions on Asian Language Information Processing, 10(2), 10:1-39. Association for Computing Machinery, USA, June 2011.
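
    One way to use the similar character set is to generate candidate replacements for a suspicious character. The Python sketch below assumes each entry has the form shown in the examples above (a key character, a comma, then its similar characters); the layout of the released files may differ. The function names are hypothetical, and ranking the candidates (for example with a language model) is outside the scope of the sketch.

```python
def load_similar_set(lines):
    """Map each key character to the string of characters similar to it."""
    table = {}
    for line in lines:
        if "," not in line:
            continue  # skip blank or malformed lines
        key, similar = line.split(",", 1)
        table[key.strip()] = similar.strip()
    return table


def candidates(sentence, location, *tables):
    """Yield copies of `sentence` with the character at the 1-based
    `location` replaced by each of its similar characters."""
    original = sentence[location - 1]
    for table in tables:
        for replacement in table.get(original, ""):
            yield sentence[:location - 1] + replacement + sentence[location:]
```

    For example, with the similar-shape entry for 可 loaded, candidates("可以", 1, table) yields 何以, 呵以, and so on, one string per similar character.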


Bake-off Reports

    Each participant must submit an evaluation report to describe the spelling checker and its testing results. Please follow the SIGHAN-7 template ( to prepare the report. Your report is limited to five pages. Non-conforming submissions will not be considered for review. All submitted reports that conform to the specified length and formatting requirements will be included in the SIGHAN-7 proceedings.  At least one author of each accepted report will be required to register for the workshop to present the developed system. This is the most valuable part of participation, as authors will be able to engage workshop attendees in extended conversations about their work.


Important Dates

  • Registration for bake-off open: May 20, 2013 
  • CSC Datasets released open: May 31, 2013 
  • Registration for bake-off deadline: July 1, 2013
  • CSC Datasets released deadline: July 5, 2013 
  • Dry run (format validation) data released: July 15, 2013
  • Dry run submission deadline: July 26, 2013
  • Test data released: July 31, 2013
  • Test results submission deadline: August  2, 2013
  • Test results evaluation released: August  4, 2013
  • Bake-off report submission deadline: August 16, 2013
  • Bake-off  report reviews returned: August 20, 2013
  • Camera-ready submission deadline: August 23, 2013
  • Main Workshop: October 14, 2013

    We thank Liang-Pu Chen and Ping-Che Yang, research engineers at the Institute for Information Industry, Taiwan, for providing the students’ essays used in this Chinese Spelling Check task.



[1]  Chao-Lin Liu, Min-Hua Lai, Kan-Wen Tien, Yi-Hsuan Chuang, Shih-Hung Wu, and Chia-Ying Lee (2011). Visually and phonologically similar characters in incorrect Chinese words: Analyses, identification, and applications, ACM Trans. Asian Lang. Inform. Process. 10, 2, Article 10 (June 2011), 39 pages.

[2]  Yong-Zhi Chen, Shih-Hung Wu, Ping-che Yang, Tsun Ku, and Gwo-Dong Chen (2011). Improve the detection of improperly used Chinese characters in students’ essays with error model. Int. J. Cont. Engineering Education and Life-Long Learning, Vol. 21, No. 1, pp.103-116, 2011.

[3]  Shih-Hung Wu, Yong-Zhi Chen, Ping-che Yang, Tsun Ku, and Chao-Lin Liu (2010). Reducing the False Alarm Rate of Chinese Character Error Detection and Correction, Proceedings of CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP 2010), pages 54–61, Beijing, 28-29 Aug., 2010.