Feature Based Classification for Automated Essay Scoring


Muhaimin Hading
Muhammad Ikhwan Burhan
Andi Nurfadillah Ali
A. Inayah Auliyah
Mardhiyyah Rafrin
Arliyanti Nurdin
Wiwit Melayu

Abstract

Essays play a crucial role in traditional assessments, but evaluating them accurately, efficiently, and fairly poses a major challenge for educators. Automated Essay Scoring (AES) aims to address this issue by leveraging computational techniques to support teachers in the grading process. This study explores a feature-based classification model that predicts essay scores from engineered features. We incorporate additional features aligned with the ASAP 2.0 scoring rubric, such as Lexical Sophistication, Source Adherence, Novelty and Relevance, and Semantic Disruption. These features are used to construct a distributed representation of essays, which is then input into a Support Vector Machine (SVM) model for holistic score prediction. The proposed model achieved a Quadratic Weighted Kappa (QWK) score of 0.8397, indicating a high level of agreement with human raters. The results demonstrate the effectiveness of combining rubric-informed features with a non-linear classifier. The findings can be applied in educational settings, where the model can provide scalable and consistent scoring support, reduce grading workload for instructors, and deliver timely feedback to students. By aligning with rubric-based criteria, the approach can also foster more transparent and constructive learning processes, helping students identify specific areas for improvement in their writing. While the model exhibits strong predictive performance, it also presents limitations related to interpretability and generalizability, especially across diverse writing prompts and domains.
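The pipeline described in the abstract (engineered rubric features fed into a non-linear SVM, evaluated with Quadratic Weighted Kappa) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random feature matrix, the 1–6 score scale, and the RBF kernel are all assumptions made here for the sake of a runnable example; the paper's actual features are computed from ASAP 2.0 essays.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score

# Hypothetical feature matrix: each row holds one essay's engineered
# features (e.g. lexical sophistication, source adherence, novelty and
# relevance, semantic disruption), here simulated with random values.
rng = np.random.default_rng(0)
X_train = rng.random((200, 4))
y_train = rng.integers(1, 7, size=200)   # holistic scores, assumed 1-6 scale
X_test = rng.random((50, 4))
y_test = rng.integers(1, 7, size=50)

# Non-linear SVM classifier for holistic score prediction; the RBF
# kernel and default hyperparameters are illustrative assumptions.
model = SVC(kernel="rbf", C=1.0, gamma="scale")
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Quadratic Weighted Kappa: chance-corrected agreement between predicted
# and human scores, penalising large disagreements quadratically.
qwk = cohen_kappa_score(y_test, y_pred, weights="quadratic")
print(f"QWK = {qwk:.4f}")
```

On random features the QWK will hover near zero; the paper's reported 0.8397 reflects agreement achieved with its actual rubric-informed features on real essays.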


How to Cite
Hading, M., Burhan, M. I., Ali, A. N., Auliyah, A. I., Rafrin, M., Nurdin, A., & Melayu, W. (2025). Feature Based Classification for Automated Essay Scoring. JURNAL TEKNOLOGI DAN ILMU KOMPUTER PRIMA (JUTIKOMP), 8(2), 126–135. https://doi.org/10.34012/jutikomp.v8i2.7397

