Feature Based Classification for Automated Essay Scoring


Muhaimin Hading
Muhammad Ikhwan Burhan
Andi Nurfadillah Ali
A. Inayah Auliyah
Mardhiyyah Rafrin
Arliyanti Nurdin
Wiwit Melayu

Abstract

Essays play a crucial role in traditional assessments, but evaluating them accurately, efficiently, and fairly poses a major challenge for educators. Automated Essay Scoring (AES) aims to address this issue by leveraging computational techniques to support teachers in the grading process. This study explores a feature-based classification model that predicts essay scores from engineered features. We incorporate additional features aligned with the ASAP 2.0 scoring rubric, such as Lexical Sophistication, Source Adherence, Novelty and Relevance, and Semantic Disruption. These features are used to construct a distributed representation of essays, which is then input into a Support Vector Machine (SVM) model for holistic score prediction. The proposed model achieved a Quadratic Weighted Kappa (QWK) score of 0.8397, indicating a high level of agreement with human raters. The results demonstrate the effectiveness of combining rubric-informed features with a non-linear classifier. The findings can be applied in educational settings, where the model can provide scalable and consistent scoring support, reduce grading workload for instructors, and deliver timely feedback to students. By aligning with rubric-based criteria, the approach can also foster more transparent and constructive learning processes, helping students identify specific areas for improvement in their writing. While the model exhibits strong predictive performance, it also presents limitations related to interpretability and generalizability, especially across diverse writing prompts and domains.
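The pipeline described in the abstract (engineered rubric features fed into a non-linear SVM, evaluated with Quadratic Weighted Kappa) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random feature matrix, the 1–6 score scale, and the RBF kernel are all assumptions made here for the sake of a runnable example; the paper's actual features are computed from ASAP 2.0 essays.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score

# Hypothetical feature matrix: each row holds one essay's engineered
# features (e.g. lexical sophistication, source adherence, novelty and
# relevance, semantic disruption), here simulated with random values.
rng = np.random.default_rng(0)
X_train = rng.random((200, 4))
y_train = rng.integers(1, 7, size=200)   # holistic scores, assumed 1-6 scale
X_test = rng.random((50, 4))
y_test = rng.integers(1, 7, size=50)

# Non-linear SVM classifier for holistic score prediction; the RBF
# kernel and default hyperparameters are illustrative assumptions.
model = SVC(kernel="rbf", C=1.0, gamma="scale")
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Quadratic Weighted Kappa: chance-corrected agreement between predicted
# and human scores, penalising large disagreements quadratically.
qwk = cohen_kappa_score(y_test, y_pred, weights="quadratic")
print(f"QWK = {qwk:.4f}")
```

On random features the QWK will hover near zero; the paper's reported 0.8397 reflects agreement achieved with its actual rubric-informed features on real essays.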


How to Cite
Hading, M., Burhan, M. I., Ali, A. N., Auliyah, A. I., Rafrin, M., Nurdin, A., & Melayu, W. (2025). Feature Based Classification for Automated Essay Scoring. JURNAL TEKNOLOGI DAN ILMU KOMPUTER PRIMA (JUTIKOMP), 8(2), 126–135. https://doi.org/10.34012/jutikomp.v8i2.7397

