Int J Performability Eng, 2023, Vol. 19, Issue (6): 388-396. doi: 10.23940/ijpe.23.06.p4.388396

Ensemble Learning for Appraising English Text Readability using Gompertz Function

Rakesh Kumar(a), Sunny Arora(b), Ashima Arya(c), Neha Kohli(d), Vaishali Arya(d), and Ekta Singh(e,*)

    (a) Revenue Department, Medline Industries India Private Limited, Pune, India;
    (b) Department of Computer Science and Engineering, SRM University, Delhi-NCR, Sonepat (Haryana), India;
    (c) Department of Computer Science and Information Technology, KIET Group of Institutions, Delhi-NCR, Ghaziabad, India;
    (d) Department of Computer Science and Engineering, GD Goenka University, Sohna (Gurgaon), India;
    (e) Department of Humanities and Social Sciences, Jaypee Institute of Information Technology, Noida, India
  • Contact: * E-mail address: ms.shivani.batra@gmail.com

Abstract: Text readability is crucial to meeting individuals' information needs, and the need to assess it is growing with the enormous increase in contemporary content. This study proposes an ensemble learning approach that uses the Gompertz function to assess the readability of English text in terms of word, sentence, and text structure. Conventional approaches to measuring the readability of English text depend heavily on the ability of human experts to identify features by hand, which restricts their applicability. Given the diversity and volume of text in use and the number of readability features that must be extracted, manually identifying deep features becomes increasingly difficult, and it is easy to introduce redundant or irrelevant features that reduce the effectiveness of the framework. The authors experimented with 25,000 English sentences, which were classified by Flesch-Kincaid and annotated into seven distinct readability categories. The proposed ensemble-based model employs five machine learning models as its base classifiers. The results produced by the proposed model are strong and reliable: on the test set, it achieved an accuracy, precision, recall, and F-score of 90.58%, 0.9545, 0.9467, and 0.9506, respectively. The model may be applied in educational settings for tasks such as language acquisition and evaluating an individual's reading and writing skills.

Key words: classification, ensemble learning, Gompertz function, machine learning, readability
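
The abstract does not spell out how the Gompertz function combines the base classifiers, so the following is only a minimal sketch of one plausible scheme, not the authors' implementation. It assumes a re-parameterized Gompertz function, 1 - exp(-exp(-2p)), that turns each classifier's confidence p into a penalty (high confidence gives a low penalty), sums the penalties per class across classifiers, and picks the class with the smallest total. The function names `gompertz_penalty` and `fuse`, the constant 2.0, and the toy example with three classifiers and three classes are all illustrative assumptions; the study itself uses five base classifiers and seven readability categories.

```python
import math


def gompertz_penalty(p: float) -> float:
    """Map a confidence p in [0, 1] to a penalty; higher p yields a lower penalty.

    Assumed form: 1 - exp(-exp(-2p)), a re-parameterized Gompertz curve.
    """
    return 1.0 - math.exp(-math.exp(-2.0 * p))


def fuse(prob_rows):
    """Fuse per-classifier class probabilities for a single sample.

    prob_rows: one list of class probabilities per base classifier.
    Returns (index of the winning class, per-class penalty totals).
    """
    n_classes = len(prob_rows[0])
    totals = [0.0] * n_classes
    for probs in prob_rows:
        if len(probs) != n_classes:
            raise ValueError("every classifier must score the same classes")
        for k, p in enumerate(probs):
            totals[k] += gompertz_penalty(p)
    # The class with the smallest accumulated penalty wins.
    best = min(range(n_classes), key=totals.__getitem__)
    return best, totals


# Toy example: three hypothetical classifiers scoring three classes.
rows = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
best_class, penalties = fuse(rows)
print(best_class, [round(t, 3) for t in penalties])
```

In this toy case two of the three classifiers favor class 0, and the dissenting classifier is not confident enough to overturn them, so class 0 receives the smallest total penalty. Unlike plain majority voting, the nonlinear mapping lets a classifier's degree of confidence, and not only its top choice, influence the fused decision.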