Open Category
Entry ID
581
Participant Type
Team
Expected Stream
Stream 2: Identifying an educational problem and proposing a prototype solution.

Section A: Project Information

Project Title:
Grammatical Error Analysis for English Essay
Project Description (maximum 300 words):

The Grammatical Error Analysis for English Essay system is an AI-powered tool designed to help students enhance their English writing skills by providing automated, in-depth feedback on grammar and vocabulary errors. At the heart of the system is the combination of Natural Language Processing (NLP) and large language models (LLMs), which allows the system to detect, correct, classify, and explain errors in a student's essay. The system’s innovation lies in its fine-grained error analysis and a robust and validated error classification taxonomy, which offers precise and targeted corrections for various types of mistakes, such as verb tense, preposition misuse, and spelling errors.

Key design concepts include personalized learning experiences, where feedback is tailored to each student's proficiency level and the errors they consistently make. The system not only offers correction suggestions but also provides comprehensive explanations and educational feedback, helping students understand the underlying causes of their mistakes. This deep, structured feedback fosters independent learning, encouraging students to reflect on their errors and improve their language skills over time.

The potential impact of this project is significant in transforming the way students engage with English writing. It addresses the challenge of language proficiency by offering instant and detailed feedback that empowers learners to take control of their own improvement process. The system is scalable, adaptable, and sustainable, providing support to students from diverse backgrounds and contributing to equity and inclusion in education. It supports ongoing engagement by adjusting essay prompts and difficulty based on student progress, ensuring long-term development. Ultimately, the system enhances both learning outcomes and student confidence in their language abilities.


Section B: Participant Information

Personal Information (Team Member)
Title | First Name | Last Name | Organisation/Institution | Faculty/Department/Unit | Email | Phone Number | Contact Person / Team Leader
Dr. | Shen | Wang | Squirrel Ai Learning | AI R&D | swang224edu@gmail.com | +86 18068771610 | YES
Mr. | Jingheng | Ye | Tsinghua University | Shenzhen International Graduate School | jingheng.cs@gmail.com | +86 18122081584 |
Dr. | Qingsong | Wen | Squirrel Ai Learning | AI R&D | qingsongedu@gmail.com | +1 (425)520-1766 |

Section C: Project Details

Project Details
Please answer the questions from the perspectives below regarding your project.
1. Problem Identification and Relevance in Education (Maximum 300 words)

The motivation behind our project stems from observing the persistent challenge students face in mastering English grammar and vocabulary. English composition is a critical skill in education, but many students struggle with language errors that hinder their ability to express themselves clearly. These issues often arise from limited language proficiency, leading to frequent grammatical errors that affect students’ learning experiences and self-confidence. Recognizing this gap in helpful error analysis tools for students, we developed the Grammatical Error Analysis for English Essay system.

We hypothesize that an AI-driven solution that automatically detects and analyzes grammatical and vocabulary errors in English essays can help students improve their language skills more efficiently. Our system provides not only error identification and correction but also in-depth error descriptions and educational feedback, empowering students to reflect on their errors and improve their language proficiency over time. We expect the system to succeed because it combines Natural Language Processing (NLP) with large language models (LLMs) to provide accurate and insightful error analysis, making language learning more engaging and personalized. Additionally, the system fosters independent learning by giving students instant feedback, which shortens the gap between writing and revision and helps them improve their writing skills faster.

2a. Feasibility and Functionality (for Streams 1&2 only) (Maximum 300 words)

Our solution leverages cutting-edge Natural Language Processing (NLP) and large language models (LLMs) to automatically analyze students' English essays for grammar and vocabulary errors. NLP techniques and LLMs enable the system to detect, classify, and explain the errors. To support its development, we collected essay writing datasets for validation purposes. A research and teaching team ensures our system's accuracy and educational value.

To validate the market demand, we conducted surveys and interviews with educators, students, and language professionals to gauge interest in such a tool. We piloted the system with educators and students to gather real-world feedback on its effectiveness and usability.

The core functionality of the system is error analysis. The system automatically identifies various grammatical and vocabulary errors, including preposition misuse and spelling errors. It then provides a detailed analysis, including correction, error classification, severity level, description, and suggestions for improvement. To ensure a positive user experience, it gives accurate, relevant, and actionable feedback that helps students learn from their mistakes.
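As an illustration of the analysis described above, the following minimal Python sketch shows one way a single error record could be represented and applied to an essay. It is not the deployed implementation. The class and field names, the three-level severity scale, and the character-offset convention are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Hypothetical three-level scale; the form only states that each error
    # receives a severity level.
    MINOR = 1
    MODERATE = 2
    MAJOR = 3


@dataclass(frozen=True)
class ErrorAnalysis:
    # All field names are illustrative assumptions.
    start: int          # character offset where the error begins
    end: int            # character offset just past the error
    original: str       # erroneous text as written by the student
    correction: str     # proposed replacement
    error_type: str     # label from the error taxonomy
    severity: Severity
    description: str    # explanation of why this is an error
    suggestion: str     # advice for avoiding the error in future


def apply_corrections(text, analyses):
    """Return the text with every proposed correction applied."""
    # Work from right to left so earlier offsets stay valid after each edit.
    for a in sorted(analyses, key=lambda a: a.start, reverse=True):
        if text[a.start:a.end] != a.original:
            raise ValueError(f"offsets do not match text for {a.original!r}")
        text = text[:a.start] + a.correction + text[a.end:]
    return text


essay = "She go to school in Monday."
analyses = [
    ErrorAnalysis(4, 6, "go", "goes", "verb form", Severity.MAJOR,
                  "A third-person singular subject needs the -s form.",
                  "Check that the verb agrees with the subject."),
    ErrorAnalysis(17, 19, "in", "on", "preposition misuse", Severity.MODERATE,
                  "Days of the week take the preposition 'on'.",
                  "Use 'on' before days and dates."),
]
corrected = apply_corrections(essay, analyses)
```

Keeping the correction, classification, severity, description, and suggestion together in one record means every piece of feedback shown to a student can be traced back to a specific span of the essay.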

We will evaluate the effectiveness of each stage of error analysis using different metrics. For Error Detection, we will measure Precision (the proportion of flagged errors that are genuine errors), Recall (the proportion of actual errors that are flagged), and the F1 Score (the harmonic mean of Precision and Recall). For Error Correction, we will assess Precision (the proportion of proposed corrections that are right), Recall (the proportion of actual errors for which a right correction is proposed), and the F0.5 Score (which places more weight on Precision than on Recall). For Error Classification, Accuracy will be used to measure the percentage of errors assigned the correct error type. Finally, for Error Description, we will evaluate Accuracy (the clarity and correctness of explanations), Relevance (how well explanations match the specific error), and Sufficiency (whether the explanation provides enough detail, including actionable suggestions and grammar rules for improvement).
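The detection and correction metrics above share the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 1 for F1 and beta = 0.5 for F0.5. The sketch below computes them over sets of edits. Representing an edit as a (start, end, correction) tuple and requiring an exact match against the reference are assumptions for illustration, not a description of our evaluation pipeline.

```python
def precision_recall_fbeta(predicted, gold, beta=1.0):
    """Compute Precision, Recall, and F-beta over two collections of edits.

    An edit counts as correct only if it matches a gold edit exactly
    (an assumption made for this sketch).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return precision, recall, 0.0
    f_beta = (1 + beta_sq) * precision * recall / denominator
    return precision, recall, f_beta


# Hypothetical edits: (start offset, end offset, proposed correction).
gold_edits = {(4, 6, "goes"), (17, 19, "on"), (30, 35, "their"), (40, 42, "an")}
system_edits = {(4, 6, "goes"), (17, 19, "on"), (50, 53, "the")}

p, r, f1 = precision_recall_fbeta(system_edits, gold_edits, beta=1.0)
_, _, f05 = precision_recall_fbeta(system_edits, gold_edits, beta=0.5)
```

In this made-up example the system proposes three edits, two of which are in the reference set of four, so Precision is 2/3 and Recall is 1/2. F0.5 is higher than F1 here because Precision exceeds Recall and F0.5 rewards Precision more.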

2b. Technical Implementation and Performance (for Streams 3&4 only) (Maximum 300 words)

N/A

3. Innovation and Creativity (Maximum 300 words)

Fine-grained error analysis provides students with more accurate, relevant, and comprehensive learning assistance. Traditional grammar check tools usually only offer simple correction suggestions, lacking in-depth error analysis. Our system, however, uses a fine-grained error classification method to not only precisely identify errors in students' essays but also provide targeted corrections and improvement suggestions. Each error is identified, categorized, graded, and explained.
The benefits of fine-grained error analysis are manifold. Firstly, by categorizing and grading each error, students can more clearly understand their weaknesses in language learning and focus their efforts on targeted practice. Secondly, the system not only provides correction suggestions but also offers detailed explanations of the errors' causes and relevant grammar points. This deep analysis helps students understand the nature of the mistakes, rather than just accepting the correction, thus enhancing their understanding of grammar and expression.

The error classification taxonomy used in our system has been validated and continually optimized by educational experts, with the following three main advantages:
1. Wide Coverage: Our taxonomy accurately covers nearly all errors without grouping them vaguely as "other errors." This detailed and comprehensive classification ensures that students receive thorough feedback and don’t miss any minor error types.
2. Non-overlapping: The different error types in the framework are conceptually and definitionally distinct. This clear hierarchical structure makes it easier for students to understand the root causes of each error.
3. Moderate Granularity: The framework maintains a balanced granularity in error classification, neither too detailed nor too generalized. Each error type represents a broad and typical category of mistakes, ensuring the classification is practical and accurate. For example, verb errors are further subdivided into verb tense errors and verb form errors to better reflect the student's specific issues.
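
The non-overlapping and moderate-granularity properties can be sketched as a small two-level mapping with a consistency check. The category names below are limited to the error types mentioned in this form and are illustrative only. The actual taxonomy is larger, and its structure and labels may differ.

```python
# Illustrative fragment of a two-level taxonomy (an assumption, not the full
# taxonomy used by the system).
TAXONOMY = {
    "verb": ["verb tense", "verb form"],
    "preposition": ["preposition misuse"],
    "spelling": ["spelling error"],
}


def build_parent_index(taxonomy):
    """Map each fine-grained error type to its single parent category.

    Raises ValueError if a type is listed under more than one category,
    which would violate the non-overlapping property.
    """
    parent_of = {}
    for category, subtypes in taxonomy.items():
        for subtype in subtypes:
            if subtype in parent_of:
                raise ValueError(
                    f"{subtype!r} appears under both "
                    f"{parent_of[subtype]!r} and {category!r}"
                )
            parent_of[subtype] = category
    return parent_of


parent_of = build_parent_index(TAXONOMY)
```

A check of this kind can be run whenever educators revise the taxonomy, so that every error type continues to belong to exactly one category.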

4. Scalability and Sustainability (Maximum 300 words)

To ensure that our solution can scale effectively and meet increasing user demand, we have deployed the system on a cloud-based infrastructure. This setup enables us to dynamically allocate resources based on real-time user traffic, ensuring our system can scale seamlessly as more users engage with the platform. Currently, the system handles around 9,000 essay submissions each week from students across various grade levels, and we anticipate this number will continue to grow. The cloud infrastructure allows for easy scaling of both computing power and storage, ensuring that the system performs efficiently even as demand rises.

Our solution is designed with a focus on environmental sustainability, long-term user engagement, and adaptability to changing user needs. To begin with, the system tailors its essay prompts to the individual grade level of each student, offering varying levels of difficulty based on their current skill set. This personalized approach not only makes learning more engaging but also helps maintain motivation by providing students with challenges that are appropriate for their level of proficiency. As students progress, the system adjusts the difficulty of the tasks to keep them challenged, ensuring that they stay engaged over time.

Additionally, we plan to collect high-quality data and leverage supervised fine-tuning (SFT) to continuously improve the accuracy and clarity of the system's error analysis. High-quality data is collected from our system deployed online and curated by experienced educators. Then, SFT helps refine the system's performance by training it on a labeled dataset, enabling it to identify mistakes more precisely and explain them more relevantly. This iterative process ensures that the feedback provided by the system is not only accurate but also easy for students to understand, enhancing the learning experience.

5. Social Impact and Responsibility (Maximum 300 words)

Our solution addresses key educational challenges by giving students timely, personalized feedback on their writing. Because this feedback is accessible and delivered in real time, the system helps bridge the language proficiency gap and supports students from diverse backgrounds in improving their English skills. This contributes to greater educational equity by ensuring that students from all socio-economic backgrounds have access to quality learning tools that can support their language development. Moreover, the platform supports personalized learning by recommending essay topics tailored to each student's grade level, fostering an environment where learners feel engaged and supported.

To measure the social impact of our system, we will monitor several key metrics. First, we will track the weekly volume of essays processed, currently averaging 9,000 submissions, to assess the scale and reach of our solution. Additionally, we will monitor student errors, focusing on specific areas such as grammar and vocabulary, to evaluate how well the system helps students improve their writing skills over time. We will also gather regular feedback from students and educators to measure user satisfaction and engagement with the system. This feedback will guide adjustments to the system, ensuring it remains responsive to the evolving needs of the community. By tracking these metrics, we can continuously refine the system to better serve diverse learners and align with broader social goals such as equity, accessibility, and inclusion, ensuring that the platform has a lasting and positive impact on students' educational outcomes.

Do you have additional materials to upload?
No
PIC
Personal Information Collection Statement (PICS):
1. The personal data collected in this form will be used for activity-organizing, record keeping and reporting only. The collected personal data will be purged within 6 years after the event.
2. Please note that it is obligatory to provide the personal data required.
3. Your personal data collected will be kept by the LTTC and will not be transferred to outside parties.
4. You have the right to request access to and correction of information held by us about you. If you wish to access or correct your personal data, please contact our staff at lttc@eduhk.hk.
5. The University’s Privacy Policy Statement can be accessed at https://www.eduhk.hk/en/privacy-policy.
Agreement
  • I have read and agree to the competition rules and privacy policy.