feat: add string arrays to letter-sound events by jo-elimu · Pull Request #2313 · elimu-ai/webapp

jo-elimu · 2025-07-28T11:32:17Z

Issue Number

Resolves Add string arrays to LetterSoundLearningEvent #2297

Purpose

Technical Details

Testing Instructions

Screenshots

Format Checks

Note

Files in PRs are automatically checked for format violations with mvn spotless:check.

If this PR contains files with format violations, run mvn spotless:apply to fix them.

#2297 https://github.com/elimu-ai/model/releases/tag/model-2.0.118

Fall-back to empty list instead of null #2297

coderabbitai · 2025-07-28T11:35:07Z

Walkthrough

The changes update the handling of letterSoundLetters and letterSoundSounds fields from String to List<String> in the analytics event entities and CSV extraction logic. The CSV extraction helper methods are refactored to support version-based parsing, and CSV export now includes these fields. Related tests are updated and expanded for new versions.

Changes

Cohort / File(s)	Change Summary
Dependency Version Bump `pom-dependency-tree.txt`	Updated `ai.elimu:webapp` and `ai.elimu:model` dependency versions.
Entity Field Type Change `src/main/java/ai/elimu/entity/analytics/LetterSoundAssessmentEvent.java`	Changed `letterSoundLetters` and `letterSoundSounds` from `String` to `List<String>` and added JPA converter annotations.
CSV Extraction Refactor `src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java`	Refactored extraction methods for letter sound, number, and word events; added version-based parsing for new list fields; fixed bugs in field assignment; removed outdated methods; improved consistency.
CSV Export Update `src/main/java/ai/elimu/web/analytics/students/LetterSoundLearningEventsCsvExportController.java`	Uncommented and enabled export of `letterSoundLetters` and `letterSoundSounds` fields in CSV output.
Test Enhancements `src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java`	Added tests for new CSV format/version; improved assertions for letter sound fields; cleaned up imports and comments.

Sequence Diagram(s)

sequenceDiagram
    participant CSVFile
    participant CsvAnalyticsExtractionHelper
    participant LetterSoundAssessmentEvent

    CSVFile->>CsvAnalyticsExtractionHelper: extractLetterSoundLearningEvents(csvFile)
    CsvAnalyticsExtractionHelper->>CsvAnalyticsExtractionHelper: Parse CSV rows
    alt versionCode >= 4001000
        CsvAnalyticsExtractionHelper->>LetterSoundAssessmentEvent: Set letterSoundLetters (List<String>)
        CsvAnalyticsExtractionHelper->>LetterSoundAssessmentEvent: Set letterSoundSounds (List<String>)
    else
        CsvAnalyticsExtractionHelper->>LetterSoundAssessmentEvent: Set letterSoundLetters (empty list)
        CsvAnalyticsExtractionHelper->>LetterSoundAssessmentEvent: Set letterSoundSounds (empty list)
    end
    CsvAnalyticsExtractionHelper-->>CSVFile: Return list of LetterSoundAssessmentEvents

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Assessment against linked issues

Objective	Addressed	Explanation
Add `letterSoundLetters` and `letterSoundSounds` to ORM entity (#2297)	✅
Set `letterSoundLetters` and `letterSoundSounds` during data batch processing (#2297)	✅
Export `letterSoundLetters` and `letterSoundSounds` to CSV (#2297)	✅

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3448e8 and e315b5b.

⛔ Files ignored due to path filters (2)

pom.xml is excluded by !**/*.xml
src/test/resources/ai/elimu/util/csv/5b7c682a12ecbe2e_4001002_letter-sound-learning-events_2025-07-28.csv is excluded by !**/*.csv, !**/*.csv

📒 Files selected for processing (5)

pom-dependency-tree.txt (1 hunks)
src/main/java/ai/elimu/entity/analytics/LetterSoundAssessmentEvent.java (2 hunks)
src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (8 hunks)
src/main/java/ai/elimu/web/analytics/students/LetterSoundLearningEventsCsvExportController.java (2 hunks)
src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java (2 hunks)

🧰 Additional context used

🧠 Learnings (6)

📓 Common learnings

Learnt from: venkatesh2k3
PR: elimu-ai/webapp#1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the `LetterSoundContributionEventDao.java` file, the method signatures should use `LetterSoundContributionEvent` instead of `LetterSoundCorrespondenceContributionEvent` to maintain consistency with the renaming of entities.

Learnt from: jo-elimu
PR: elimu-ai/webapp#2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.

Learnt from: jo-elimu
PR: elimu-ai/webapp#2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).

pom-dependency-tree.txt (2)

Learnt from: jo-elimu
PR: elimu-ai/webapp#0
File: :0-0
Timestamp: 2025-05-03T08:01:30.217Z
Learning: The elimu-ai/webapp repository has a dependency on ai.elimu:model (version 2.0.97) that provides shared model classes and enums. The Language enum should be imported from ai.elimu.model.v2.enums.Language instead of creating local duplicates.

Learnt from: jo-elimu
PR: #2312
File: src/main/java/ai/elimu/entity/analytics/LetterSoundLearningEvent.java:19-31
Timestamp: 2025-07-27T03:40:02.040Z
Learning: In the elimu.ai webapp project, the database schema file at src/main/resources/META-INF/jpa-schema-export.sql is auto-generated from JPA entity annotations, so manual updates to this file are not needed when adding new entity fields.

src/main/java/ai/elimu/web/analytics/students/LetterSoundLearningEventsCsvExportController.java (4)

Learnt from: venkatesh2k3
PR: #1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the LetterSoundContributionEventDao.java file, the method signatures should use LetterSoundContributionEvent instead of LetterSoundCorrespondenceContributionEvent to maintain consistency with the renaming of entities.

Learnt from: jo-elimu
PR: #2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.

Learnt from: jo-elimu
PR: #2312
File: src/main/java/ai/elimu/entity/analytics/LetterSoundLearningEvent.java:19-31
Timestamp: 2025-07-27T03:40:02.040Z
Learning: In the elimu.ai webapp project, the database schema file at src/main/resources/META-INF/jpa-schema-export.sql is auto-generated from JPA entity annotations, so manual updates to this file are not needed when adding new entity fields.

Learnt from: jo-elimu
PR: #2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).

src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java (3)

Learnt from: jo-elimu
PR: #2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.

Learnt from: jo-elimu
PR: #2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).

Learnt from: venkatesh2k3
PR: #1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the LetterSoundContributionEventDao.java file, the method signatures should use LetterSoundContributionEvent instead of LetterSoundCorrespondenceContributionEvent to maintain consistency with the renaming of entities.

src/main/java/ai/elimu/entity/analytics/LetterSoundAssessmentEvent.java (2)

Learnt from: venkatesh2k3
PR: #1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the LetterSoundContributionEventDao.java file, the method signatures should use LetterSoundContributionEvent instead of LetterSoundCorrespondenceContributionEvent to maintain consistency with the renaming of entities.

Learnt from: jo-elimu
PR: #2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.

src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (3)

Learnt from: jo-elimu
PR: #2237
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:287-288
Timestamp: 2025-06-09T06:52:22.910Z
Learning: VideoLearningEvents in CsvAnalyticsExtractionHelper never used the "time" column name and always use "timestamp", unlike other event types (LetterSoundLearningEvent, WordAssessmentEvent, WordLearningEvent, StoryBookLearningEvent) which need conditional logic to handle both "time" (for versions < 3.4.0) and "timestamp" (for versions >= 3.4.0) column names. This is because VideoLearningEvents were introduced after the schema change.

Learnt from: venkatesh2k3
PR: #1682
File: src/main/java/ai/elimu/dao/LetterSoundContributionEventDao.java:13-13
Timestamp: 2024-07-06T17:37:44.413Z
Learning: In the LetterSoundContributionEventDao.java file, the method signatures should use LetterSoundContributionEvent instead of LetterSoundCorrespondenceContributionEvent to maintain consistency with the renaming of entities.

Learnt from: jo-elimu
PR: #2269
File: src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java:80-88
Timestamp: 2025-06-20T05:28:00.962Z
Learning: In CsvAnalyticsExtractionHelper.java, the CSV columns for research_experiment and experiment_group contain integer values (0 or 1) that need to be mapped to enum ordinals by adding 1 to the CSV value (0 maps to ordinal 1, 1 maps to ordinal 2).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: test_rest
GitHub Check: test_rest
GitHub Check: test_rest
GitHub Check: test_rest
GitHub Check: build (ubuntu-latest, 17)
GitHub Check: build (macos-latest, 21)
GitHub Check: build (ubuntu-latest, 21)
GitHub Check: build (windows-latest, 17)
GitHub Check: build (macos-latest, 17)
GitHub Check: build (windows-latest, 21)
GitHub Check: test_rest
GitHub Check: test_rest

🔇 Additional comments (13)

src/main/java/ai/elimu/entity/analytics/LetterSoundAssessmentEvent.java (2)

3-6: LGTM: Proper imports added for new functionality.

The imports for List, StringListConverter, and Convert annotation are correctly added to support the new list-based fields.

22-31: LGTM: Fields correctly updated to support string arrays.

The conversion from String to List<String> with @Convert(converter = StringListConverter.class) is the proper approach for storing string arrays in the database while maintaining JPA compatibility. The @NotNull annotations ensure data integrity.

pom-dependency-tree.txt (1)

1-2: LGTM: Standard version increments reflecting the new features.

The version updates from 2.6.108-SNAPSHOT to 2.6.109-SNAPSHOT and model dependency from 2.0.117 to 2.0.118 are appropriate incremental changes that align with the string array additions to letter-sound events.

src/main/java/ai/elimu/web/analytics/students/LetterSoundLearningEventsCsvExportController.java (2)

57-58: LGTM: CSV headers correctly updated to include new list fields.

The uncommented headers for letter_sound_letters and letter_sound_sounds properly enable export of the new string array fields.

72-73: LGTM: CSV data export correctly includes new list fields.

The uncommented calls to event.getLetterSoundLetters() and event.getLetterSoundSounds() properly export the string arrays. Since these fields are now List<String> with @NotNull annotations, they will be safely converted to their string representation in the CSV output.

src/test/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelperTest.java (3)

9-9: LGTM: Arrays import added for list assertions.

The Arrays import is correctly added to support the list equality assertions in the updated and new test methods.

93-95: LGTM: Existing test updated to verify empty lists for older versions.

The assertions correctly verify that for older app versions (< 4001000), the letterSoundLetters and letterSoundSounds fields are empty lists, aligning with the version-based parsing logic.

98-120: LGTM: Comprehensive test coverage for new version with list parsing.

The new test method properly validates:

Correct parsing of CSV from version 4001002

Proper extraction of 22 events

Accurate list parsing: ["น"] for letters and ["n"] for sounds

All other fields including research experiment, experiment group, and letter sound ID

src/main/java/ai/elimu/util/csv/CsvAnalyticsExtractionHelper.java (5)

104-121: LGTM: Version-based list parsing correctly implemented.

The conditional logic properly handles the transition from string to list format:

For versionCode >= 4001000: Parses bracketed strings like "[ห, ล]" into lists

For older versions: Sets empty lists

String parsing logic correctly removes brackets and splits on ", "

136-227: LGTM: LetterSoundAssessmentEvent extraction properly implemented.

The method correctly:

Extracts timestamp without the substring operation (appropriate for assessment events)

Includes version-based research experiment and experiment group parsing

Implements the same list parsing logic as learning events

Sets all required assessment-specific fields like mastery score and time spent

230-316: LGTM: WordLearningEvent extraction improved with proper version handling.

The method correctly implements:

Version-based timestamp column name handling (time vs timestamp)

Conditional learning event type parsing for older versions

Proper research experiment and experiment group parsing

All required word-specific fields

406-489: LGTM: NumberLearningEvent extraction enhanced with additional fields.

The method properly handles:

Timestamp parsing without version-based column name logic (correct for number events)

Additional data field parsing

Optional number symbol and number ID fields with null checks

Research experiment and experiment group parsing

491-563: LGTM: NumberAssessmentEvent extraction method properly implemented.

The new method correctly:

Handles timestamp parsing consistently with other assessment events

Includes all assessment-specific fields (mastery score, time spent)

Properly parses research experiment and experiment group without version checks

Handles optional number ID field with null check

Comments out number symbol field appropriately

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch 2297-model-2.0.118

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai generate unit tests to generate unit tests for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

codecov · 2025-07-28T11:35:09Z

Codecov Report

❌ Patch coverage is 61.26761% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.12%. Comparing base (2a72571) to head (e315b5b).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
...i/elimu/util/csv/CsvAnalyticsExtractionHelper.java	62.14%	48 Missing and 5 partials ⚠️
.../LetterSoundLearningEventsCsvExportController.java	0.00%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2313      +/-   ##
============================================
+ Coverage     16.99%   17.12%   +0.13%     
- Complexity      460      462       +2     
============================================
  Files           261      261              
  Lines          7719     7741      +22     
  Branches        895      897       +2     
============================================
+ Hits           1312     1326      +14     
- Misses         6325     6333       +8     
  Partials         82       82

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jo-elimu added 3 commits July 28, 2025 18:04

refactor: use string list in letter-sound assessment events

710441f

#2297 https://github.com/elimu-ai/model/releases/tag/model-2.0.118

refactor: re-order method names by category

93ae6f8

fix: 'must not be null'

e315b5b

Fall-back to empty list instead of null #2297

jo-elimu self-assigned this Jul 28, 2025

jo-elimu requested a review from a team as a code owner July 28, 2025 11:32

jo-elimu requested review from nya-elimu, tomaszsmy and vrudas and removed request for a team July 28, 2025 11:32

coderabbitai bot approved these changes Jul 28, 2025

View reviewed changes

jo-elimu merged commit cf1ad0c into main Jul 28, 2025
20 checks passed

jo-elimu deleted the 2297-model-2.0.118 branch July 28, 2025 11:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add string arrays to letter-sound events#2313

feat: add string arrays to letter-sound events#2313
jo-elimu merged 3 commits intomainfrom
2297-model-2.0.118

jo-elimu commented Jul 28, 2025

Uh oh!

coderabbitai bot commented Jul 28, 2025

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

codecov bot commented Jul 28, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jo-elimu commented Jul 28, 2025

Issue Number

Purpose

Technical Details

Testing Instructions

Screenshots

Format Checks

Uh oh!

coderabbitai bot commented Jul 28, 2025

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Assessment against linked issues

Assessment against linked issues: Out-of-scope changes

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

codecov bot commented Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Jul 28, 2025 •

edited

Loading