CSJ v2.3 Phase 5 Checkpoint
Updated: 2026-04-13
Completed in this phase
- Added fixture-style pytest coverage under:
  /root/.hermes/workspace/csj/tests/test_change_detection.py
  /root/.hermes/workspace/csj/tests/test_lifecycle_and_assets.py
  /root/.hermes/workspace/csj/tests/test_versioning_behavior.py
Coverage now includes
- Change detection / normalization
  - whitespace-only text changes do not count as meaningful
  - salary changes are treated as critical meaningful changes
  - CSJ URL query noise is ignored during diffing
  - list-of-dict asset arrays normalize stably for diffing
- Lifecycle / asset helpers
  - direct URL verification returns withdrawn_confirmed on 404
  - direct URL verification returns missing_unconfirmed when the reference is still present on the page
  - asset extraction captures supporting links, attachments, and embeds
  - asset fields participate in meaningful diff detection
- Versioning behavior
  - first save creates current file + initial history + first_seen event
  - unchanged refresh creates no extra history version and emits refreshed event
  - meaningful field change creates a new history version and a field_changed event
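
As a rough illustration of two change-detection behaviours in the coverage list above (ignoring URL query noise, and normalizing list-of-dict asset arrays stably), the Python sketch below shows one way such helpers could work. The names strip_query_noise, normalize_assets, and NOISE_PARAMS are hypothetical, and so is the choice of which query parameters count as noise. The actual CSJ helpers may differ.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters treated as noise.
# The real list used by CSJ is not stated in this checkpoint.
NOISE_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def strip_query_noise(url):
    """Return the URL without noise parameters and without a fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NOISE_PARAMS
    )
    # Sorting the remaining parameters makes their order irrelevant for diffing.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def normalize_assets(assets):
    """Return an order-independent representation of a list of asset dicts."""
    # Serializing with sorted keys removes key-order differences;
    # sorting the serialized items removes list-order differences.
    return sorted(json.dumps(asset, sort_keys=True) for asset in assets)
```

With helpers like these, two snapshots that differ only in tracking parameters or in the ordering of asset entries compare as equal, so they would not be reported as a meaningful change.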
Issue found and fixed during test run
- Initial test run exposed that text normalization still treated simple newline/spacing changes as meaningful.
- Updated normalize_text_for_diff() to collapse all whitespace (\s+) to a single space for diffing.
- Re-ran tests successfully.
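
A minimal sketch of the whitespace collapse described above. Only the function name normalize_text_for_diff() and the \s+ pattern come from this checkpoint. The None handling and the final strip() are assumptions, and the real function may do more.

```python
import re

# Matches any run of whitespace, including newlines and tabs.
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_text_for_diff(text):
    """Collapse every whitespace run to a single space before comparing."""
    if text is None:  # assumption: missing text is treated as empty
        return ""
    # strip() is an assumption so that leading/trailing whitespace is ignored too.
    return _WHITESPACE_RUN.sub(" ", text).strip()
```

Under this sketch, a description that only gains or loses line breaks normalizes to the same string as before, so the diff would treat it as unchanged.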
Validation result
Command run:
python3 -m pytest /root/.hermes/workspace/csj/tests/test_change_detection.py /root/.hermes/workspace/csj/tests/test_lifecycle_and_assets.py /root/.hermes/workspace/csj/tests/test_versioning_behavior.py -q
Result:
Current status after Phase 5
Done:
- historical versioning foundations
- event logging foundations
- lifecycle classification foundations
- supporting asset extraction foundations
- fixture-based regression coverage for core helper behavior
Still worth refining later
- more realistic HTML fixtures saved under a dedicated fixtures directory
- tests for full lifecycle-pass orchestration rather than helper-level coverage only
- more opinionated filtering of generic outbound campaign/support links
- explicit tests for reopened / withdrawn_confirmed in multi-run state progression