Building The Longhand Archive

How a solo archival system gets built in public

CSJ v2.3 Phase 5 Checkpoint

Updated: 2026-04-13

Completed in this phase

Coverage now includes

  1. Change detection / normalization
    • whitespace-only text changes do not count as meaningful
    • salary changes are treated as critical meaningful changes
    • CSJ URL query noise is ignored during diffing
    • list-of-dict asset arrays normalize stably for diffing
  2. Lifecycle / asset helpers
    • direct URL verification returns withdrawn_confirmed on 404
    • direct URL verification returns missing_unconfirmed when the reference is still present on the page
    • asset extraction captures supporting links, attachments, and embeds
    • asset fields participate in meaningful diff detection
  3. Versioning behavior
    • first save creates current file + initial history + first_seen event
    • unchanged refresh creates no extra history version and emits refreshed event
    • meaningful field change creates a new history version and a field_changed event
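
The normalization and versioning rules above can be sketched in a few lines. This is a minimal illustration and not the actual CSJ code: the field names (`salary`, `url`, `assets`), the helpers `normalize_value` and `diff_records`, and the in-memory `VersionStore` are all hypothetical stand-ins for whatever the real modules use. It also assumes one `field_changed` event per changed field, which the checkpoint does not specify.

```python
import json
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical field names; the real CSJ schema is not shown in this checkpoint.
CRITICAL_FIELDS = {"salary"}
URL_FIELDS = {"url"}


def normalize_value(field, value):
    """Return a canonical form of a field value for comparison."""
    if isinstance(value, str):
        if field in URL_FIELDS:
            # Drop query string and fragment so URL query noise is ignored.
            parts = urlsplit(value.strip())
            return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        # Collapse whitespace so spacing-only edits compare equal.
        return re.sub(r"\s+", " ", value).strip()
    if isinstance(value, dict):
        return {k: normalize_value(k, v) for k, v in value.items()}
    if isinstance(value, list):
        # Sort by a stable JSON key so list-of-dict arrays ignore ordering.
        items = [normalize_value(field, v) for v in value]
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    return value


def diff_records(old, new):
    """Return {field: severity} for fields whose normalized values differ."""
    changes = {}
    for field in sorted(set(old) | set(new)):
        if normalize_value(field, old.get(field)) != normalize_value(field, new.get(field)):
            changes[field] = "critical" if field in CRITICAL_FIELDS else "normal"
    return changes


class VersionStore:
    """In-memory stand-in for the current file, history versions, and events."""

    def __init__(self):
        self.current = None
        self.history = []
        self.events = []

    def save(self, record):
        if self.current is None:
            # First save: current record, initial history entry, first_seen event.
            self.current = record
            self.history.append(record)
            self.events.append(("first_seen", None))
            return
        changes = diff_records(self.current, record)
        if not changes:
            # Unchanged refresh: no new history version, only a refreshed event.
            self.events.append(("refreshed", None))
            return
        # Meaningful change: new history version plus field_changed events
        # (one per changed field is an assumption of this sketch).
        self.current = record
        self.history.append(record)
        self.events.extend(("field_changed", field) for field in changes)


if __name__ == "__main__":
    base = {
        "title": "Clerk  II",
        "salary": "$50,000",
        "url": "https://example.org/job/1?utm_source=a",
        "assets": [{"kind": "attachment", "url": "https://example.org/a.pdf"},
                   {"kind": "link", "url": "https://example.org/b"}],
    }
    noisy = dict(base, title="Clerk II ", url="https://example.org/job/1?utm_source=b",
                 assets=list(reversed(base["assets"])))
    raised = dict(noisy, salary="$55,000")

    store = VersionStore()
    for record in (base, noisy, raised):
        store.save(record)
    print(store.events)
    print(len(store.history))
```

In the usage example, the second record differs only in whitespace, query string, and asset order, so it is expected to register as a refresh; only the salary change should produce a new history version.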

Issue found and fixed during test run

Validation result

Command run:

  python3 -m pytest \
    /root/.hermes/workspace/csj/tests/test_change_detection.py \
    /root/.hermes/workspace/csj/tests/test_lifecycle_and_assets.py \
    /root/.hermes/workspace/csj/tests/test_versioning_behavior.py -q

Result:

Current status after Phase 5

Done:

Still worth refining later