Phase 2.4 integration smoke — the last safety net before flipping the public Ensemble grid live. If any test in this suite fails after a merge to
main, do not dashboard-fire the affected project until the regression is investigated.
What it covers
tests/integration/ drives a real bin/control-panel-server.py
subprocess and exercises the magic-moment flow end-to-end:
| File | Scope | Slot directive |
|---|---|---|
| test_e2e_signup_to_publish.py | signup → procedural fill → edit → publish → unpublish → re-publish | Test 1 |
| test_e2e_visitor_flow.py | unauth visitor reads + private-canvas blocking + slot-limit 402 + payload-too-large 413 + XSS escape + cookie banner + ToS gate | Tests 2-5 |
| test_e2e_browser.py | optional Playwright DOM smoke (skipped when Playwright is not installed) | bonus |
The session fixture in conftest.py boots the server against a tempdir
for $CLAUDE_PLUGIN_DATA + $ORCH_DATA and wipes
/tmp/ensemble-dashboard/canvases + /tmp/ensemble-dashboard/subscriptions
before the run so each invocation starts clean. The dashboard output dir
under the same prefix is left alone so a developer running the real
control panel locally does not lose their generated HTML.
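
The isolation step described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of conftest.py: the helper names `wipe_state_dirs` and `build_server_env` are hypothetical, a single tempdir is assumed to back both environment variables, the wiped directories are assumed to be recreated empty, and the `output` directory name in the demo is a stand-in for the dashboard output dir. The real fixture would also start and stop the server subprocess.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Subdirectories cleared before a run; anything else under the root is kept.
WIPED_SUBDIRS = ("canvases", "subscriptions")


def wipe_state_dirs(root: Path) -> None:
    """Remove and recreate only the canvas and subscription stores (hypothetical helper)."""
    for name in WIPED_SUBDIRS:
        target = root / name
        shutil.rmtree(target, ignore_errors=True)
        target.mkdir(parents=True, exist_ok=True)


def build_server_env(data_dir: Path) -> dict:
    """Copy the current environment and point both data variables at data_dir."""
    env = dict(os.environ)
    env["CLAUDE_PLUGIN_DATA"] = str(data_dir)
    env["ORCH_DATA"] = str(data_dir)
    return env


if __name__ == "__main__":
    # Demonstrate against a throwaway root rather than /tmp/ensemble-dashboard.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "ensemble-dashboard"
        (root / "canvases").mkdir(parents=True)
        (root / "canvases" / "stale.json").write_text("{}")
        (root / "output").mkdir()  # assumed name for the dashboard output dir
        (root / "output" / "index.html").write_text("<html></html>")

        wipe_state_dirs(root)
        env = build_server_env(Path(tmp) / "data")

        print(sorted(p.name for p in root.iterdir()))
        print((root / "output" / "index.html").exists())
```

Clearing by explicit subdirectory name, rather than removing the whole prefix, is what keeps a developer's generated HTML intact.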
When to run
- Before flipping the Ensemble grid public. Required.
- Before merging any PR that touches bin/control-panel-server.py, bin/dashgen/canvas_*.py, bin/dashgen/pairing.py, bin/dashgen/route_table.py, or any of the legal / cookie / first-run templates.
- Weekly as a scheduled run while the marketing page is live, to catch drift caused by silent dependency upgrades.
How to run
One-shot
bash bin/run_e2e_smoke.sh
The wrapper resolves a Python interpreter that has pytest available.
On a fresh Mac, it creates a venv at /tmp/ensemble-e2e-venv and
installs pytest + markdown into it. Override with:
E2E_PYTHON=/path/to/python bash bin/run_e2e_smoke.sh
Filter / verbose
run_e2e_smoke.sh forwards any extra args to pytest:
bash bin/run_e2e_smoke.sh -v # verbose
bash bin/run_e2e_smoke.sh -k visitor # only visitor tests
bash bin/run_e2e_smoke.sh tests/integration/test_e2e_signup_to_publish.py
With Playwright
/tmp/ensemble-e2e-venv/bin/python -m pip install playwright
/tmp/ensemble-e2e-venv/bin/python -m playwright install chromium
bash bin/run_e2e_smoke.sh
The 3 browser tests in test_e2e_browser.py will pick up the install
automatically; without it they emit a UserWarning and skip.
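
One way the warn-and-skip behaviour could be implemented is sketched below. The function names are hypothetical and the actual logic in test_e2e_browser.py may differ; the sketch only shows how an optional dependency can be detected without importing it.

```python
import importlib.util
import warnings


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def playwright_skip_reason(module_name: str = "playwright"):
    """Return None when the browser tests can run; otherwise warn and return a reason."""
    if module_available(module_name):
        return None
    reason = f"{module_name} is not installed; skipping browser smoke tests"
    warnings.warn(reason, UserWarning)
    return reason
```

In a pytest module, the returned reason could be passed to `pytest.skip(reason, allow_module_level=True)` at import time so that every test in the file is skipped together.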
Interpreting failures
| Failing test | Likely root cause |
|---|---|
| test_signup_seeds_procedural_canvas | Procedural-fill fallback templates broken or signup_context routing changed in canvas_handlers.handle_create_canvas. |
| test_publish_to_slot_zero_appears_in_grid | Canvas store _index.json missing the published list, OR /ensemble/canvases?filter=featured query rewriting regressed. |
| test_unpublish_removes_canvas_from_grid | _remove_from_index not clearing published. |
| test_canvas_detail_endpoint_serves_public_canvas | _dispatch_canvas_get route regex no longer matches /ensemble/canvas/<id>. |
| test_visitor_cannot_publish / test_visitor_cannot_modify | Pairing-token middleware (_resolve_paired_user) no longer rejects empty headers. |
| test_visitor_cannot_see_private_canvas | handle_get_canvas private-visibility check removed or inverted. |
| test_free_tier_fourth_slot_returns_402_with_contract_shape | Slot-limit enforcement broken or 402 body shape drifted from docs/architecture/api-contracts/CHUNK_2_API_CONTRACT.md §2. |
| test_swap_slot_zero_demotes_old_canvas_to_draft | Atomic swap in handle_publish_canvas regressed: the old occupant should land in draft state, not be deleted. |
| test_text_component_with_script_tag_is_escaped_in_static_render | bin/dashgen/pages/canvas_view.py no longer applies replace("</", "<\\/") on the inline boot JSON. High-severity XSS regression. |
| test_javascript_url_in_click_to_link_is_filtered_in_static_render | Static page is leaking javascript: URL into an href attribute. High-severity XSS regression. |
| test_payload_too_large_returns_413 | Body-size cap in do_POST removed or raised. |
| test_state_endpoint_without_auth_returns_visitor_safe | Visitor-safe /state shape now leaks privileged keys. Production-state-leak regression. |
| test_index_html_contains_cookie_banner | Cookie banner template removed from bin/dashgen/__main__.py or pages/cookie_banner.py. |
| test_index_html_contains_first_run_tos_checkbox | First-run modal lost the ToS checkbox or /legal/terms.html link. |
| test_legal_terms_page_renders | legal_pages.render_legal_pages skipped or markdown library missing. The test skips (rather than fails) if the page is absent; in that case, investigate the dashboard regen log. |
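
To make the two High-severity XSS rows concrete, here is a minimal sketch of the kind of escaping they guard. Only the `replace("</", "<\\/")` call comes from the table above; the function names, the `SAFE_SCHEMES` allowlist, and the `#` fallback are assumptions made for illustration and may not match what canvas_view.py does.

```python
import json
from urllib.parse import urlsplit

# Assumed allowlist: schemes permitted in an href. "" covers relative URLs.
SAFE_SCHEMES = {"http", "https", "mailto", ""}


def inline_boot_json(payload: dict) -> str:
    """Serialize payload for embedding inside a <script> element.

    Rewriting "</" as "<\\/" stops a string such as "</script>" from closing
    the surrounding tag; "\\/" is a valid JSON escape, so the data round-trips.
    """
    return json.dumps(payload).replace("</", "<\\/")


def safe_href(url: str) -> str:
    """Return url if its scheme is allowlisted, else a harmless placeholder."""
    # Browsers strip tabs and newlines from URLs before parsing, so drop
    # whitespace and control characters before inspecting the scheme.
    cleaned = "".join(ch for ch in url if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return "#"
    return cleaned if scheme in SAFE_SCHEMES else "#"


if __name__ == "__main__":
    print(inline_boot_json({"text": "</script><script>alert(1)</script>"}))
    print(safe_href("javascript:alert(1)"))
    print(safe_href("/legal/terms.html"))
```

An allowlist is used here rather than a blocklist so that unfamiliar schemes are rejected by default.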
Manual rollback
If the smoke catches a regression on main:
- Identify the offending commit:
    git log --oneline -- bin/dashgen/canvas_handlers.py bin/dashgen/pages/canvas_view.py bin/control-panel-server.py
- Revert (do not amend; do not force-push):
    git revert <sha>
- Re-run the smoke before merging the revert:
    bash bin/run_e2e_smoke.sh
- If the failing test was an XSS or state-leak (the High-severity rows above), keep the public grid behind the feature gate until the revert lands.
CI integration sketch
A skeleton GitHub Actions workflow lives at
.github/workflows/integration-test.yml (slot S1A enables the
auto-trigger after the Sprint 5 ramp-up; until then it is dispatched
manually). Required runner permissions: nothing beyond python3 on
PATH. The job sets up its own venv:
- name: E2E smoke
  run: bash bin/run_e2e_smoke.sh -v
The smoke completes in under 5 seconds without Playwright; with the 3 Playwright tests it is closer to 25-30 seconds (one Chromium boot per test).
Implementation notes
- The canvas + subscription stores hard-code /tmp/ensemble-dashboard/ as their root directory. The fixture wipes the canvases/ and subscriptions/ subdirs before each session to isolate runs. Tests that need to seed canvas data use the in-process JsonFileCanvasStore (which also points at the shared /tmp path).
- The pairing store does honour $ORCH_DATA, so per-test pairing tokens are written into the test’s tempdir. The subprocess server reads them off disk (no IPC needed).
- HTTP requests use stdlib urllib.request; there is no requests dependency.
- The fixture writes a DASHBOARD_TOKEN=e2e-test-dashboard-token into $ORCH_DATA/.env.local so the server’s check_auth() is exercised in production-realistic mode (otherwise it would return True for every call and the visitor-safe /state branch would never fire).
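
Because urllib.request raises on 4xx and 5xx responses, tests that assert on statuses such as 402 or 413 need to unwrap the error. A minimal sketch of a stdlib-only helper follows; the name `http_json` and its signature are hypothetical and not taken from the suite, and bypassing system proxies is an assumption that suits a server running on localhost.

```python
import json
import urllib.error
import urllib.request

# An empty ProxyHandler ignores proxy environment variables, which is
# appropriate when the target is a local subprocess.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_json(method, url, body=None, headers=None, timeout=5.0):
    """Send a request and return (status, parsed JSON or raw text).

    urllib raises HTTPError for 4xx/5xx responses; the exception is still a
    readable response object, so it is unwrapped here instead of propagating.
    """
    data = None
    req_headers = dict(headers or {})
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        req_headers.setdefault("Content-Type", "application/json")
    req = urllib.request.Request(url, data=data, headers=req_headers, method=method)
    try:
        with _OPENER.open(req, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        status, raw = err.code, err.read()
    text = raw.decode("utf-8", errors="replace")
    try:
        return status, json.loads(text)
    except ValueError:
        # Not JSON (for example an HTML page); hand back the text unchanged.
        return status, text
```

A test could then call, for example, `status, payload = http_json("GET", base_url + "/state")` and assert on both values, where `base_url` is whatever address the fixture exposes.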