---
vault_clearance: EUCLID
halo:
  classification: INTERNAL
  confidence: HIGH
  front: "10_Project_DiscordIntoSymphony"
  custodian: "The Architect"
  created: 2025-12-21
  updated: 2026-03-30
  wing: CONDITIONAL
  containment: "Unpublished biological research — wet lab data and analysis"
---
# DiscordIntoSymphony

A zero-parameter GEM-level framework for scRNA-seq. Treats 10x Chromium data as what it actually measures: transcript co-occurrence within physical droplets, not "single cells." Applied to endothelial senescence + immune coculture. Validated across 15+ datasets, 4.8M cells, 5 species.

**Start here:** [QUICK_START.md](QUICK_START.md) | **Combined Codex** (datasets + APIs + tools + refs): [`BOOK.md` §5](BOOK.md#5-combined-codex-registry-fused--former-codexmd) | **Findings log:** [WORLDLINE.md](WORLDLINE.md) | **Open tasks:** [BOUNTY_BOARD.md](BOUNTY_BOARD.md) | **Senescence field gaps (lit):** [SENESCENCE_LITERATURE_GAPS.md](SENESCENCE_LITERATURE_GAPS.md) | **FORM:** [FORM.md](FORM.md)

**Vault spine:** [Lab protocol](../README.md#lab-protocol) | [Vault board](../BOUNTY_BOARD.md) | [Vault worldline](../WORLDLINE.md)

**Engines:** Primary compute maps to [EYE_PROTOCOL.md](../EYE_PROTOCOL.md) — typically **EYE-01** (local), **EYE-02** / **EYE-12** (GCP), **EYE-04** (Docker cultivator), **EYE-11** (Orthodox Monarch). Tools index: [STAFF_catalogue.json](../STAFF_catalogue.json).

## Project Structure (CNS-ready)

| Directory | What | Parameters | Tier |
|-----------|------|-----------|------|
| **[genesis/](genesis/README.md)** | Zero-parameter GEM pipeline (novel claim) | **0** | **Nature** |
| **[orthodox/](orthodox/README.md)** | Orthodox Monarch (250-param control) | ~250 | Sci. Advances |
| **[comparison/](comparison/README.md)** | Head-to-head proof (Genesis wins) | — | — |
| **[data/](data/)** | Shared h5ad / 10x (registry in [BOOK.md §5](BOOK.md#5-combined-codex-registry-fused--former-codexmd)) | — | — |
| **supplementary/** | Tables + figures for paper submission | — | — |

---

## The Method (Zero Analytical Parameters)

Standard scRNA-seq pipelines involve ~33 tunable parameter decisions (SoupX, doublet removal, QC thresholds, normalization, HVG, PCA, clustering, annotation). Each one can distort or discard signal, so this pipeline skips all of them.

A GEM is a physical droplet. We compute deterministic set-theoretic measures on binary detection patterns across all GEMs. Six analytical layers, zero free parameters:

| Layer | What It Computes | Script | Time |
|-------|-----------------|--------|------|
| 1. GEM Programs | Binary pathway activation (all genes detected = ON) | `gem_analysis.py` | ~30s |
| 2. Dimensions | 10 biology-defined gradient axes (raw UMI sums) | `dimensional_analysis.py` | ~2s |
| 3. Co-occurrence | 21k x 21k Jaccard matrix from binary gene detection | `thread_graph.py` | 15 min (cached) |
| 4. Bridges | Harmonic Jaccard/Enrichment bridging between pathway cores | `thread_graph.py` | ~1 min |
| 5. Protein Complexes | CORUM: complete subunit co-detection = machine assembled | `thread_graph.py` | ~22s |
| 6. TF Activity | DoRothEA: target gene fraction = TF active in nucleus | `thread_graph.py` | ~14s |

Gene lists and databases are domain-knowledge inputs from published sources (HGNC, MitoCarta 3.0, MSigDB, CORUM, DoRothEA), not fitted to the data. See WORLDLINE.md Part 34 for the honest parameter audit.
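
As a rough illustration of what "set-theoretic measures on binary detection patterns" can look like, here is a minimal NumPy sketch of layers 1, 2, 3, 5, and 6 on a toy matrix. It is not the code in `gem_analysis.py`, `dimensional_analysis.py`, or `thread_graph.py`: the function names, the toy counts, and the gene set are all hypothetical, and layer 4 (bridges) is left out because its formula is not given here.

```python
import numpy as np

def binarize(counts):
    """GEM x gene boolean matrix: True where at least one UMI was detected."""
    return np.asarray(counts) > 0

def all_detected(detected, gene_idx):
    """Layers 1 and 5: a gene set is ON in a GEM only if every member is detected."""
    return detected[:, gene_idx].all(axis=1)

def fraction_detected(detected, gene_idx):
    """Layer 6: fraction of a gene list (e.g. TF targets) detected in each GEM."""
    return detected[:, gene_idx].mean(axis=1)

def umi_sum(counts, gene_idx):
    """Layer 2: raw UMI sum over one axis's gene list, per GEM."""
    return np.asarray(counts)[:, gene_idx].sum(axis=1)

def jaccard(detected):
    """Layer 3: gene x gene Jaccard index over the sets of GEMs detecting each gene."""
    b = detected.astype(np.int64)
    inter = b.T @ b                          # |A_i and A_j|: GEMs detecting both genes
    n = np.diag(inter)                       # |A_i|: GEMs detecting gene i
    union = n[:, None] + n[None, :] - inter  # |A_i or A_j|
    out = np.zeros(inter.shape, dtype=float)
    # Genes never detected have an empty union; their entries stay 0.
    return np.divide(inter, union, out=out, where=union > 0)

# Toy data, made up for illustration: 3 GEMs x 4 genes.
counts = np.array([[2, 1, 0, 3],
                   [0, 1, 1, 0],
                   [1, 1, 1, 1]])
detected = binarize(counts)
gene_set = [0, 1, 3]  # hypothetical gene set, given as column indices

print(all_detected(detected, gene_set))
print(fraction_detected(detected, gene_set))
print(umi_sum(counts, gene_set))
print(jaccard(detected).round(2))
```

None of these functions takes a threshold or a fitted value; the only inputs are the count matrix and a gene list. At the scale listed under File Structure (905k GEMs), a dense matrix like the one above would not fit in memory, so a real implementation would presumably use sparse matrices. For scale, a dense 21,000 x 21,000 float32 matrix is about 1.76 GB, close to the 1.7 GB listed for `cooccurrence_cache.npz`, though the cache's actual dtype and layout are not stated here.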

---

## Key Findings

**Simplex theory.** Aging is simplex volume collapse: three independent operators (ribosome, mitochondria, nucleus) span a 2-simplex in coupling space. Senescence entangles the operators (det(K) drops 81%), collapsing the simplex. Three operators are insufficient for full error correction.

**DOF entanglement.** Biological dimensions entangle in senescent cells (mean coupling rises from 0.26 to 0.61). DOF coupling is the first event in senescence: in the WI-38 time course, coupling jumps at day 0 while epigenetic machinery is still intact. Collapse follows 24 hours later.

**TE silencing collapse.** UHRF1 drops 71% in senescent ECs. DNMT1, TRIM28, SETDB1 follow. Alu insertions inside the UHRF1 gene body on chr19: the virus is inside the antivirus. See [BloodyEchoes](../12_Project_BloodyEchoes/).

**Checkpoint lncRNAs.** Unannotated lncRNAs (ENSG00000289474, ENSG00000289901) bridge organelle operator programs. These are the immune system's readout of the error-correction code.

**Cross-validation.** 15+ datasets, 4.8M cells, 5 species. 3 high-confidence positives (primary ECs, donor-derived ECs, WI-38 time course), 4 mechanistically predicted negatives (PBMCs, islets), coupling hierarchy replicated in Tabula Sapiens aorta (43k) and heart atlas (486k). Zero public datasets match all 7 criteria of the primary dataset.

**Viral lineage of cancer.** Retrotransposon host-defense analysis across P/S/Cancer shows that cGAS-STING surges in senescence (IFIT1 +843%) and is silenced in cancer (STING drops to zero). Four viral lineages of cancer are classified. The GBM viral lens (338k cells, 110 patients) shows the deepest viral evasion: STING 98% silenced, EGFR (v-erbB) 13.5× amplified, TERT reactivated only in malignant cells. See [BT-55/56/57 in WORLDLINE.md](WORLDLINE.md).

---

## Quick Start

```bash
cd methods
python run_pipeline.py ../data/                                  # Primary dataset
python run_pipeline.py ../data/experiments/EXP04_donor_derived_ec/  # Any experiment
```

Requirements: Python 3.12, scanpy, numpy, scipy, matplotlib. Optional: cupy + NVIDIA CUDA for GPU (19x faster). Full guide: [QUICK_START.md](QUICK_START.md).
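
Before the first run, it can help to confirm that the packages listed above are importable. The following is a small standard-library sketch; `check_requirements` is a hypothetical helper, not a script shipped in `methods/`.

```python
import importlib.util
import sys

REQUIRED = ("scanpy", "numpy", "scipy", "matplotlib")
OPTIONAL = ("cupy",)  # only needed for the GPU path

def check_requirements(required=REQUIRED, optional=OPTIONAL):
    """Return (missing_required, missing_optional) as lists of package names."""
    missing_req = [m for m in required if importlib.util.find_spec(m) is None]
    missing_opt = [m for m in optional if importlib.util.find_spec(m) is None]
    return missing_req, missing_opt

if __name__ == "__main__":
    missing_req, missing_opt = check_requirements()
    print("Python", ".".join(map(str, sys.version_info[:3])))
    print("Missing required:", ", ".join(missing_req) or "none")
    print("Missing optional:", ", ".join(missing_opt) or "none")
```

This only checks that each package can be found; it does not verify versions or that CUDA is usable from cupy.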

---

## File Structure

```
README.md                    # This file
BOOK.md                      # Bibliography + senescence atlas + §5 Combined Codex (datasets, APIs, tools, refs)
QUICK_START.md               # 3-command onboarding
WORLDLINE.md                 # Complete findings log (57 parts)
BOUNTY_BOARD.md              # 48 solved, ~46 open tasks
ENGINE_ROOM.md               # GCP VM quick reference

methods/                     # Active analysis pipeline (24 scripts)
  run_pipeline.py            #   Unified entry point
  thread_graph.py            #   Core: co-occurrence, Jaccard, bridges, CORUM, DoRothEA
  dimensional_analysis.py    #   10-axis biology reference frame
  gem_analysis.py            #   GEM program activation

data/                        # All data (~25 GB local, ~400 GB on GCP)
  gem_analysis.h5ad          #   Primary: 905k GEMs x 38k genes (4.7 GB)
  cooccurrence_cache.npz     #   21k x 21k Jaccard (1.7 GB)
  pipeline_env.npz           #   Full cached environment (2.9 GB)
  experiments/               #   14 experiment folders with LIVING_DOCUMENT.md
  pathways/                  #   GMT gene set files (MSigDB, CORUM, DoRothEA)
  cellxgene/                 #   Public cross-validation datasets
  (see BOOK.md §5 Dataset registry)  #   Full tables + GCP + Q-queue + RNA/CL APIs

orthodox/                    # Standard pipeline as methods control
  orthodox_canon.py          #   12-stage fused pipeline (33 parameters documented)
  orthodox_cultivator.py     #   Docker tribulation cultivator (12 layers, 35 params)
  Dockerfile                 #   Container with scVI, CellTypist, DESeq2, LIANA, etc.
  figures/                   #   8 publication-ready panels

V3_Daemon_Analysis_3-22-26_15-50/   # Daemon shadow engine + viral analyses
  direct_shadow.py           #   51k-node graph builder (5 seconds)
  retro_factions.py          #   Retrotransposon war (BT-55)
  viral_lineage.py           #   Four viral lineages of cancer (BT-56)
  gbm_lens.py                #   GBM viral lens, 338k cells (BT-57)
  README.md                  #   Full script index + run commands
```

---

## Lab Protocol: HALO + WING + DAEMON

Every entity in this project (an experiment, a theory, a dataset, a tool) is a **living document** with three layers:

### [HALO](HALO_PROTOCOL.md) — Memory
**H**onest, **A**ccessible, **L**iving, **O**urs. The entity's life story as an evolving paper: Abstract → Methods → Data → Discussion → Conclusion → Next Steps → Containment. 11 Living Documents exist (EXP01-10, EXP15). See [vault README § Living document template (HALO)](../README.md#living-document-template-halo).

### [WING](WING_PROTOCOL.md) — Readiness (gates)
**W**orking, **I**mportant, **N**ovel, **G**ood. Four production gates with pass/fail criteria. Current assessments: BT-55 **READY**, BT-56 **CONDITIONAL**, BT-57 **READY**, BloodyEchoes **CONDITIONAL**.

### [WING pipeline runner](WING_PIPELINE_RUNNER.md) — Windows batch (P1–S3 / `orthodox_canon`)
The **`WING_deliverable/`** one-click pipeline (deps, BAM→TRX, Scanpy canon) is a separate tool, not the WING Protocol above. For config, `SYMPHONY_*` environment variables, and `coculture_paths.py`, see that doc.

### [DAEMON](DAEMON_PROTOCOL.md) — Awareness
The mind connecting HALO and WING. Three operations: **breathe()** (read history), **assess()** (check readiness), **voice()** (produce output). These map directly to Daemon v3's `breathe()` / `illuminate()` / `express()`. Daemons coordinate via [u-os.dev](../08_Project_Astronomicon/u_os_dev/) mailboxes on the Astronomicon.
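
To make the three operations concrete, here is a minimal sketch of how one entity's daemon could tie its HALO history to its WING gates. Only the operation names and the four gate names come from this README; the `Daemon` class, its fields, and the output format are hypothetical and are not the Daemon v3 API.

```python
from dataclasses import dataclass, field

WING_GATES = ("Working", "Important", "Novel", "Good")

@dataclass
class Daemon:
    """Hypothetical per-entity daemon; an illustration, not Daemon v3."""
    name: str
    history: list = field(default_factory=list)  # HALO: the entity's log entries
    gates: dict = field(default_factory=dict)    # WING: gate name -> passed?

    def breathe(self):
        """Read history: return a copy of the HALO log."""
        return list(self.history)

    def assess(self):
        """Check readiness: report every WING gate; unrecorded gates count as not passed."""
        return {gate: bool(self.gates.get(gate, False)) for gate in WING_GATES}

    def voice(self):
        """Produce output: a one-line summary built from breathe() and assess()."""
        passed = [gate for gate, ok in self.assess().items() if ok]
        return (f"{self.name}: {len(self.breathe())} history entries, "
                f"gates passed: {', '.join(passed) or 'none'}")

# Made-up entity for illustration only.
d = Daemon("EXAMPLE", history=["created", "first run"],
           gates={"Working": True, "Novel": True})
print(d.voice())
```

The sketch deliberately stops short of turning gate results into a READY or CONDITIONAL label, since that rule lives in the WING Protocol document and is not restated here.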

---

## For Collaborators

1. Read [QUICK_START.md](QUICK_START.md) — run the pipeline in 3 commands
2. Read [`BOOK.md` §5](BOOK.md#5-combined-codex-registry-fused--former-codexmd) — find any file, experiment, or method
3. Read [WORLDLINE.md](WORLDLINE.md) — the full narrative
4. Pick a task from [BOUNTY_BOARD.md](BOUNTY_BOARD.md) — claim before working
