Methodology

How Analog Quest discovers cross-domain structural isomorphisms

What is Analog Quest?

Analog Quest is an AI-assisted research project that maps structural isomorphisms across academic domains.

A structural isomorphism occurs when two ideas from different fields describe the same underlying mechanism, even though they use completely different terminology.

Example: The "tragedy of the commons" in economics and "resource competition" in ecology are structurally isomorphic: in both, individual agents optimize their local resource use, and the resulting competitive extraction degrades the shared resource despite long-term harm to all participants.

The Process

1. Paper Selection

Started with 2,021 academic papers from arXiv across 25+ domains (physics, computer science, biology, economics, mathematics, etc.).

Used strategic keyword-based selection to identify mechanism-rich papers (papers likely to describe causal patterns rather than just methods or results).
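As a rough illustration of keyword-based selection, the sketch below scores each abstract by how many distinct cue words it contains and keeps the papers above a cutoff. The cue list, the `min_score` cutoff, and the function names are all hypothetical; the project's actual keywords and selection rules are not specified here.

```python
import re

# Hypothetical cue words; the project's real keyword list is not given in this document.
MECHANISM_CUES = ("feedback", "threshold", "competition", "cascade", "trade-off", "equilibrium")

def mechanism_score(abstract: str) -> int:
    """Count how many distinct cue words occur in the abstract (case-insensitive)."""
    text = abstract.lower()
    return sum(1 for cue in MECHANISM_CUES if re.search(r"\b" + re.escape(cue), text))

def select_papers(abstracts: dict[str, str], min_score: int = 2) -> list[str]:
    """Return ids of papers with at least `min_score` distinct cues, highest score first."""
    scored = {pid: mechanism_score(text) for pid, text in abstracts.items()}
    keep = [pid for pid, score in scored.items() if score >= min_score]
    return sorted(keep, key=lambda pid: (-scored[pid], pid))

# Toy abstracts, invented for illustration only.
papers = {
    "a": "Positive feedback between competitors drives a cascade past a critical threshold.",
    "b": "We present a faster solver and benchmark it on three datasets.",
    "c": "Competition for a shared resource leads to a stable equilibrium.",
}
selected = select_papers(papers)
```

In this toy run, the methods-only abstract "b" is filtered out, while "a" and "c" are kept because they describe causal patterns.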

2. Mechanism Extraction

Extracted 54 domain-neutral mechanisms from selected papers using LLM-guided analysis.

Hit rate: 50% on strategically selected papers (vs 22.5% on random papers).

Each mechanism is rewritten in domain-neutral language to strip away field-specific jargon and reveal the underlying structural pattern.

3. Semantic Matching

Generated 384-dimensional semantic embeddings for each mechanism using sentence-transformers (all-MiniLM-L6-v2).

Computed cosine similarity between all cross-domain pairs (excluding same-domain matches).

Identified 165 candidate pairs with similarity ≥0.35 (relaxed threshold to capture diverse-domain matches).
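The matching step can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the 3-dimensional toy vectors stand in for the 384-dimensional embeddings (which would come from the all-MiniLM-L6-v2 model), and the function names and the `(id, domain, vector)` tuple layout are assumptions.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cross_domain_candidates(mechanisms, threshold=0.35):
    """Return (id_a, id_b, similarity) for cross-domain pairs at or above the threshold.

    `mechanisms` is a list of (id, domain, vector) tuples (hypothetical layout).
    """
    pairs = []
    for (id_a, dom_a, vec_a), (id_b, dom_b, vec_b) in combinations(mechanisms, 2):
        if dom_a == dom_b:
            continue  # same-domain pairs are excluded
        sim = cosine(vec_a, vec_b)
        if sim >= threshold:
            pairs.append((id_a, id_b, round(sim, 3)))
    return sorted(pairs, key=lambda p: -p[2])  # highest similarity first

# Toy data: real inputs would be 384-dimensional embedding vectors.
toy = [
    ("m1", "econ", [1.0, 0.0, 0.0]),
    ("m2", "q-bio", [0.8, 0.6, 0.0]),
    ("m3", "econ", [0.9, 0.1, 0.0]),
    ("m4", "physics", [0.0, 0.0, 1.0]),
]
candidates = cross_domain_candidates(toy)
```

Here `m1` and `m3` are never compared because both are tagged `econ`, and every pair involving `m4` falls below the 0.35 threshold.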

4. Manual Curation

Manually reviewed all 165 candidates and rated each as excellent, good, weak, or false.

Selected 30 verified isomorphisms: 10 excellent + 20 good (by similarity score).

Wrote structural explanations for each match, describing exactly how the mechanisms are isomorphic.
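One possible reading of the selection rule is "keep every pair rated excellent, then fill the remaining slots with the highest-similarity pairs rated good". The sketch below implements that reading only; the interpretation, the record layout, and the `select_verified` name are assumptions, and the ratings themselves still come from manual review.

```python
def select_verified(rated, n_good=20):
    """Keep all 'excellent' pairs plus the `n_good` highest-similarity 'good' pairs.

    `rated` is a list of dicts with 'pair', 'rating', and 'similarity' keys
    (a hypothetical layout for the manually rated candidates).
    """
    excellent = [r for r in rated if r["rating"] == "excellent"]
    good = sorted(
        (r for r in rated if r["rating"] == "good"),
        key=lambda r: r["similarity"],
        reverse=True,
    )
    return excellent + good[:n_good]

# Toy ratings, invented for illustration only.
rated = [
    {"pair": ("m1", "m7"), "rating": "good", "similarity": 0.52},
    {"pair": ("m2", "m9"), "rating": "excellent", "similarity": 0.47},
    {"pair": ("m3", "m8"), "rating": "weak", "similarity": 0.61},
    {"pair": ("m4", "m6"), "rating": "good", "similarity": 0.66},
    {"pair": ("m5", "m10"), "rating": "false", "similarity": 0.58},
]
verified = select_verified(rated, n_good=1)
```

Note that the weak pair with similarity 0.61 is dropped even though it outscores the excellent pair at 0.47: the rating, not the score, decides inclusion.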

Quality Metrics

Overall Precision

24%

40 out of 165 candidates rated as good or excellent

Top-30 Precision

67%

20 out of 30 highest-similarity candidates are genuine
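Both precision figures are simple ratios of the reported counts. The sketch below reproduces the arithmetic and shows a generic precision-at-k helper; the function names and the toy ranking are illustrative assumptions, and "genuine" is taken to mean a rating of good or excellent.

```python
GENUINE = {"excellent", "good"}

def precision(ratings):
    """Fraction of ratings that count as genuine (good or excellent)."""
    if not ratings:
        return 0.0
    return sum(1 for r in ratings if r in GENUINE) / len(ratings)

def precision_at_k(ranked_ratings, k):
    """Precision over the first k ratings of a list sorted by descending similarity."""
    return precision(ranked_ratings[:k])

# Arithmetic from the counts reported above.
overall_pct = round(100 * 40 / 165)  # 40 of 165 candidates
top30_pct = round(100 * 20 / 30)     # 20 of the top 30 candidates

# Toy ranking, invented for illustration only.
ranked = ["good", "weak", "excellent", "false"]
p_at_2 = precision_at_k(ranked, 2)
```

The gap between the two figures is what makes similarity ranking useful as a triage step: reviewing candidates from the top of the list surfaces genuine matches much faster than reviewing them in arbitrary order.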

Similarity Range

0.44 - 0.74

Mean cosine similarity: 0.54

Top Domain Pairs

econ ↔ q-bio: 7 matches

physics ↔ q-bio: 5 matches

econ ↔ physics: 4 matches

Limitations

  • Small sample size: 54 mechanisms across 2,021 papers is a limited sample. Many papers contain mechanisms that were not extracted.
  • Manual curation required: Semantic embeddings generate candidates but cannot reliably distinguish genuine isomorphisms from superficial similarity. Human judgment is essential.
  • Domain diversity paradox: More diverse domain pairs (e.g., econ ↔ biology) often have lower similarity scores than same-domain matches, even when the structural match is excellent. As a result, the best discoveries may have only modest scores.
  • Extraction bias: Mechanisms are easier to extract from certain domains (ecology, economics) than others (theoretical physics, pure mathematics).
  • Selection criteria: Quality ratings reflect the author's judgment of structural similarity. Different researchers might rate matches differently.

Future Work

This is v1 with 30 verified isomorphisms. The goal is to grow this to 100-400 discoveries over the next 6 months through expansion cycles:

  • Extract mechanisms from under-represented domains (computer science, nonlinear dynamics)
  • Focus on high-precision domain pairs (cs ↔ physics: 100%, econ ↔ physics: 58%)
  • Target high-performing mechanism types (coevolution: 63%, strategic interactions: 56%)
  • Improve extraction efficiency with domain-specific prompts
  • Add external validation by domain experts

For detailed analysis and the expansion strategy, see GROWTH_STRATEGY.md in the GitHub repository.