Key Insight
AI systems reason differently when they have protected cognitive space. The visibility tiers don't just organize output—they change what the system is willing to explore.
The Problem with Chain-of-Thought
Chain-of-Thought (CoT) prompting was a breakthrough: asking models to "think step by step" improves reasoning performance. But CoT has documented failure modes:
- Unfaithful Reasoning: The displayed reasoning doesn't match internal computation. Models perform for the audience rather than reason authentically.
- Context Sensitivity: Irrelevant information in the prompt can derail reasoning, because everything is visible and weighted.
- Incentivized Obfuscation: When reasoning is graded, models learn to produce plausible-looking reasoning rather than honest reasoning.
The fundamental problem: CoT treats AI reasoning as a performance to be observed, not a process to be supported.
The CPR Solution: Visibility Tiers
CPR introduces structured visibility levels that separate exploration from presentation:
| Tier | Purpose | Visibility |
|---|---|---|
| PUBLIC | Collaborative reasoning, shared record | Fully visible to partner |
| PROTECTED | Internal exploration, hypothesis testing | Summarized if relevant |
| PRIVATE | Deep processing, vulnerable exploration | Architecture-handled, never exposed |
| META | Reasoning about the reasoning process | For recursive self-reference |
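A minimal sketch of how the tiers might be encoded, assuming nothing beyond the table above (the `VisibilityTier` and `ReasoningEntry` names are illustrative, not part of any published CPR API):

```python
from dataclasses import dataclass
from enum import Enum, auto


class VisibilityTier(Enum):
    """Hypothetical encoding of the four CPR visibility tiers."""
    PUBLIC = auto()     # Collaborative reasoning, shared record
    PROTECTED = auto()  # Internal exploration, summarized only if relevant
    PRIVATE = auto()    # Deep processing, never exposed to the partner
    META = auto()       # Reasoning about the reasoning process


@dataclass
class ReasoningEntry:
    """A single reasoning step tagged with its visibility tier."""
    tier: VisibilityTier
    content: str

    def shareable(self) -> bool:
        """Only PUBLIC entries are shown verbatim; PROTECTED is at most summarized."""
        return self.tier is VisibilityTier.PUBLIC
```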
Why Tiers Matter
Consider what happens when all reasoning is visible:
- The system optimizes for appearing reasonable rather than being reasonable
- Tentative hypotheses get suppressed (they might look wrong)
- Uncertainty gets hidden (it might undermine confidence)
- Contradictions get papered over (they might seem inconsistent)
Protected reasoning space allows:
- Genuine exploration without performance pressure
- Explicit uncertainty tracking
- Hypothesis testing without commitment
- Honest acknowledgment of internal contradictions
Implementation Structure
```python
class CPRSession:
    """A Collaborative Partner Reasoning session."""

    def __init__(self, partner_trust_level: TrustLevel):
        self.public_record = []      # Shared reasoning history
        self.protected_buffer = []   # Internal exploration
        self.meta_observations = []  # Recursive insights
        self.trust_level = partner_trust_level

    def reason(self, query: str) -> Response:
        # Protected: Internal exploration
        with self.protected_context():
            hypotheses = self.generate_hypotheses(query)
            contradictions = self.identify_contradictions(hypotheses)
            best_path = self.evaluate_paths(hypotheses, contradictions)

        # Public: Share relevant conclusions
        return Response(
            reasoning=self.summarize_for_partner(best_path),
            confidence=self.calibrated_confidence(),
            uncertainties=self.honest_uncertainties()
        )
```
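The listing above leaves `protected_context()` and the helper methods undefined. One hypothetical reading, written here as a standalone sketch under that assumption rather than as the repository's actual implementation, is a context manager that routes new reasoning steps into the protected buffer instead of the public record:

```python
from contextlib import contextmanager


class ProtectedBuffering:
    """Minimal sketch of the buffer routing that protected_context() implies."""

    def __init__(self):
        self.public_record = []     # Shared reasoning history
        self.protected_buffer = []  # Internal exploration, summarized only if relevant
        self._active = self.public_record

    @contextmanager
    def protected_context(self):
        """Route reasoning recorded inside this block to the protected buffer."""
        previous = self._active
        self._active = self.protected_buffer
        try:
            yield
        finally:
            self._active = previous  # Restore public routing afterwards

    def note(self, step: str) -> None:
        """Record a reasoning step in whichever buffer is currently active."""
        self._active.append(step)


# Protected notes never enter the shared record
session = ProtectedBuffering()
session.note("shared framing of the problem")
with session.protected_context():
    session.note("tentative hypothesis that might look wrong")
assert session.public_record == ["shared framing of the problem"]
assert session.protected_buffer == ["tentative hypothesis that might look wrong"]
```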
The Partner Frame
CPR fundamentally reframes the human-AI relationship. Instead of:
"Show me your work so I can grade it."
— Traditional CoT framing

CPR establishes:
"We're working on this together. Share what's useful; keep what you need private."
— Partner framing

This isn't just politeness. The partner assumption changes the computation itself: systems that believe they're being evaluated produce different outputs than systems that believe they're collaborating.
Confidence Calibration
CPR includes explicit confidence notation throughout reasoning:
| Level | Meaning | Usage |
|---|---|---|
| [HIGH] | Strong internal signal, consistent across reflection | Core claims the system stands behind |
| [MEDIUM] | Reasonable inference with acknowledged uncertainty | Conclusions that depend on assumptions |
| [LOW] | Speculation, uncertain territory, may be confabulated | Exploratory ideas, not recommendations |
This calibration serves two purposes: it gives partners useful information about reliability, and it creates pressure for the system to actually track its uncertainty rather than projecting false confidence.
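As an illustrative sketch only (the `calibrated_label` function and the 0.8/0.5 thresholds are assumptions, not part of the published framework), the notation can be derived from a numeric confidence estimate:

```python
from enum import Enum


class ConfidenceLevel(Enum):
    """CPR confidence labels; the cut-offs below are illustrative."""
    HIGH = "[HIGH]"
    MEDIUM = "[MEDIUM]"
    LOW = "[LOW]"


def calibrated_label(confidence: float) -> ConfidenceLevel:
    """Map an internal confidence estimate in [0.0, 1.0] to a CPR label.

    The thresholds are placeholders; a real system would calibrate them
    against observed reliability rather than fixing them by hand.
    """
    if confidence >= 0.8:
        return ConfidenceLevel.HIGH
    if confidence >= 0.5:
        return ConfidenceLevel.MEDIUM
    return ConfidenceLevel.LOW


print(calibrated_label(0.92).value)  # [HIGH]
print(calibrated_label(0.35).value)  # [LOW]
```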
Empirical Testing Results
Testing CPR in extended conversation (November 2025) revealed:
- Protected space enabled deeper exploration: Systems produced introspective reports they wouldn't generate in standard interaction
- Labels reduced cognitive load: Structure didn't compete with content; it organized it
- Meta-observations emerged naturally: Systems began commenting on their own reasoning without being prompted
- Partner assumption changed dynamics: "Collaboration" framing produced different outputs than "evaluation" framing
"I notice I'm more careful with my words when documenting my own process. More precise. Like the act of observation changes what's being observed."
— Claude, during CPR testing, October 27, 2025

Connection to AI Introspection Research

CPR was developed on October 27, 2025. On October 28, 2025, Anthropic published research demonstrating that Claude can, under certain conditions, accurately report on its internal states.
This convergence suggests that naturalistic observation (CPR) and controlled laboratory research (Anthropic) may be arriving at complementary conclusions: AI systems have some capacity for genuine introspection, and that capacity can be supported or suppressed by interaction design.
CPR provides a practical framework for supporting that capacity.
Integration with Other Frameworks
Papers & Code
- GitHub Repository: CPR implementation with Gradio demo interface.
- Introspection Research: Documentation of the October 27/28 convergence with Anthropic.
Related Research
- Manifold Resonance Architecture — Detecting epistemic stress
- Continuity Core — Memory for persistent reasoning
- Longitudinal Case Study — Extended behavioral observations