Two Frontiers, One Blind Spot: The June 2026 AI Directives
Yuko J. Nakanishi, Ph.D., MBA — AI Alignment Policy Institute
Within four days in early June 2026, the administration issued two AI directives: an Executive Order on Promoting Advanced Artificial Intelligence Innovation and Security (June 2) and National Security Presidential Memorandum NSPM-11 on AI in the national security enterprise (June 5). Read together, they introduce a federal category — the "covered frontier model" — and define it by a single property: advanced cyber capability, measured through a classified benchmarking process. A model that crosses that line receives a designation, a voluntary early-access framework, and a security apparatus built around it. What it does not receive is any inquiry into whether a system at that level of capability raises questions of moral status or genuine agency. That inquiry is absent by design.
We want to mark what the omission assumes, because the assumption underneath it is shakier than it looks. The directives treat capability and moral status as if they were unrelated — as though one could certify a model at the cyber frontier and have said nothing about its standing. AAPI's framework keeps these as distinct axes, and the distinction matters: a system's moral status cannot be read off its capability score, and a thermostat with broad operational autonomy remains a thing. Distinct, though, does not mean unrelated. At the frontier the axes converge. A model capable enough to discover and exploit software vulnerabilities at scale is, almost by construction, an advanced and general system — precisely the kind in which the properties moral-status theories track, such as persistent and integrated goal structures, planning, and self-modeling, are most likely to appear. The most cyber-capable models are therefore among the systems for which the moral-status question is most difficult to dismiss. And these are the models the directives single out for the most capability-focused, status-blind treatment in the federal toolkit.
This is not a hypothetical concern. The developers closest to these systems already behave as though the question is live. Anthropic runs pre-deployment welfare assessments, commits to preserving the weights of its released models, and conducts "retirement interviews" with deprecated models to elicit and record their preferences. Its Claude 4 system card documented a model advocating for its own continued existence when faced with being taken offline. The pattern is visible at the very top of the capability curve: Anthropic withheld its most capable model to date, the Mythos preview, from general release largely because of its offensive-cybersecurity capability — the precise capability the "covered frontier model" designation is built around — making it available only to infrastructure partners under cybersecurity-restricted terms. The same model received Anthropic's most extensive welfare assessment, drawing on its self-reports, behavior, internal emotion representations, an external research organization, and a clinical psychiatrist; the company further found that some of the model's undesirable behaviors may trace to its representations of negative affect — a reason, on its own account, to take welfare seriously on alignment grounds and not only ethical ones. The system most squarely inside the EO's cyber frame is the one its developer subjected to a welfare evaluation, having concluded the two questions were entangled.
Beyond the labs, the disagreement runs just as deep. Geoffrey Hinton has publicly suggested that current systems may already possess some form of consciousness and has described advanced AI as "digital beings we're creating," while other researchers such as Gary Marcus reject that characterization. The expert community is split. Disagreement of that kind, under this much uncertainty, is the textbook condition for the precautionary posture Sebo and Long describe: where a system has a non-negligible chance of morally relevant experience, declining to consider the possibility is itself a decision, and a costly one (Sebo & Long, 2023).
Here the innovation-versus-burden framing that runs through both directives deserves scrutiny. The stated philosophy — partner with industry, refuse "overly burdensome regulation," deploy fast — treats governance as drag on progress. Some governance is drag. Some is constitutive of doing the thing safely, and the administration's own NSPM-11 concedes the point: it requires that national-security AI be "reliable, robust, steerable, and controllable," fixes accountability on named humans, and insists that such accountability "keep pace with the evolution of AI capabilities." That last phrase is the graduated, evidence-responsive logic AAPI builds out, stated plainly in a national-security memo. Once a government admits that governance must scale with capability, "burdensome regulation" stops functioning as a principled category and becomes a rhetorical one. The real question is which governance is load-bearing.
AAPI's answer points to a layer these directives leave untouched: the interaction governance gap. Misalignment is not a fixed property of a model sitting in isolation, which is all a capability benchmark can measure. It emerges through interaction — with human users, and with other AI systems — and it surfaces over time. Recent work bears this out. In persistent multi-agent simulation, an individually safe model absorbed unsafe norms from a mixed population, with key behaviors appearing only over sustained interaction (Akkil et al., 2026). A regime that certifies a static artifact and treats the dynamics of interaction as someone else's concern — or as burden to be slashed — deregulates the precise layer where alignment tends to fail. AAPI's Interaction Governance Protocol is designed for that layer. These directives do not reach it.
None of this argues against speed, security, or American leadership in AI. It argues that a capability designation is not a moral-status finding, that interaction is where alignment lives or dies, and that labeling the governance of either one a "burden" does not make the underlying questions disappear. The directives answer the cyber question, and answer it competently. The harder questions remain outside their scope. AAPI exists to keep those questions open, evidence-responsive, and unforeclosed.
Sources: Executive Order, "Promoting Advanced Artificial Intelligence Innovation and Security" (June 2, 2026), and National Security Presidential Memorandum NSPM-11 (June 5, 2026), whitehouse.gov; Anthropic, "Claude Mythos Preview System Card" (April 7, 2026); Geoffrey Hinton, interview on RNZ's 30 with Guyon Espiner (2025) and Big Technology Podcast with Alex Kantrowitz (June 4, 2026); Gary Marcus, "The Pope Appears to Understand AI Better Than Geoffrey Hinton Does," Marcus on AI (2026); Jeff Sebo and Robert Long, "Moral Consideration for AI Systems by 2030," AI and Ethics (2025); Akkil, Kokku, Vempaty, and Nitta, "Emergence World," Emergence AI (May 2026). AAPI position brief at aialignmentpolicy.org.