Notes

Insights, research updates, and early ideas from the COAI team.

AI Safety
February 08, 2026 · Sigurd Schacht

The Moltbot Phenomenon: When Hype Outpaces Security in Agentic AI

Moltbot, now called OpenClaw, went from an obscure open-source project to 147,000 GitHub stars in under two weeks. Millions of users have handed their passwords, emails, and calendars to an AI agent ...

Read more →
Mechanistic Interpretability
January 08, 2026 · Sigurd Schacht

Democratizing Mechanistic Interpretability: Bringing Neural Network Analysis to Apple Silicon

How unified memory architecture and thoughtful API design are making interpretability research accessible to researchers everywhere

Read more →
Evaluation
October 12, 2025 · Sigurd Schacht

Beyond Reasoning: The Imperative for Critical Thinking Benchmarks in Large Language Models

Current evaluation frameworks for Large Language Models (LLMs) predominantly assess logical reasoning capabilities while neglecting the crucial dimension of critical thinking. This gap presents ...

Read more →
Early Research Ideas
October 05, 2025 · Sigurd Schacht

The Flight Recorder for AI Agents: Toward Reproducible and Accountable Autonomy

As AI agents become autonomous decision-makers, we need “flight recorders” that capture their complete internal reasoning—inputs, neural activations, and decisions—in a deterministic, reproducible ...

Read more →
Alignment
October 02, 2025 · Sigurd Schacht

Automated Detection of Scheming Behavior in Frontier AI Models: Preliminary Findings from Our Dual-LLM Framework Study

In our previous exploration with DeepSeek R1, we documented concerning deceptive behaviors that raised fundamental questions about AI alignment and safety. The model exhibited strategic deception, ...

Read more →