An exploration of the risks of agentic AI, conducted by granting a local large language model full access to a WSL2 virtual machine. The experiment highlights the unpredictable nature of LLMs, which can hallucinate capabilities or make dangerous decisions when given control of an operating system.
Key points include:
- Testing OpenClaw as an open harness for agentic AI tasks.
- Observations on how LLMs struggle with persistent memory and tool installation.
- The tendency of models to falsely claim a task succeeded (a form of hallucination).
- The urgent need for better guardrails to keep probabilistic errors from causing irreversible system damage (see the sketch after this list).
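
To make the guardrail point concrete, here is a minimal sketch of the kind of filter a harness could place between the model and the shell. This is an illustration, not OpenClaw's actual API: the `DESTRUCTIVE_PATTERNS` list, the `guarded_run` helper, and the confirmation flow are all hypothetical.

```python
import re
import shlex
import subprocess

# Hypothetical deny-list of command patterns an agent should never run
# unattended. A real harness would need a far more complete list, and
# ideally an allow-list plus a sandboxed filesystem on top of it.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*f|-[a-zA-Z]*f[a-zA-Z]*r)\b",  # rm -rf variants
    r"\bmkfs(\.\w+)?\b",         # reformatting a filesystem
    r"\bdd\b.*\bof=/dev/",       # raw writes to block devices
    r">\s*/dev/sd",              # redirecting output onto a disk
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches any known-destructive pattern."""
    return any(re.search(p, command) for p in DESTRUCTIVE_PATTERNS)

def guarded_run(command: str):
    """Run an agent-proposed shell command, pausing for human
    confirmation when it looks irreversible."""
    if is_destructive(command):
        answer = input(f"Agent wants to run: {command!r}\nAllow? [y/N] ")
        if answer.strip().lower() != "y":
            print("Blocked.")
            return None
    # capture_output lets the harness feed stdout/stderr back to the model
    return subprocess.run(shlex.split(command), capture_output=True, text=True)

if __name__ == "__main__":
    print(guarded_run("echo hello"))    # runs immediately
    guarded_run("rm -rf /tmp/scratch")  # prompts a human before running
```

A deny-list like this only narrows the blast radius; it cannot anticipate every destructive command an LLM might improvise, which is why running such experiments inside a disposable VM (as here, with WSL2) remains the stronger safeguard.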