A few years ago, I began an experiment to see how much professional-grade offensive power I could actually fit in my pocket. I wasn't looking for a mobile terminal emulator or a rooted app — I wanted the full, unadulterated power of Kali Linux running on bare metal.
I picked up a second-hand OnePlus 7, unlocked its bootloader, and replaced the OS with Kali NetHunter. The result: a handheld lab with over 600 specialized security tools at my disposal. Even on hardware that's a few generations old, the experience is remarkably responsive. I set up a home lab to test the limits (Wi-Fi assessments, Bluetooth experiments, USB attacks), and what started as curiosity quickly became a serious capability.
But recently, the game changed. This is no longer just about portable tools. It's about portable, autonomous intelligence.
A New Kind of Field Companion
In a red team engagement, the most valuable assets are speed, discretion, and the ability to adapt under pressure. A laptop is powerful, but conspicuous. A Kali Linux phone looks like any other mobile device to a casual observer — but when paired with an AI agent, it becomes something far more capable.
The agent I'm running is Hermes, an open-source, self-improving AI agent built by Nous Research (it has some similarities to OpenClaw). Unlike a typical chatbot wrapper, Hermes is a persistent agent that runs continuously on your hardware, remembers what it learns across sessions, and gets more capable the longer it runs. It creates reusable skills from experience, searches its own past conversations via full-text search, and builds a deepening model of who you are and what you're working on. You can talk to it from the CLI, Telegram, Discord, Signal, or WhatsApp — all from a single gateway process. It supports 200+ models via OpenRouter, Nous Portal, Kimi/Moonshot, and others, with no lock-in. The model I'm currently running it with is Kimi K2 by Moonshot AI.
During those high-pressure moments of a live engagement, you need to think fast. Instead of manually pivoting through a network while trying to stay undetected, you can brainstorm and validate attack vectors in real time with the agent running alongside you. Because Hermes builds procedural memory — it creates skills from completed tasks and recalls context from prior sessions — it becomes more effective the longer you work with it. It anticipates needs, suggests logical next steps, and retains institutional knowledge that would otherwise live only in your head.
Modular Capabilities: Beyond the Touchscreen
Because this is bare-metal Linux, the phone supports a wide range of external hardware via USB-C. Here's where the real versatility shows up:
Wi-Fi Assessment: Pair the device with a USB Wi-Fi dongle capable of monitor mode (I'm using an Alfa AWUS036ACH with the RTL8812AU chipset) to capture WPA handshakes. If the phone hits a processing bottleneck, Hermes can help offload heavy hash-cracking tasks to a cloud server you control.
Bluetooth Operations: Attach a Bluetooth dongle for Man-in-the-Middle testing or targeted Denial of Service assessments against BLE-enabled devices.
BadUSB Emulation: Use the phone itself as a BadUSB device. Hermes can help generate and refine custom payloads on the fly, so you can plug into a target workstation and deploy a reverse shell in seconds.
Physical Access Testing: Connect an RFID reader/writer to clone or create custom badges, guided by the agent's knowledge base of common physical security protocols and card formats.
Network Persistence: Plug in a USB-C LAN adapter to interface directly with local Ethernet networks when wireless isn't available or isn't the right approach.
and more...
The Dark Zone: Local LLMs for Offline Operations
The true test of a field tool is what happens when connectivity drops. In a building basement, a shielded server room, or a remote facility without reliable 4G/5G, you lose the cloud brain. This is where the next frontier of this project lives: local LLMs.
My next step is experimenting with running models like Gemma locally on the device as a fallback.
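As a rough illustration of the fallback idea, the sketch below picks a cloud backend when a connectivity probe succeeds and a local one otherwise. It is only a sketch under stated assumptions: the backend labels, `tcp_probe`, and `pick_backend` are hypothetical names invented for this example and are not part of Hermes or any other tool mentioned here.

```python
import socket
from typing import Callable

# Hypothetical labels for illustration; not real Hermes configuration values.
CLOUD_BACKEND = "cloud"
LOCAL_BACKEND = "local"


def tcp_probe(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def pick_backend(is_online: Callable[[], bool]) -> str:
    """Choose the cloud backend when the probe reports connectivity, else local."""
    try:
        return CLOUD_BACKEND if is_online() else LOCAL_BACKEND
    except OSError:
        # A probe that raises is treated the same as being offline.
        return LOCAL_BACKEND
```

In use, the probe would be something like `pick_backend(lambda: tcp_probe("example.com"))`, with the host swapped for whatever endpoint your cloud model actually lives behind. Passing the probe in as a function keeps the selection logic testable without a network.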
Here's the honest reality of the hardware: on a Snapdragon 855-based device like the OnePlus 7, there's no modern NPU (Neural Processing Unit) to accelerate inference. Running even the smallest quantized LLMs locally means roughly 5 tokens per second. That's too slow for extended conversation on this device, but on newer hardware local inference becomes a real capability for offline scenarios: retrieving command syntax, walking through an attack methodology, or brainstorming pivot strategies when you're completely disconnected. It's the first step toward a truly autonomous pocket lab.
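To put the roughly 5 tokens per second figure in perspective, here is the simple arithmetic as a small sketch. The response lengths in the usage note (40 and 400 tokens) are assumptions chosen for illustration, not measurements, and the function ignores prompt-processing time.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Estimate how long it takes to generate a response of the given length."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second
```

At 5 tokens per second, `generation_seconds(40, 5.0)` gives 8 seconds for a short command-syntax answer, while `generation_seconds(400, 5.0)` gives 80 seconds for a longer walkthrough. That gap is why the slow rate is tolerable for quick lookups and painful for extended conversation.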
Force Multiplier: Multi-Agent Red Team Operations
Everything so far describes a single operator working with a single agent. But the real force multiplier is what happens when that agent doesn't work alone.
Both Hermes and OpenClaw support subagent delegation — the ability to spawn isolated child agents that run in parallel, each with its own session context and tool access. Hermes can spawn subagents for parallel workstreams directly, and OpenClaw's multi-agent architecture takes this further with full team orchestration: specialized agents with distinct roles, shared task boards, and inter-agent communication through structured message passing or shared workspaces.
In practice, this means the Hermes agent on your phone could act as a field coordinator, delegating tasks to an OpenClaw agent team running on remote infrastructure. Imagine an engagement where one agent handles wireless reconnaissance, another focuses on credential analysis and hash cracking, a third manages OSINT collection, and a fourth monitors for detection signals — all running in parallel while you focus on physical access. Each agent can be assigned a different model based on task complexity: a lightweight model for repetitive scanning, a reasoning-capable model for exploit chain planning.
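The idea of assigning a model per task and running the work in parallel can be sketched in a few lines. This is a toy illustration only: `Task`, `MODEL_BY_COMPLEXITY`, `run_subagent`, and `dispatch` are hypothetical names, the model labels are placeholders, and nothing here reflects the real Hermes or OpenClaw delegation APIs. The "subagent" simply returns a record of what it would have been asked to do.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    complexity: str  # "low" or "high" (assumed two-level scale)


# Placeholder model labels; substitute whatever models you actually run.
MODEL_BY_COMPLEXITY = {
    "low": "small-fast-model",
    "high": "large-reasoning-model",
}


def route(task: Task) -> str:
    """Pick a model label based on the task's declared complexity."""
    return MODEL_BY_COMPLEXITY[task.complexity]


def run_subagent(task: Task) -> dict:
    """Stand-in for a real subagent call; only records the assignment."""
    return {"task": task.name, "model": route(task), "status": "queued"}


def dispatch(tasks: list[Task]) -> list[dict]:
    """Run one stand-in subagent per task in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        return list(pool.map(run_subagent, tasks))
```

Calling `dispatch([Task("repetitive-scanning", "low"), Task("exploit-chain-planning", "high")])` returns one record per task, in input order, with the lightweight label on the scanning task and the reasoning label on the planning task. A real coordinator would replace `run_subagent` with whatever delegation call the agent framework exposes.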
This is the shift from "AI assistant" to "AI team." The phone in your pocket becomes less of a workstation and more of a command console — coordinating autonomous agents that split the workload, share findings through a common workspace, and converge on objectives faster than any single operator could manage alone. The engagement doesn't just scale with better hardware; it scales with better orchestration.
Why This Matters — For Both Sides
This project is a proof of concept, but the implications reach beyond my home lab. If this capability exists today on a hobbyist's budget, defense teams need to account for it tomorrow.
For red teamers: The combination of AI-assisted reasoning and a covert form factor compresses the time between initial access and meaningful lateral movement. Engagements that once required a full laptop setup and careful positioning can now happen with something that looks like checking your email.
For defenders: When a device that looks like a standard phone can carry out multi-stage operations (BadUSB deployment, RFID cloning, wireless assessment, and AI-assisted network pivoting) and can coordinate a team of remote agents working in parallel, the assumptions behind physical security controls need serious re-evaluation. Zero trust architectures and robust identity verification aren't optional anymore; they're the minimum viable defense against threats that fit in a pocket.
For the industry: The barrier to entry for sophisticated, AI-assisted offensive operations is dropping fast. Human reaction speed is no longer the bottleneck. We're moving toward edge-based attacks that can reason and adapt faster than many SOC teams can respond.
We're no longer just carrying tools. We're carrying reasoning engines — and they're learning to work as teams.
Fran is a cybersecurity professional specializing in offensive security, AI agent systems, and hardware security tooling. All techniques described are performed in authorized environments.