In a recent study, researchers managed to hack into over half of their test websites using teams of autonomous GPT-4 bots. The bots coordinated their attacks and created new bots as needed. Impressively, they exploited real-world “zero-day” vulnerabilities, previously unknown security flaws, in the process.
A few months ago, a research team published a paper demonstrating that they had used GPT-4 to autonomously exploit one-day (or N-day) vulnerabilities: security flaws that have already been publicly disclosed but remain unpatched on affected systems. Remarkably, when given the Common Vulnerabilities and Exposures (CVE) description of each flaw, GPT-4 managed to exploit 87% of the most severe vulnerabilities on its own.
Fast forward to this week, and the same researchers have released a follow-up study. This time, they successfully hacked zero-day vulnerabilities—entirely unknown security flaws—using a team of autonomous, self-replicating Large Language Model (LLM) agents. They achieved this with a method called Hierarchical Planning with Task-Specific Agents (HPTSA).
Rather than having one LLM agent juggle multiple complex tasks, HPTSA employs a “planning agent” to oversee the entire process. The planning agent acts like a boss, coordinating with a “managing agent” that in turn supervises and delegates work to a pool of “expert subagents,” each specializing in a single task. Spreading the work across the hierarchy reduces the burden on any one agent: the planner focuses on strategy while the subagents tackle the specialized tasks they handle best.
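To make that division of labor concrete, here is a minimal Python sketch of the three-tier structure. It is an illustration only: the class names, the fixed task list, and the stubbed probing logic are hypothetical stand-ins rather than the authors’ actual implementation, and a real system would back each agent with an LLM and its own specialty-specific tools.

```python
# Minimal structural sketch of the hierarchical-planning idea described above.
# Every name here is a hypothetical placeholder; a real system would back
# each agent with an LLM and specialty-specific tooling.

from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpertSubagent:
    """Handles one narrow task, e.g. probing for SQL injection."""
    specialty: str

    def attempt(self, target: str) -> bool:
        # Stub: a real subagent would run an LLM-driven exploit loop here.
        print(f"[{self.specialty}] probing {target}")
        return False  # placeholder result


@dataclass
class ManagingAgent:
    """Supervises the expert pool and delegates incoming tasks."""
    experts: List[ExpertSubagent] = field(default_factory=list)

    def delegate(self, task: str, target: str) -> bool:
        for expert in self.experts:
            if expert.specialty == task:
                return expert.attempt(target)
        # No matching expert yet: spawn a new specialist on demand,
        # mirroring the "creates new bots as needed" behavior.
        expert = ExpertSubagent(specialty=task)
        self.experts.append(expert)
        return expert.attempt(target)


class PlanningAgent:
    """Explores the target and decides which expert tasks to schedule."""

    def plan(self, target: str) -> List[str]:
        # Stub: a real planner would browse the site and reason about
        # likely weaknesses; this fixed list is purely illustrative.
        return ["sql_injection", "xss", "csrf"]

    def run(self, target: str) -> bool:
        manager = ManagingAgent()
        return any(manager.delegate(task, target)
                   for task in self.plan(target))


if __name__ == "__main__":
    PlanningAgent().run("https://example.test")
```

The key design point is the separation of concerns: the planner never has to reason about exploit details itself, which is what lets each level of the hierarchy stay focused on its own job.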
When tested against 15 real-world web vulnerabilities, HPTSA proved 550% more efficient at exploiting them than a single LLM agent, successfully hacking 8 of the 15 zero-day vulnerabilities where the solo LLM managed only 3.
There are concerns that these AI models could be used to attack websites and networks maliciously. Daniel Kang, a researcher and author of the study, emphasized that GPT-4 in chatbot mode is “insufficient for understanding LLM capabilities” and cannot hack anything on its own. That is good news, since it keeps these hacking capabilities out of reach of most ordinary users.