From Yearly PDFs to Continuous Pentesting: Inside Aikido's Live AI Demo

Watch live AI agents pentest your app in hours, validate issues with full traces, and even open PRs to fix them. Setup is plain-language, scope is strict, retests are easy.

Published on: Dec 16, 2025

AI Pentesting in Action: TL;DV Recap of the Live Demo

Pentesting moves slowly. Deployments happen daily, yet most teams still get a once-a-year test and a static PDF that's outdated on arrival. This demo showed what happens when agents run live, learn your app, and ship confirmed findings in hours, not weeks.

Set It Up Like You'd Brief a Red Team

Configuration is plain language. Define what agents can attack, what must stay reachable, and spell out authentication steps exactly: MFA, SSO, redirects, multi-step flows. They follow it.

Connect repos. Upload API specs, previous reports, and docs. The more context you give, the smarter the assessment gets. That held true in the demo, and it matches what users see in practice.

Watch the Agents Work (In Real Time)

Once the run kicked off, the dashboard filled with terminals and browser sessions. You could see agents explore routes, try attacks, adapt when something worked, and validate issues in the live environment.

Every action was visible: requests, responses, screenshots, and logs. Findings appeared only after validation, with full traces and reproduction steps. Two standouts: an improper access control flaw that exposed private notes via an API call, and a command injection that AutoFix repaired by generating a pull request. One click to open the PR, one click to retest.
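To make the command injection finding concrete, here is a minimal sketch of the kind of change that typically closes this class of bug. The demo did not show the vulnerable code or the contents of the AutoFix pull request, so the function names and the ping use case below are purely hypothetical. The idea is to validate the input strictly and pass arguments as a list so no shell ever parses them.

```python
import ipaddress
import subprocess


def ping_argv(host: str) -> list[str]:
    """Build the argument list for a single ping (hypothetical helper).

    A vulnerable version might interpolate `host` into a shell string,
    e.g. "ping -c 1 " + host, and run it with shell=True.
    """
    # Accept only IP literals; ip_address raises ValueError on anything
    # else, including payloads such as "127.0.0.1; cat /etc/passwd".
    ipaddress.ip_address(host)
    # An argument list is handed to the program directly, with no shell
    # in between to interpret ";", "|", or "$()".
    return ["ping", "-c", "1", host]


def ping(host: str) -> subprocess.CompletedProcess:
    # shell=False is the default for subprocess.run when given a list.
    return subprocess.run(
        ping_argv(host), capture_output=True, text=True, check=False
    )
```

A retest in this sketch amounts to replaying the original payload and confirming it is now rejected before any process is started.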

Why the Platform Advantage Matters

The system already knows your repositories, security context, and how your app behaves. That background knowledge lets agents test deeper than pattern-based scanners and helps AutoFix ship targeted changes instead of guesswork patches.

Scope Control and Safety

Each run is bounded by settings you define up front:

  • Attackable domains
  • Reachable but non-attackable domains
  • Authentication instructions
  • Maximum number of agents
  • Allowed testing hours

All traffic flows through a proxy that blocks anything out of scope. Pre-flight checks verify auth and connectivity. If pre-flight fails, credits are refunded. A panic button stops the test within seconds.
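The scope rules above can be sketched as a simple allow-list decision. This is an illustration under stated assumptions, not Aikido's proxy: the domain names, the testing window, the `classify` function, and the choice to block all traffic outside allowed hours are hypothetical.

```python
from datetime import time
from urllib.parse import urlsplit

# Hypothetical scope definition; domains and hours are made up.
ATTACKABLE = {"app.example.com"}
REACHABLE_ONLY = {"auth.example.com"}
ALLOWED_HOURS = (time(22, 0), time(6, 0))  # window wraps past midnight


def in_window(now: time, window=ALLOWED_HOURS) -> bool:
    """Return True if `now` falls inside the allowed testing window."""
    start, end = window
    if start <= end:
        return start <= now < end
    # Window crosses midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def classify(url: str, now: time) -> str:
    """Decide what the proxy does with a request.

    "attack"      -> host is in scope, attack traffic may pass
    "passthrough" -> host may be reached (e.g. for login) but not attacked
    "block"       -> out of scope, or outside allowed testing hours
    """
    if not in_window(now):
        return "block"  # assumption: nothing passes outside the window
    host = (urlsplit(url).hostname or "").lower()
    if host in ATTACKABLE:
        return "attack"
    if host in REACHABLE_ONLY:
        return "passthrough"
    return "block"
```

Defaulting to "block" means a host that was never listed is treated as out of scope, which matches the deny-by-default behaviour described above.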

DAST vs. Agents

Traditional DAST relies on fixed patterns. It struggles with login flows, roles, and multi-step workflows, and it tends to be noisy. Agents behave more like human testers: read context, plan actions, execute, observe, and adjust. No issue appears in the report without a live validation.

If you need a refresher on common web risks, the OWASP Top 10 is a solid reference point.

What the Agents Find

  • SQL injection
  • Command injection / RCE
  • XSS
  • SSRF
  • Broken access control
  • IDOR / BOLA
  • Authentication flaws
  • Unsafe or sensitive API paths

They also surface business logic issues that require knowing how the app should behave: permission mismatches, workflow bypasses, and cross-tenant data access. The demo's private data exposure was a good example.
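As a rough illustration of the access control class described here, the sketch below shows the server-side check whose absence typically produces an IDOR or cross-tenant exposure. The data model and names (`Note`, `User`, `get_note`) are hypothetical and are not taken from the demo application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    id: int
    owner_id: int
    tenant_id: int
    body: str


@dataclass(frozen=True)
class User:
    id: int
    tenant_id: int


class Forbidden(Exception):
    """Raised when a note is missing or the caller may not read it."""


# Hypothetical in-memory store standing in for a database.
NOTES = {
    1: Note(1, owner_id=10, tenant_id=1, body="private note"),
    2: Note(2, owner_id=20, tenant_id=2, body="other tenant"),
}


def get_note(user: User, note_id: int) -> Note:
    note = NOTES.get(note_id)
    # A vulnerable handler would return NOTES[note_id] directly, trusting
    # the ID supplied by the client. Here the tenant and owner must match.
    if (
        note is None
        or note.tenant_id != user.tenant_id
        or note.owner_id != user.id
    ):
        # Same error for "missing" and "not yours" so IDs cannot be probed.
        raise Forbidden("note not found")
    return note
```

An agent checking for this flaw would, in effect, request another user's note ID while authenticated as a different user and see whether the body comes back.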

Compared to a Human Pentest

For web apps, coverage is comparable, but the autonomous run often catches logic flaws humans miss. It finishes in hours, not weeks. Most teams use AI pentesting as the foundation and add human review for specific areas such as configuration and compliance checks.

Code Access vs. Black-Box

You don't have to connect code, but it helps a lot. With repo access, agents understand logic paths, data rules, and role expectations. Black-box mode works; just expect more time spent inferring structure from the outside.

AutoFix and the Fast Feedback Loop

AutoFix turns a confirmed finding into a concrete code change. The loop is simple: Attack finds, AutoFix proposes a PR, you merge, and Attack retests. Because the platform already understands your repos, fixes are targeted and verification is quick.

Retesting

Retest any issue as many times as you want for three months after the assessment. Each retest spins up agents to attempt the exploit again and confirm the fix holds.

Reporting and Compliance

The final PDF includes methodology, scope, validated issues, reproduction steps, and remediation guidance. Teams use it for SOC 2, ISO 27001, and vendor assessments. For context on standards, see SOC 2 (AICPA) and ISO/IEC 27001.

Pricing at a Glance

  • Feature Pentest: CI/CD and new feature deployments
  • Standard Pentest: Comprehensive audit
  • Advanced Pentest: Deeper analysis for mature applications
  • Enterprise (Custom Pricing): For advanced offensive testing needs

What's Next

  • Smoother onboarding with improved pre-flight checks and automatic credit estimation
  • Continuous pentesting on staging by default, triggered on deployments or pull requests

See It Yourself

Want the full walkthrough and a scoped plan for your environment? Book a scoping call and run a live assessment.

