One in five organizations has experienced a serious security incident directly tied to AI-generated code. As AI assists more of the software development lifecycle, chief information security officers need audit strategies that measure developer practices, govern AI tool usage, and identify software risks before they reach production.
The audit must deliver visibility into the agentic development lifecycle (ADLC) - who is using AI, what tools they use, and where AI-generated code appears. That visibility remains elusive because individual developers pick their own LLM tools, and those tools vary widely in security proficiency. For CISOs, this makes it difficult to report quantifiable risk to stakeholders and to enforce governance policies.
The new operational risk inside the SDLC
Security industry research shows mixed results when humans and machines are compared on specific security tasks. The best LLMs perform comparably with proficient professionals only for a limited set of secure coding tasks, such as flagging code smells and anti-patterns. The same tools struggle with denial-of-service protection, insufficient logging, and misconfigured permissions. The most security-proficient developers outperform LLMs; average developers do not.
This gap creates a category of operational risk that originates inside the SDLC rather than from external attackers. Unintentional developer actions widen the visibility gap at a time when tracing accountability is already hard. To report quantifiable risk, CISOs need to include three variables in any audit: AI deployment (who uses which tools, how often, and where), developer capabilities (who can spot LLM-introduced vulnerabilities and who needs upskilling), and vulnerability assessments (what went wrong, at which stage, and how damaging it was).
With those variables, boards get answers to essential questions: Where is AI increasing risks? Which teams or behaviors drive those risks? Do teams have the skills to deploy AI routinely and safely?
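As a rough illustration of how the three audit variables could feed the board-level question "which teams drive those risks?", the sketch below sums finding severity per team and splits it by whether an AI tool was involved. The `Finding` fields, the 1-to-5 severity scale, and the team and tool names are all assumptions made for this example, not part of any specific audit product.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One vulnerability surfaced by the audit (hypothetical schema)."""
    team: str
    ai_tool: Optional[str]  # None means no AI assistant was involved
    stage: str              # e.g. "review", "testing", "production"
    severity: int           # assumed scale: 1 (low) to 5 (critical)


def risk_by_team(findings: list[Finding]) -> dict[str, dict[str, int]]:
    """Sum severity per team, split into AI-assisted and unassisted code."""
    totals: dict[str, dict[str, int]] = defaultdict(lambda: {"ai": 0, "human": 0})
    for finding in findings:
        bucket = "ai" if finding.ai_tool is not None else "human"
        totals[finding.team][bucket] += finding.severity
    return dict(totals)


findings = [
    Finding("payments", "assistant-a", "production", 5),
    Finding("payments", None, "review", 2),
    Finding("search", "assistant-b", "testing", 3),
]
report = risk_by_team(findings)
```

A real audit would likely weight the stage as well, since a flaw caught in review costs less than one found in production.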
Four stages of an effective audit
Record tool usage. Compile a verifiable record of all AI and LLM assistants deployed for code generation - sanctioned or not - and map them directly to code outputs. This gives CISOs the traceability needed for audit readiness and emerging regulatory directives, and it is a critical step in governing generative code.
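One way to picture such a record is a table that ties each commit to the assistant that helped produce it. The sketch below is a minimal example under stated assumptions: the `UsageRecord` fields, the `SANCTIONED_TOOLS` allowlist, and the tool names are hypothetical, and how the tool attribution is captured in the first place (IDE telemetry, commit metadata, or something else) is left open.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical allowlist of approved assistants; names are placeholders.
SANCTIONED_TOOLS = frozenset({"assistant-a", "assistant-b"})


@dataclass(frozen=True)
class UsageRecord:
    """Maps one code output (a commit) to the AI tool that assisted it."""
    commit: str
    developer: str
    tool: str


def commits_by_unsanctioned_tool(records: list[UsageRecord]) -> dict[str, list[str]]:
    """Group commit IDs under each tool that is not on the allowlist."""
    flagged: dict[str, list[str]] = defaultdict(list)
    for record in records:
        if record.tool not in SANCTIONED_TOOLS:
            flagged[record.tool].append(record.commit)
    return dict(flagged)


records = [
    UsageRecord("a1b2c3", "dev-1", "assistant-a"),
    UsageRecord("d4e5f6", "dev-2", "shadow-tool"),
    UsageRecord("0718aa", "dev-2", "shadow-tool"),
]
shadow_usage = commits_by_unsanctioned_tool(records)
```

Keeping unsanctioned tools in the record, rather than dropping them, is what makes the inventory useful for governance.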
Evaluate and benchmark tools, then fix issues. Gauge AI models against known vulnerability patterns. Standardize on tools that produce secure output and use the benchmark results to decide which tools are approved. Track Model Context Protocol (MCP) integrations to ensure AI agents connect only to approved tools and data sources. "Time travel" auditing can instantly isolate and fix every commit linked to a compromised LLM, avoiding the cost of lengthy manual code reviews.
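The article does not say how "time travel" auditing is implemented. One plausible reading, assuming each commit already carries the tool and model version that produced it, is a simple lookup over commit history. The `CommitOrigin` type and every value below are hypothetical.

```python
from typing import NamedTuple


class CommitOrigin(NamedTuple):
    """Provenance attached to a commit (hypothetical schema)."""
    sha: str
    tool: str
    model_version: str


def commits_to_review(
    history: list[CommitOrigin], tool: str, model_version: str
) -> list[str]:
    """Return the SHAs of every commit produced with the given tool and model version."""
    return [
        origin.sha
        for origin in history
        if origin.tool == tool and origin.model_version == model_version
    ]


history = [
    CommitOrigin("a1b2c3", "assistant-a", "v1"),
    CommitOrigin("d4e5f6", "assistant-a", "v2"),
    CommitOrigin("0718aa", "assistant-b", "v1"),
    CommitOrigin("9c0d1e", "assistant-a", "v2"),
]
# Suppose assistant-a v2 is later found to be compromised.
affected = commits_to_review(history, "assistant-a", "v2")
```

The lookup only narrows the review to the affected commits; fixing them still requires the usual remediation work.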
Invest in upskilling. Beyond continuous education and benchmarking, assign a risk score - similar to a credit score - that weighs multiple factors to show how much unintentional risk each developer creates, based on their skills, practices, and oversight capabilities.
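A weighted sum is one simple way such a score could be computed. The sketch below is an assumption-heavy illustration: the three factor names follow the paragraph above, but the weights, the 0-to-1 input scale, and the 0-to-100 output scale are invented for this example and would need calibration against real data.

```python
# Hypothetical weights; they must sum to 1.0.
WEIGHTS = {"skills": 0.5, "practices": 0.3, "oversight": 0.2}


def risk_score(factors: dict[str, float]) -> float:
    """Return a score from 0 (lowest risk) to 100 (highest risk).

    Each factor is a rating in [0, 1], where 1 is the strongest result.
    """
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    strength = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    return round(100 * (1 - strength), 1)


score = risk_score({"skills": 0.8, "practices": 0.5, "oversight": 0.5})
```

Unlike a credit score, higher here means more risk; the direction is a design choice and could be inverted.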
Link AI to business goals. Insights from audits must connect AI tool deployment with productivity, code quality, and secure outcomes. This connection informs decisions about which tools to invest in and how to balance innovation with risk management.
Why this matters for IT and development
For IT and development teams, the audit process directly pinpoints which AI tools and developer behaviors create the most risk. It surfaces exactly where upskilling is needed - often among average developers who cannot catch LLM-introduced flaws - and it ties tool adoption to measurable business outcomes. Starting with a thorough audit lets the right people use the right tools without delegating too much to AI, keeping the SDLC productive and safe.