OpenAI DevDay: AI That Builds With You, Not For You
"The biggest advancement over the last year for us has been model progression, leading to significant leaps in intelligence and improvements in cost." That line from Caleb Hicks, CEO of SchoolAI, sums up what many builders felt at OpenAI DevDay. Higher IQ. Lower cost. New workflows that change who gets to ship software.
Education: AI tutors that actually fit into class time
Hicks' team is putting AI in students' hands as safe, managed, one-time personal tutors. Better models plus lower inference costs mean schools can deploy at scale without blowing the budget. The priority isn't a single model but orchestration: the right agent for the right task, stitched into the classroom.
Teachers aren't asked to become prompt engineers. They get a "GPS for impact": real-time dashboards that surface engagement, progress, and where to intervene. If you've managed 42 desks across seven periods, quickly spotting the four students who truly need help is the difference between busywork and results.
Jam.dev: shipping UI fixes at the speed of thought
Danny Grant introduced "Please Fix," a browser extension that lets product managers, designers, and marketers edit a live site and generate a clean, design-system-compliant pull request with one click. Engineers get fewer drive-by tasks, and creative teams stop waiting in line.
The deeper idea: a move to software that's fluid and disposable. Browsing becomes a stream of intent: make a change, see it, ship it. In that world, subject matter experts build what they need, and AI handles the plumbing.
What this means for engineering and IT
- Let experts build, gate with code: Enable non-dev edits (copy, layout, minor UX) but enforce design tokens, linting, and PR checks before merge.
- Agentic workflows: Use specialized agents (explainers, reviewers, data fetchers) instead of one "do-everything" model. Route by task, not hype.
- Teacher/analyst DX over prompt fiddling: Deliver declarative tasks and guardrails; hide prompt glue behind APIs and templates.
- Cost control: Token budgets, per-user quotas, and batch inference for predictable spend. Log completion length and cache hits.
- Data safety: PII redaction, scoped context windows, and tenant isolation. Prefer retrieval over long prompts; expire context.
- Quality gates: Add LLM-evals, unit/regression tests, and human review to PRs created by AI-assisted tooling.
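To make "route by task" and "token budgets" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the agent functions are stand-ins that do not call a model, the `TokenBudget` class and the four-characters-per-token estimate are hypothetical, and a real deployment would use the provider's reported token counts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class TokenBudget:
    """Tracks tokens spent per user against a fixed quota (hypothetical policy)."""
    limit_per_user: int
    spent: Dict[str, int] = field(default_factory=dict)

    def charge(self, user_id: str, tokens: int) -> bool:
        """Record usage; return False and record nothing if the quota would be exceeded."""
        used = self.spent.get(user_id, 0)
        if used + tokens > self.limit_per_user:
            return False
        self.spent[user_id] = used + tokens
        return True


# Stand-ins for specialized agents; a real system would call a model here.
def explainer(text: str) -> str:
    return f"[explainer] {text}"


def reviewer(text: str) -> str:
    return f"[reviewer] {text}"


def data_fetcher(text: str) -> str:
    return f"[data_fetcher] {text}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "explain": explainer,
    "review": reviewer,
    "fetch": data_fetcher,
}


def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly one token per four characters."""
    return max(1, len(text) // 4)


def route(task_type: str, text: str, user_id: str, budget: TokenBudget) -> str:
    """Dispatch to the agent registered for task_type, enforcing the user's quota."""
    agent = AGENTS.get(task_type)
    if agent is None:
        raise ValueError(f"no agent registered for task type {task_type!r}")
    if not budget.charge(user_id, estimate_tokens(text)):
        raise RuntimeError(f"token budget exceeded for user {user_id!r}")
    return agent(text)
```

Keeping the router a plain dictionary lookup means adding an agent is a one-line change, and the budget check runs before any agent does work, so a request over quota costs nothing.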
Implementation checklist
- Decide the edit surface: what non-engineers can change safely (copy, styles, JSON configs) vs. what stays in code.
- Wire a PR pipeline: branch naming, CI checks, visual diffs, and design token enforcement.
- Adopt agent routing: set up task routers, retrieval, and tool-use permissions per role.
- Observability: trace prompts, responses, tools called, latency, and cost per feature.
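- Guardrails: content filters, prompt injection checks, and strict output schemas (for example, Structured Outputs with a JSON Schema; JSON mode alone guarantees valid JSON, not schema conformance).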
- Feedback loop: thumbs up/down, rationale capture, and automatic fine-tuning or prompt updates on drift.
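The observability and guardrail items in the checklist can be sketched together: validate each model response against a strict schema, and log a trace whether or not validation passes. This is a standard-library sketch under stated assumptions. `EDIT_SCHEMA`, `Trace`, and `checked_call` are hypothetical names, and `model` is any callable that maps a prompt string to a response string, not a specific vendor API.

```python
import json
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical schema for an edit request: field name -> required Python type.
EDIT_SCHEMA: Dict[str, type] = {"selector": str, "property": str, "value": str}


def parse_strict(raw: str, schema: Dict[str, type]) -> Dict[str, Any]:
    """Parse raw as JSON and reject anything that does not match schema exactly."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(schema):
        raise ValueError(f"expected keys {sorted(schema)}, got {sorted(data)}")
    for key, expected in schema.items():
        if not isinstance(data[key], expected):
            raise ValueError(f"field {key!r} must be {expected.__name__}")
    return data


@dataclass
class Trace:
    feature: str
    prompt: str
    response: str
    latency_ms: float
    valid: bool


TRACES: List[Trace] = []  # in-memory log; a real system would ship these elsewhere


def checked_call(feature: str, prompt: str, model: Callable[[str], str]) -> Dict[str, Any]:
    """Call the model, validate its output, and log a trace on success and on failure."""
    start = time.perf_counter()
    raw = model(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    try:
        result = parse_strict(raw, EDIT_SCHEMA)
    except ValueError:
        TRACES.append(Trace(feature, prompt, raw, latency_ms, valid=False))
        raise
    TRACES.append(Trace(feature, prompt, raw, latency_ms, valid=True))
    return result
```

Logging the failed responses matters as much as the successful ones: the share of traces with `valid=False` per feature is a cheap early signal of drift.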
KPIs to track
- Lead time for change: idea to merged PR (by role).
- Engineer focus time: hours moved from low-priority edits to core work.
- Cost per resolved task: tokens, runtime, and review time.
- Quality: bug reopens, visual-diff variance, and eval pass rates.
- Adoption: active non-dev contributors and tasks completed without engineering intervention.
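Two of these KPIs reduce to simple arithmetic once changes are logged. The sketch below assumes a hypothetical `Change` record with one row per merged PR and a cost already converted to dollars upstream; the field names and the choice of median are illustrative, not a prescribed method.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Dict, List


@dataclass
class Change:
    role: str          # e.g. "designer", "pm", "engineer"
    opened: datetime   # when the idea or request was logged
    merged: datetime   # when the PR was merged
    cost_usd: float    # tokens, runtime, and review time, converted to dollars upstream
    resolved: bool     # True if the change shipped and was not reopened


def lead_time_hours_by_role(changes: List[Change]) -> Dict[str, float]:
    """Median hours from idea to merged PR, grouped by contributor role."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for c in changes:
        buckets[c.role].append((c.merged - c.opened).total_seconds() / 3600)
    return {role: median(hours) for role, hours in buckets.items()}


def cost_per_resolved_task(changes: List[Change]) -> float:
    """Total spend divided by resolved tasks, so failed attempts still count as cost."""
    resolved = sum(1 for c in changes if c.resolved)
    if resolved == 0:
        raise ValueError("no resolved tasks to divide by")
    return sum(c.cost_usd for c in changes) / resolved
```

The median is used for lead time because a single stalled PR would otherwise dominate the average.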
Where the stack is heading
Two trends stood out. First, the barrier to building is dropping: agent builders and tools like "Please Fix" let domain experts ship software without deep coding. Second, software is becoming fluid: generated, customized, and discarded as context changes. The app is less a product and more a living interface to your current goal.
What to pilot this quarter
- Classroom or support agents: retrieval + tool-use + dashboards for targeted interventions.
- Live-site edit-to-PR flow: lock writes behind design tokens, run visual regression in CI, and require codeowner approval.
- Cost and safety policy: per-user budgets, PII redaction, and red-team prompts for injection and data leakage.
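Two of the pilot policies above lend themselves to small pre-merge or pre-prompt checks: rejecting style edits that bypass design tokens, and redacting PII before text reaches a model. The sketch below is illustrative only. The token names in `ALLOWED_TOKENS` are hypothetical, and the two regular expressions catch only simple email addresses and US-style phone numbers; production redaction needs broader coverage.

```python
import re
from typing import Dict, List

# Hypothetical design tokens; a real project would load these from its design system.
ALLOWED_TOKENS = {"--color-brand", "--color-text", "--space-2"}
TOKEN_REF = re.compile(r"^var\((--[a-z0-9-]+)\)$")

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def token_violations(declarations: Dict[str, str]) -> List[str]:
    """Return the CSS properties whose value is not a reference to an allowed token."""
    bad = []
    for prop, value in declarations.items():
        match = TOKEN_REF.match(value.strip())
        if match is None or match.group(1) not in ALLOWED_TOKENS:
            bad.append(prop)
    return bad


def redact(text: str) -> str:
    """Replace simple email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

A CI step could fail the PR when `token_violations` returns a non-empty list, and `redact` could run on any user-supplied text before it is added to a prompt or a log.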
If you want to dig into API capabilities and guardrails, read the OpenAI API docs. For hands-on site editing and PR generation, explore Jam.dev.
Upskilling your team on AI-assisted workflows by role? See practical learning paths at Complete AI Training: Courses by Job.
The takeaway
Use AI to remove bottlenecks, not to add ceremony. Let the right people ship small changes fast, protect the edges with code, and measure the outcome. That's how you get compounding gains without chaos.