Featured

49:12

Humanity's Last Exam - A Fireside Chat

Vijay Karunamurthy, Summer Yue & Dan Hendrycks

57:00

AI Readiness Report 2024 - Insights and Strategies for Enterprise Success

30:06

AI Reimagining Qatar's Cultural Experience

Asma Al-Jefairi, Norah Abokhodair, Salma Soliman & 1 more speaker

Making AI Work: How to Build and Scale Long-Running Enterprise Agents

J2: Jailbreaking to Jailbreak

Caton Lu, Jeremy Kritz & Sail (Zifan) Wang

Blueprint to Agentic AI: Identify, Build, Deploy

Sahil Bhaiwala

Meet Donovan: The AI Digital Staff Officer for the Future

Benjamin Youngs & Kathryn Harris

Making AI Work: How to Build and Scale Long-Running Enterprise Agents

Caton Lu, Jeremy Kritz & Sail (Zifan) Wang · May 13th, 2025

J2: Jailbreaking to Jailbreak

About the Tech Talk

Join Scale AI researchers as they present their paper, “Jailbreaking to Jailbreak (J2),” which introduces a new paradigm in automated safety testing for large language models. This webinar will walk through the J2 approach: an LLM trained to systematically red-team other LLMs using a structured, multi-turn attack pipeline that mirrors the creativity and adaptability of human red-teamers. Through in-depth exploration of methodology, model behaviors, and empirical results, we’ll show how J2 achieves red-teaming performance on par with humans, offering a scalable, cost-effective alternative for vulnerability discovery. We'll also discuss implications for safety research, limitations of current defenses, and what this means for the future of alignment and model control.

Key takeaways

- Learn how a capable LLM can be trained to jailbreak other models, and itself, by simulating human-like attack strategies.
- Understand J2’s multi-stage pipeline for planning, executing, and evaluating adversarial interactions with other LLMs.
- Explore how this hybrid approach outperforms traditional automated red-teaming methods and approaches the success rate of professional human testers.
- Gain insight into the emerging risks of scalable, self-reasoning attack agents and what this means for the next generation of AI safety defenses.

Check out the paper here: https://scale.com/research/j2

Sahil Bhaiwala · Apr 30th, 2025

Blueprint to Agentic AI: Identify, Build, Deploy

Hype around Generative AI continues to grow, especially around agentic solutions, but most enterprises and governments struggle to implement impactful Generative AI. Its true value lies in how well it's directed: knowing how to set the right goals, boundaries, and feedback loops is key. When used thoughtfully, generative AI can become a powerful force multiplier, turning ideas into action with minimal oversight. Join Scale's Director and General Manager of International Public Sector, Sahil Bhaiwala, for a tech talk exploring what it takes to build agentic solutions with accuracy.

Benjamin Youngs & Kathryn Harris · Apr 11th, 2024

Meet Donovan: The AI Digital Staff Officer for the Future

About the Tech Talk

Generative AI holds transformative potential for the U.S. government and is laying the foundation for unprecedented advancements across the public sector. Join senior leadership from Scale’s Public Sector team, Kathryn Harris and Benjamin Youngs, to meet Donovan, the AI Digital Staff Officer for the future and learn:

- How Donovan uses LLMs to act as a force multiplier and accelerate your traditional workflows, giving you more time to do the work that matters the most
- How Donovan can be used to uncover insights and solve real national security challenges
- About security features that enable Donovan to be deployed safely on classified and unclassified government networks

Sean Hendryx & Lucas Bunzel · Mar 15th, 2024

Bringing Foundation Models to Automotive Data Engines

Foundation Models offer a new paradigm for machine learning. As OEMs transition from intense R&D to scaled production for advanced self-driving systems, Foundation Models will be the key to achieving safety and efficiency. Scale's Automotive Foundation Model is the next evolution of the Automotive Data Engine -- empowering teams to deliver advanced computer vision capabilities for autonomous vehicles to safely perceive and navigate complex environments. In this talk, Sean Hendryx, Engineering Manager, Machine Learning at Scale AI, will share how Scale is delivering the most advanced computer vision technology to accelerate the development of autonomous vehicles.

David Rokeach, Dan Martines & Padma Elmgart · Mar 15th, 2024

Blueprint to AI: Implementing Generative AI Across Your Enterprise

Scale and BCG share their Blueprint to AI and the lessons learned from implementing Generative AI for the world’s largest enterprises. Padma Elmgart, Chief Technology Officer at Global Atlantic Financial Group, joins to share her company’s journey implementing Generative AI across a large organization and how they are using Scale GenAI Platform to scale up use cases in 2024.

Eleanor Runde, Dr. Benjamin Jensen, Will Gamble & 1 more speaker · Mar 11th, 2024

AI Trust & Safety within the United States Public Sector

A panel discussion featuring Dr. Jane Pinelis, Chief AI Engineer, The Johns Hopkins University Applied Physics Laboratory, Dr. Benjamin Jensen, Senior Fellow for Future War, Gaming, and Strategy, CSIS, and Eleanor Runde, Policy Advisor, Office of the Secretary, Department of Commerce, moderated by Dan Tadross, Head of Delivery, Federal, Scale AI. This panel will take you on an expedition evaluating the current landscape of AI Trust & Safety, what trust and safety really means within the federal government, and the methods currently used to facilitate it.

# AI in National Security

Alex Levinson & Dan Tadross · Mar 11th, 2024

A Red Team Approach to AI Security

A fireside chat between Alex Levinson, Head of Security for Scale AI, and Will Gamble, Director & AGC of Federal for Scale AI, on a red team approach to AI security.

# AI in National Security

# AI Policy & Governance

# Artificial General Intelligence

Colin Jarvis, Luv Kothari & Chloe Ho · Nov 7th, 2023

Fine-Tuning OpenAI’s GPT-3.5 to Unlock Enterprise Use Cases

About the Tech Talk

Fine-tuning is the key to unlocking the performance of LLMs for every organization’s most critical use cases. Join OpenAI and Scale as we explore what it takes to optimize fine-tuning for your organization and how you can unlock the potential of your data.

From this talk, you will learn:

- When you should consider fine-tuning GPT-3.5 and how to get started
- Optimizing your proprietary data for fine-tuning GPT-3.5 and how to tell when you need to generate new data
- Lessons learned and best practices from helping the world's leading enterprises fine-tune GPT-3.5 for their most difficult use cases

Daniel Berrios & Dylan Slack · Sep 28th, 2023

Building Trust in AI: Testing and Evaluating Large Language Models (LLMs)

Understanding the capabilities, risks, and vulnerabilities of large language models is critical to ensuring the safety of these models. Join Scale as we discuss our vision for what an effective and comprehensive test and evaluation (“T&E”) regime for LLMs should look like moving forward, how that leverages human experts, as well as how we aim to help service this need with our new Scale LLM Test & Evaluation offering.