
BC Law AI News & Insights: October 2025 Edition
In this newsletter:
- Shareable Gemini Gems: Google enables sharing of custom AI assistants in BC’s Gemini deployment—try out GAIL and the Custom Assistant Builder.
- AI Approaches Expert Performance: A landmark benchmark study shows frontier models reaching near-parity with industry experts on professional tasks—and what that means for AI literacy.
- The Art of Intent: Why “just talking” with AI assistants works better than complex prompting—plus tips for directing AI like a film director.
- The Shift Toward Directive Delegation: How legal professionals are increasingly using AI for automated workflows, plus insights on AI’s impact on legal education.
- Latest Model Releases: New Claude Sonnet 4.5 and Haiku 4.5, plus AI fluency courses from Anthropic and curated links on AI, education, and law.
BC Tool Updates: Gemini Gems Are Now Shareable
Google has enabled sharing for Gems in institutional accounts, including BC’s Gemini deployment. Gems are custom AI assistants you build by providing a specific role, goal, context, and resources—effectively narrowing the AI’s general knowledge to focus on bounded tasks or workflows that matter to you. Whether you need help with a process, product work, reflective practices, coaching, or learning support, Gems can be tailored to your specific needs. Now you can share these custom assistants with colleagues or classmates via link.
Why this matters for BC Law: This enables our community to build and distribute custom AI functionality and workflows rather than everyone starting from scratch.
Try out a few Gems created by Kyle Fidalgo, Academic Technologist for BC Law.
GAIL helps you understand AI concepts in practical, everyday terms. The Custom Assistant Builder translates your ideas into system prompts—text instructions that configure AI assistants for specific tasks or repeatable workflows. You can use the Custom Assistant Builder to develop system prompts for your own Gems!
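To make that concrete, here is a hypothetical system prompt that follows the role, goal, and context pattern described above (an illustrative sketch, not one of the published Gems):
You are a plain-language study companion for law students. Your goal is to explain legal and technical concepts in practical, everyday terms. Assume the reader has no specialist background. For each concept, provide a one-paragraph definition, a concrete everyday example, and one common misconception. If a request is ambiguous, ask a clarifying question before answering.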

New Research Suggests AI Approaches Expert Performance on Professional Tasks
A major benchmark study released in late September 2025 marks an important milestone: frontier models are now approaching the quality of work produced by industry experts on bounded professional tasks.
OpenAI’s new GDPval evaluation tested leading models on 1,320 expert-designed tasks spanning 44 occupations across the top nine sectors contributing to U.S. GDP. Task writers—professionals averaging 14 years of experience—created deliverables reflecting actual work: client memos, blueprints, care plans, regulatory analyses. Expert graders then blindly compared model outputs against human-produced work.
The strongest model, Claude Opus 4.1, achieved a roughly 46% win rate against industry experts in their respective fields. GPT-5 with extended reasoning enabled came in second at 38.8%.
Perhaps the more revealing finding comes from what happened when models lost. A third-party grader assessed each losing output: was it catastrophic (harmful/dangerous), bad (not fit for use), or acceptable but subpar, or was the original assessment wrong and the model deliverable actually better than the human’s? The results:
- 2.7% catastrophic
- 26.7% bad
- 47.7% acceptable but subpar
- 22.9% judged better than the human expert
Even when the models failed to beat the expert, 70.6% of outputs were considered acceptable or better. Catastrophic failures, the kind that would harm a client or project, occurred in fewer than 3 out of 100 cases.
For legal work specifically, the tasks were substantive: drafting comprehensive client memos assessing regulatory compliance, preparing police training handouts covering constitutional standards, analyzing shareholder litigation risk with statutory citations. These are not theoretical exercises. They’re the kind of deliverables that affect billable hours and case outcomes.
Some Key Caveats and Insights
One important caveat: these evaluations tested models on specific tasks, not full jobs or entire roles. Still, the significance of approaching parity with industry experts on expert-designed tasks shouldn’t be underestimated. It means AI can now produce work products—the memo, the brief, the analysis—that meet professional standards on bounded assignments.
Another caveat is that GDPval tested models in one-shot scenarios without the iterative refinement that defines real human-AI collaboration. Models approached expert parity without multi-turn workflows, client feedback, or contextual refinement. This means the benchmark may understate the potential value—professionals who develop AI literacy to prompt, review, and iterate effectively will likely see better results than these single-attempt baselines suggest.

The Productivity Paradox
The GDPval study also tested a more realistic workflow: experts working with AI rather than against it. In this scenario, professionals delegated tasks to models as a first pass, reviewed the output, re-prompted if needed, and finished the work themselves only when necessary.
On average, this human-AI collaboration resulted in work completed 40% faster and 60% cheaper than human-only workflows.
That sounds transformative. But there’s a critical catch.
A recent Harvard Business Review article discusses the concept of “workslop”: AI-generated work that masquerades as good but lacks the substance to meaningfully advance a task. The idea is that workslop shifts the burden downstream: recipients must decode vague output, infer missing context, and often redo the work entirely.
According to the research, 40% of workers have received workslop in the past month. Each incident costs an average of nearly two hours to remediate. The difference comes down to what the researchers call the “pilot versus passenger” mindset. Pilots use AI with high agency and clear intent—they know what they want to achieve and evaluate outputs critically. Passengers use AI to avoid work, copying and pasting without review or reflection.
The Call for AI Competency
This is why AI literacy—not just AI adoption—becomes the real professional imperative. As tools quickly approach expert performance on bounded tasks, the question isn’t whether we should or shouldn’t use them. It’s whether professionals can develop:
- The judgment to delegate appropriately
- The critical eye to catch model failures
- The iterative approach to refine outputs effectively
- The wisdom to preserve what makes human expertise valuable: contextual understanding, ethical reasoning, and the ability to know what questions to ask
The GDPval benchmark shows us the ceiling of current capability. The workslop research shows us the floor of poor execution. The gap between them isn’t technical—it’s human.
As Ethan Mollick recently noted, we need to move past debating whether AI works and instead focus on “a deep exploration of when AI uses are uplifting and when they are detrimental.”

Prompt Tips & Techniques
The Art of Intent: Why “Just Talking” Works
This month’s tip centers on a deceptively simple idea: focus your intent and get comfortable just talking with AI assistants.
The folks at The Neuron Daily recently explored how AI collaboration hinges on “intent”—your ability to articulate what you’re trying to achieve. It reminded them of a college film class where their instructor distilled directing into two essential skills: have a vision, and communicate that vision. Everything else? The crew handles it.
When you work with AI, one role you might take on is that of a director.
Your Three-Act Structure:
- Define success – What’s the exact goal of this project?
- Communicate that intent – Use prompts, agents, and workflows to direct your AI.
- Evaluate the result – Did you hit your mark? Review the output and adjust for the next take.
A Simple Rehearsal Trick
Want to ensure your AI understands the assignment? Add this to the end of your prompt:
Before you begin, please restate my goal for this project and the key steps you will take to achieve it. Ask for my confirmation before you proceed.
This lets you retain control over the process, catching misunderstandings early so that you can course-correct as needed.
When in Doubt, Talk it Out
Figuring out what to ask of AI, where it can best help you in the process, and which AI tools suit your task takes time and practice. The workflow might feel clunky at first. The quality might miss your standards. It might fail you in unexpected ways. That’s normal.
The best advice? Just start talking.
Try this: Open a conversation about who you are. Your role. The projects or papers you’re working on. How you prefer to work. What you aspire to create. Writing or work you admire—and why you admire it. Take time to articulate what makes those examples resonate with you.
Then, with that rich context queued up, ask for help with a specific, well-bounded task you’re working on right now.
You might be surprised. The specificity and detail that come back will likely far exceed what a generic, cold-open request would generate. You’ve given the AI your creative brief. Now you’re directing together.
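For example, a context-setting opener might sound something like this (the details are illustrative; substitute your own):
I’m a second-year law student working on a seminar paper about data privacy regulation. I write best in short, plain sentences, and I admire briefs that lead with the strongest argument. With that in mind, help me tighten my introduction, pasted below, to under 200 words without losing the thesis.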
Bonus Tip: If you like what you’ve collaborated on, you can now “package” that work into a Gemini Gem, so you don’t have to rebuild that context from scratch for every conversation. Just ask your AI assistant to write a system prompt that captures the repeatable workflow for future work together. With BC’s access to shareable Gems, you can then copy the generated system prompt and paste it into the instructions section of your Gem to try it out.
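That request might read something like this (a hypothetical phrasing to adapt):
Based on our conversation, write a system prompt that captures this workflow so I can reuse it: the role you played, the goal we pursued, the context you needed from me, and the steps you followed. Format it so I can paste it directly into a Gem’s instructions.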

Quick Bites
The Shift Toward Directive Delegation
Anthropic’s September Economic Index report suggests the shift in how we delegate work to AI is already underway. Analyzing millions of Claude conversations, they found directive usage—where users assign tasks and Claude completes them with minimal back-and-forth—jumped from 27% in late 2024 to 39% by August 2025. They interpret this as models improving at anticipating needs and producing higher-quality first drafts, reducing the need for iterative refinement. It may also reflect growing user confidence: learning by doing, as people realize AI can handle tasks they previously assumed it couldn’t.
One interesting data point for the BC Law community is that legal use cases are overrepresented across multiple states’ top-10 usage patterns. South Carolina’s leading use case is comprehensive legal assistance and document drafting across practice areas. Florida and Texas both over-index on legal tasks compared to other Claude requests. Anthropic’s Job Explorer reveals legal occupations—lawyers, paralegals, legal secretaries, judges, law clerks—lean heavily toward directive, automated workflows, though lawyers also show significant task iteration, learning, and validation patterns.
How AI Is Changing Legal Education
This segment of the “AI and the Future of Law” podcast features hosts Jen Leonard and Bridget McCormack in conversation with legal educators Dyane O’Leary (Suffolk University Law School) and Jonah Perlin (Georgetown University Law Center) about the impact of generative AI on legal education and the future of lawyering. There are several interesting takeaways and topics of discussion, including:
- Shifting from “What” to “How”: The focus of legal education is shifting from simply mastering the substantive black-letter law (“what”) to mastering the meta-skills of knowing how to approach new tools, identify new efficiencies, and apply legal judgment in evolving workflows.
- The Bluebook Challenge: The conversation around the new Bluebook Rule 18.3 for citing generative AI output is an excellent case study in how slow-moving legal traditions clash with rapid technological change, providing a rich context for discussing the nature of legal authority and innovation.
- The Enduring Value of “Thinking Like a Lawyer”: Jonah Perlin argues that the core skills of legal reasoning, strategy, judgment, and history will become more important, not less, in an AI-enhanced world, as these are the differentiating factors for human lawyers.
- Balancing Innovation with Integrity: Law schools continue grappling with the tension of integrating AI literacy while maintaining academic integrity, leading to a resurgence of closed-book exams and redesigned assignments that test students’ authentic legal skills.
- Need for Clear Workplace Policies: Dyane O’Leary notes a wide variation in law firms’ AI policies, emphasizing that students need to be taught to ask a “question set” about the ethical and procedural rules for AI use at their specific workplace (e.g., what can I use, when can I use it, and how should I document it).

Additional News, Essays, and Links of Interest
New model releases from Anthropic
The next iterations of Anthropic’s family of AI models have been released.
- Introducing Claude Sonnet 4.5 – Anthropic’s latest frontier model offering improved performance and extended capabilities.
- Introducing Claude Haiku 4.5 – A faster, more cost-effective option for high-volume tasks requiring quick responses.
If you’ve ever been curious about how model labs instruct their frontier models, you can take a look at the system prompt resource Anthropic provides.
New AI fluency courses from Anthropic
Anthropic has released two new courses specifically designed for education audiences to build AI fluency and competency skills:
- AI Fluency for Students – This course empowers students to develop AI Fluency skills that enhance learning, career planning, and academic success through responsible AI collaboration.
- AI Fluency for Educators – This course empowers academic faculty, instructional designers, and others to teach and assess AI Fluency in instructor-led settings.
Other Essays, News, & Links of Interest
- What is AI doing to Higher Education? In this clip from the Hard Fork Podcast, Princeton historian D. Graham Burnett joins to discuss the existential threat that AI poses to the traditional humanities degree and why he believes we’ll see thousands of new schools emerge outside the university system to carry on the exploration of what it means to be a person in the world.
- AI as teleportation – An essay on introducing intentional friction and mindfulness in our collaborative efforts with AI tools.
- How I AI: Prototyping AI use cases by role playing conversations. – Product manager Priya Badger demonstrates how to design AI-powered features by role-playing example conversations first, then working backward to refine system prompts and interfaces. A practical workflow for prototyping AI products with conversational language.
- Conversation between AI Expert Andrej Karpathy and Dwarkesh Patel. Karpathy discusses why reinforcement learning is flawed but necessary, how model collapse limits AI learning compared to humans, why AGI will integrate gradually rather than explosively, and his vision for education’s future.
- Researchers at Google and Yale may have discovered new pathways to fight cancer using a language model trained on “the language of cells”.
Have questions or ideas? Want help creating your own AI workflows? Reach out to Kyle Fidalgo at atrinbox@bc.edu.
Ready to build your AI competency? Discover AI literacy resources at AI Foundations.