Dec 1, 2025
From Thoughts to Code, Config Playbooks, and a DeepMind Doc
🧩 The Gist
Researchers propose Program-of-Thoughts prompting, where models write programs to represent reasoning and an external computer does the computation. In tests across math and financial QA benchmarks, this approach improves over chain-of-thought and reaches state-of-the-art with self-consistency on math tasks. A practical guide positions CLAUDE.md as a high-leverage config for Claude Code and agent workflows. A new documentary spotlights DeepMind’s pursuit of AGI, sparking community discussion about AI’s purpose.
🚀 Key Highlights
- Program-of-Thoughts (PoT) has models express reasoning as programs, then offloads computation to an external executor.
- Evaluated on five math word problem sets and three financial QA sets, PoT shows around 12 percent average gains over chain-of-thought in few-shot and zero-shot settings.
- With self-consistency decoding, PoT achieves state-of-the-art on all math datasets and near state-of-the-art on financial datasets, with code and data released on GitHub.
- The post "Writing a good CLAUDE.md" frames the file as a high-leverage configuration point for Claude Code, and treats writing it well as a core skill for agent-enabled software engineering.
- The post highlights CLAUDE.md or AGENTS.md as the locus for guiding tool behavior across tasks and repos.
- The Thinking Game is a documentary offering an inside look at DeepMind’s work and its aim to understand artificial general intelligence, drawing notable community interest.
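
To make the Program-of-Thoughts flow from the first highlights concrete, here is a minimal sketch of the "model writes a program, an executor computes the answer" step. It is an illustration under assumptions, not the authors' released code: `GENERATED_PROGRAM` is a hypothetical stand-in for a model response, and `run_program` and the `ans` variable name are made up for this example.

```python
# Hypothetical stand-in for a model response. In a real PoT setup this
# string would come from an LLM prompted to answer with a program.
GENERATED_PROGRAM = """
principal = 1000
rate = 0.05
years = 3
ans = principal * (1 + rate) ** years
"""


def run_program(source: str, answer_var: str = "ans"):
    """Execute generated code in a fresh namespace and return one variable.

    Emptying __builtins__ only limits what the snippet can call; it is NOT
    a security sandbox. Untrusted model output should run in an isolated
    process or container.
    """
    namespace = {}
    exec(source, {"__builtins__": {}}, namespace)
    return namespace[answer_var]


# The arithmetic is done by the Python interpreter, not by the model.
answer = run_program(GENERATED_PROGRAM)
print(answer)
```

The model only has to get the reasoning structure right (which quantities exist and how they combine). The exponentiation and multiplication, where language models tend to slip, are left to the interpreter.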
🎯 Strategic Takeaways
- Research and engineering
  - Treat numerical reasoning as code, not prose. Executable reasoning can reduce errors on arithmetic and multi-step tasks, and pairs well with self-consistency.
- Developer experience
  - Centralize agent guidance. A well-scoped CLAUDE.md or AGENTS.md can act as the project’s source of truth for agent behavior, similar to a lint or CI config.
- Product and adoption
  - Public narratives shape expectations. High-visibility media on AGI can influence how teams communicate use cases and safeguards to users and stakeholders.
🧠 Worth Reading
- Program-of-Thoughts Prompting: The core idea is to disentangle reasoning from computation by having the model generate a program that encodes its reasoning, then executing that program to get the final answer. In evaluations, this yields around 12 percent average improvement over chain-of-thought and reaches state-of-the-art on math tasks when combined with self-consistency. Practical takeaway: if your domain involves numerical or structured reasoning, consider prompting models to produce executable programs and running them to determine answers.
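
The self-consistency pairing mentioned above can be sketched as a majority vote over the results of several sampled programs. This is a rough illustration, not the paper's implementation: `SAMPLED_PROGRAMS` is hypothetical data standing in for multiple model samples, and `execute` and `self_consistent_answer` are names invented here.

```python
from collections import Counter

# Hypothetical samples for one question ("What is the price of an 80 dollar
# item after a 25 percent discount?"). In practice these would be several
# LLM completions drawn at nonzero temperature.
SAMPLED_PROGRAMS = [
    "price = 80\ndiscount = 0.25\nans = price * (1 - discount)",
    "price = 80\nans = price - price * 0.25",
    "price = 80\nans = price * 0.25",  # faulty reasoning: returns the discount
    "ans = undefined_name + 1",        # broken program: raises NameError
]


def execute(source, answer_var="ans"):
    """Run one program and return its answer, or None if it fails."""
    namespace = {}
    try:
        # Emptying __builtins__ is not a real sandbox; isolate untrusted code.
        exec(source, {"__builtins__": {}}, namespace)
    except Exception:
        return None
    return namespace.get(answer_var)


def self_consistent_answer(programs, ndigits=6):
    """Return the most common numeric result across all programs."""
    results = [execute(p) for p in programs]
    # Round before voting so tiny floating-point differences share a bucket.
    votes = Counter(
        round(r, ndigits) for r in results if isinstance(r, (int, float))
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


final = self_consistent_answer(SAMPLED_PROGRAMS)
print(final)
```

Programs that crash are dropped rather than counted, so one malformed sample does not sink the vote. Two of the three surviving samples agree, and their shared result wins.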
Inside Mirakl’s Agent Commerce Vision (openai.com) Mirakl is redefining commerce through AI agents and ChatGPT Enterprise, achieving faster documentation, smarter customer support, and building toward agent-native commerce with Mirakl Nexus.
OpenAI takes an ownership stake in Thrive Holdings to accelerate enterprise AI adoption (openai.com) OpenAI takes an ownership stake in Thrive Holdings to accelerate enterprise AI adoption, embedding frontier research and engineering directly into accounting and IT services to boost speed, accuracy,…
Accenture and OpenAI accelerate enterprise AI success (openai.com) Accenture is rolling out 40,000 ChatGPT Enterprise licenses and naming OpenAI its primary intelligence partner to power enterprise AI success, upskill teams, and deliver AI-driven client outcomes.