Engineering at Anthropic: Japanese Edition

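A collection that makes Anthropic's Engineering blog available to read in Japanese. When a new article is published, GitHub Actions automatically fetches it, translates it, and converts it to PDF.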

24 articles, 275 pages, updated: 2026-05-01

Featured

Quantifying infrastructure noise in agentic coding evals

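Agentic coding benchmarks such as SWE-bench and Terminal-Bench are widely used to compare the software engineering capabilities of frontier models, and the top of the leaderboard is often separated by only a few percentage points. These scores are treated as precise measurements of the relative capability differences between models, and which model…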

Scaling Managed Agents: Decoupling the brain from the hands

An update on recent Claude Code quality reports

Claude Code auto mode: a safer way to skip permissions

Harness design for long-running application development

Eval awareness in Claude Opus 4.6's BrowseComp performance

Building a C compiler with a team of parallel Claudes

Designing AI-resistant technical evaluations

Demystifying evals for AI agents

Effective harnesses for long-running agents

Introducing advanced tool use on the Claude Developer Platform

Code execution with MCP: Building more efficient agents

Beyond permission prompts: making Claude Code more secure and autonomous

Equipping agents for the real world with Agent Skills

Effective context engineering for AI agents

A postmortem of three recent issues

Writing effective tools for agents, with agents

Desktop Extensions: One-click MCP server installation for Claude Desktop

How we built our multi-agent research system

Claude Code: Best practices for agentic coding

The "think" tool: Enabling Claude to stop and think in complex tool use situations

Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet

Building effective agents

Introducing Contextual Retrieval