Tags

Bookmarks / www.anthropic.com (4)

  • Introducing Claude Design by Anthropic Labs

    Today, we’re launching Claude Design, a new Anthropic Labs product that lets you collaborate with Claude to create polished visual work like designs, prototypes, slides, one-pagers, and more. Claude Design is powered by our most capable vision model, Claude Opus 4.7, and is available in research preview for Claude Pro, Max, Team, and Enterprise subscribers. We’re rolling out to users gradually thr

    efcl 2026/04/18
    A design tool released by Anthropic Labs. It can create website designs, prototypes, and slide decks from text prompts, images, documents (DOCX/PPTX/XLSX), codebases, web captures, and more. The team's codebase
  • A “diff” tool for AI: Finding behavioral differences in new models

    Every time a new AI model is released, its developers run a suite of evaluations to measure its performance and safety. These tests are essential, but they are somewhat limited. Because these benchmarks are human-authored, they can only test for risks we have already conceptualized and learned to measure. This approach to safety is

    efcl 2026/04/12
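    About an interpretability tool for automatically identifying behavioral differences between AI models. It applies the concept of a software diff tool to AI models and proposes a method called Dedicated Feature Crosscoder (DFC). Between two models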
  • Harness design for long-running application development

    Published Mar 24, 2026 Harness design is key to performance at the frontier of agentic coding. Here's how we pushed Claude further in frontend design and long-running autonomous software engineering. Written by Prithvi Rajasekaran, a member of our Labs team. Over the past several months I’ve been working on two interconnected problems: getting Claude to produce high-quality frontend designs, and g

    efcl 2026/03/29
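    An article on harness design for long-running AI application development. It describes a Planner/Generator/Evaluator structure. With a single AI agent, context window constraints and biased self-evaluation mean that, on complex tasks, performance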
  • Code execution with MCP: building more efficient AI agents

    Published Nov 04, 2025 Direct tool calls consume context for each definition and result. Agents scale better by writing code to call tools instead. Here's how it works with MCP. The Model Context Protocol (MCP) is an open standard for connecting AI agents to external systems. Connecting agents to tools and data traditionally requires a custom integration for each pairing, creating fragmentation an

    efcl 2025/11/08
    On how having the agent execute code consumes less context than calling MCP tools directly.