Search results for "computer use agents github": 1 - 40 of 108

  • One month after adopting Devin: looking back at how humans and AI should divide development roles - Generative Agents Tech Blog

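    Hello, this is Nishimi from Generative Agents. Devin, which shot to fame on the strength of its billing as a "fully autonomous AI engineer" and its striking teaser video, reached GA on December 10, 2024. www.cognition.ai Some time has passed since then, and reports of Japanese companies adopting Devin have started to trickle in on X, with posts such as "It worked for this kind of task 😆" and "It keeps getting stuck on simple tasks and is unusable, what a waste of money 😭" becoming common. To be honest, I had been thinking that 500 dollars a month is expensive, but our company is no exception in being so short-handed that we would borrow even a cat's paw, as the saying goes, so we adopted Devin (our cat's paw) on January 22, 2025. Exactly one month has passed since then, so I would like to look back and report on how our development situation has changed. GitHub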

    • Claude 3.7 Sonnet and Claude Code

      Today, we’re announcing Claude 3.7 Sonnet, our most intelligent model to date and the first hybrid reasoning model on the market. Claude 3.7 Sonnet can produce near-instant responses or extended, step-by-step thinking that is made visible to the user. API users also have fine-grained control over how long the model can think for. Claude 3.7 Sonnet shows particularly strong improvements in coding

      • Introducing Claude Opus 4.5

        Our newest model, Claude Opus 4.5, is available today. It’s intelligent, efficient, and the best model in the world for coding, agents, and computer use. It’s also meaningfully better at everyday tasks like deep research and working with slides and spreadsheets. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done. Claude Opus 4.5 is state-of-

        • Roundup of new announcements at Microsoft Build 2025 [30 picks]

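          Nice to meet you, I'm Masumi! I am the CEO of Galirage Inc., an IT company that specializes in generative AI and provides system development, advisory support, and training support. In this article, I will summarize the announcements from Microsoft Build 2025 🎉 If you are attending in person, I would be happy if you said hello should you spot me at the venue. By the way, this is what the atmosphere at the keynote venue was like! Introduction First, CEO Satya Nadella presented a vision called "Building the open agentic web"! This phrase is a key theme of Build 2025 and ties into the latest announcements that follow. In addition, the subsequent announcements are organized into the following Developer tools and the next four layers. A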

          • Code Interpreter API

            Editor's Note: This is another installation of our guest blog posts highlighting interesting and novel use cases. This blog is written by Shroominic who built an open source implementation of the ChatGPT Code Interpreter. Important Links: GitHub Repo. In the world of open-source software, there are always exciting developments. Today, I am thrilled to announce a new project that I have been working

            • 16 recommended papers on generative AI agents from 2024 - 襖からキリン

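              Hello! I'm Ota (https://x.com/ottamm_190), who devoted a year to AI agents. This is the fourth in my year-end series of agent articles. Part 1 → Where AI agents stand today, as seen through Weekly AI Agent News! - 襖からキリン Part 2 → The current state of the AI agent business and thoughts on its future - 襖からキリン Part 3 → Let's find the business problems where generative AI agents hit home! - 襖からキリン I will introduce the papers I personally learned from while keeping this year's Weekly AI Agents News! updated. In particular, I cover a broad selection of 16 papers that should be instructive for business and engineering readers rather than researchers. I couldn't trim it down to a nice round 15. Yes. The target readers are the middle group. Please try reading them with your generative AI of choice over the year-end holidays. I have also included sample questions. (I have not checked the generated results, but I recall asking something like them at the time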

              • GitHub - modelcontextprotocol/servers: Model Context Protocol Servers

                Official integrations are maintained by companies building production ready MCP servers for their platforms. 21st.dev Magic - Create crafted UI components inspired by the best 21st.dev design engineers. 2slides - An MCP server that provides tools to convert content into slides/PPT/presentation or generate slides/PPT/presentation with user intention. ActionKit by Paragon - Connect to 130+ SaaS inte

                • GitHub - bregman-arie/devops-exercises: Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

                  In general, what do you need in order to communicate? A common language (for the two ends to understand) A way to address who you want to communicate with A Connection (so the content of the communication can reach the recipients) What is TCP/IP? A set of protocols that define how two or more devices can communicate with each other. To learn more about TCP/IP, read here What is Ethernet? Ethernet

                  • Preventing AI bankruptcy - Risks of and countermeasures against Economic DoS in LLM API usage - GMO Flatt Security Blog

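                    Introduction Hello, this is Matsui (@ryotaromosao), a security engineer at GMO Flatt Security Inc. In recent years, LLMs (large language models) have made remarkable progress, and the number of LLM applications built on them is growing rapidly. In particular, you have probably often seen AI chat and agent features being added to existing services. However, for businesses that offer applications built on LLM APIs, the worry "What if we get billed a huge amount in API usage fees?" is likely a big one. I also use Takumi, our in-house security assessment AI agent, for vulnerability assessments and research, and I am always nervous about its LLM API usage fees. This is a story from when Takumi was still under development and had not yet been optimized, but while researching a vulnerability, "this repo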

                    • The End of Programming – Communications of the ACM

                      The end of classical computer science is coming, and most of us are dinosaurs waiting for the meteor to hit. I came of age in the 1980s, programming personal computers such as the Commodore VIC-20 and Apple ][e at home. Going on to study computer science (CS) in college and ultimately getting a Ph.D. at Berkeley, the bulk of my professional training was rooted in what I will call “classical” CS: p

                      • Build and deploy Remote Model Context Protocol (MCP) servers to Cloudflare

                        2025-03-25 It feels like almost everyone building AI applications and agents is talking about the Model Context Protocol (MCP), as well as building MCP servers that you install and run locally on your own computer. You can now build and deploy remote MCP servers to Cloudflare. We’ve added four things to Cloudflare that handl

                        • Claude Skills are awesome, maybe a bigger deal than MCP

                          16th October 2025 Anthropic this morning introduced Claude Skills, a new pattern for making new abilities available to their models: Claude can now use Skills to improve how it performs specific tasks. Skills are folders that include instructions, scripts, and resources that Claude can load when needed. Claude will only access a skill when it

                          • What We Learned from a Year of Building with LLMs (Part I)

                            It’s an exciting time to build with large language models (LLMs). Over the past year, LLMs have become “good enough” for real-world applications. The pace of improvements in LLMs, coupled with a parade of demos on social media, will fuel an estimated $200B investment in AI by 2025. LLMs are also broadly accessible, allowing everyone, not just ML engineers and scientists, to build intelligence into

                            • 2025: The year in LLMs

                              31st December 2025 This is the third in my annual series reviewing everything that happened in the LLM space over the past 12 months. For previous years see Stuff we figured out about AI in 2023 and Things we learned about LLMs in 2024. It’s been a year filled with a lot of different trends. The year of “reasoning” The year of agents The year of coding agents and Claude Code The year of LLMs on th

                              • The Web in the age of AI agents: a second responsive design is now beginning - Nothing ventured, nothing gained.

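                                You open your browser and ask an AI agent, "Find the quietest noise-canceling earphones and arrange for them to arrive tomorrow." The agent visits several e-commerce sites, compares reviews, adds the item to the cart, sets the delivery options, and then, at the payment screen, replies, "Please confirm from here on." What is happening on the other side of the browser at that moment? Is the agent looking at pixels, interpreting HTML, or using an "entrance for agents" provided by the site? How the Web will change in the age of AI agents is a topic I have long been curious about. When I recently sat down to research it properly, I found that discussion and implementation had progressed further than I expected. In this article, within the scope of what I have learned, I introduce two standard technologies that are now prompting changes to the Web's architecture, WebMCP and NLWeb, including code snippets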

                                • Things we learned about LLMs in 2024

                                  31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past twelve months, plus my attempt at identifying key themes and pivotal moments. This is a sequel to my review of 2023. In this article: The GPT-4 barrier was comprehensively broken Some of those GPT-4 models run on my laptop LLM pri

                                  • Open challenges in LLM research

                                    [LinkedIn discussion, Twitter thread] Never before in my life had I seen so many smart people working on the same goal: making LLMs better. After talking to many people working in both industry and academia, I noticed the 10 major research directions that emerged. The first two directions, hallucinations and context learning, are probably the most talked about today. I’m the most excited about num

                                    • Wasm-agents: AI agents running in your browser

                                      One of the main barriers to a wider adoption and experimentation with open-source agents is the dependency on extra tools and frameworks that need to be installed before the agents can be run. In this post, we introduce the Wasm agents blueprint, aimed at showing how to write agents as HTML files, which can just be opened and run in a browser, without the need for any extra dependencies. This is s

                                      • Why I stopped using AI code editors · Luciano Nooijen

                                        TL;DR: I chose to make using AI a manual action, because I felt the slow loss of competence over time when I relied on it, and I recommend everyone to be cautious with making AI a key part of their workflow. In late 2022, I used AI tools for the first time, even before the first version of ChatGPT. In 2023, I started using AI-based tools in my development workflow. Initially, I was super impressed

                                        • Agents

                                          Intelligent agents are considered by many to be the ultimate goal of AI. The classic book by Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach (Prentice Hall, 1995), defines the field of AI research as “the study and design of rational agents.” The unprecedented capabilities of foundation models have opened the door to agentic applications that were previously unimaginabl

                                          • The current state of the AI agent business and thoughts on its future - 襖からキリン

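                                            Hello! This is the second of my year-end articles, a business-focused piece on AI agents. I explain where agents stand today so that it can serve as a reference for those about to get started with agents. The first article has already been published. Where AI agents stand today, as seen through Weekly AI Agent News! - 襖からキリン My Weekly AI Agent News! and my repository of papers are available here. speakerdeck.com github.com Who are the people working on AI agents? The state of AI agents at companies A look at today's leading agent products Agent builders Research and inquiry handling Data-driven decision support Creating documents from various sources Agentic Process Automation Thinking about agents going forward The link between generative AI agents and business software grows strong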

                                            • From Coder to Orchestrator: The future of software engineering with AI - Human Who Codes

                                              The software engineering industry is undergoing a major AI-driven transition in how we work. The days when humans needed to write every line of code are already behind us as LLMs become more capable and reliable. The improvement in code output during 2025 alone has been astounding. I’ve personally watched LLMs struggle with certain problems, then a few months later, solve them completely and effic

                                              • GitHub - gptme/gptme: Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

                                                Coming soon - gptme.ai service for running agents in the cloud; gptme desktop app for easy local use. 2026-01 - gptme-agent-template v0.4: Bob reaches 1700+ autonomous sessions, autonomous run loops, enhanced context generation 2025-12 - v0.31.0: Background jobs, form tool, cost tracking, content-addressable storage 2025-11 - v0.30.0: Plugin system, context compression, subagent planner mode 2025-

                                                • GitHub - trycua/cua: Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

                                                  • Opus 4.5 is going to change everything

                                                    Edit: A lot of folks have been asking what workflows I used to write these apps. I used GitHub Copilot in VS Code with a custom agent prompt that you’ll find toward the end of this post. Context7 was the only MCP I used. I mostly just used the built-in voice dictation feature and talked to Claude. No fancy workflows, planning, etc required. The agent harness in VS Code for Opus 4.5 is so good - yo

                                                    • Ten Years, Starting Again: My Journey with TiDB

                                                      The most precious things in life are memories and reflection. After we released the next generation TiDB Cloud, I think it is time for some reflection. Time flies — ten years have passed. On April 1, 2015, Max asked me, very seriously on April Fools’ Day, “Do you want to start a company together?” From that moment, I jumped on the TiDB train. The ride has been bumpy and brilliant. In these ten yea

                                                      • claude-cycles.dvi

                                                        Claude’s Cycles Don Knuth, Stanford Computer Science Department (28 February 2026; revised 06 March 2026) Shock! Shock! I learned yesterday that an open problem I’d been working on for several weeks had just been solved by Claude Opus 4.6—Anthropic’s hybrid reasoning model that had been released three weeks earlier! It seems that I’ll have to revise my opinions about “generative AI” one of these d

                                                        • The Death of the Stubborn Developer

                                                          I wrote a blog post back in May called The Death of the Junior Developer. It made people mad. My thesis has since been corroborated by a bunch of big companies, and it is also happening in other industries, not just software. It is a real, actual problem, despite being quite inconvenient for almost everyone involved. My beehive-kicking post’s main premise is pre

                                                          • Patterns for Building LLM-based Systems & Products

                                                            “There is a large class of problems that are easy to imagine and build demos for, but extremely hard to make products out of. For example, self-driving: It’s easy to demo a car self-driving around a block, but making it into a product takes a decade.”

                                                            • Claude Code is the Inflection Point

                                                              4% of GitHub public commits are being authored by Claude Code right now. At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026. While you blinked, AI consumed all of software development. Our sister publication Fabricated Knowledge described software like linear TV during the rise of the internet and thinks that the rise of Claude Code is going

                                                              • The economic potential of generative AI: The next productivity frontier

                                                                Generative AI is poised to unleash the next wave of productivity. We take a first look at where business value could accrue and the potential impacts on the workforce. AI has permeated our lives incrementally, through everything from the tech powering our smartphones to autonomous-driving features on cars to the tools retailer

                                                                • Microservices Are a Tax Your Startup Probably Can’t Afford

                                                                  Let’s unpack why microservices often backfire early on, where they genuinely help, and how to structure your startup’s systems for speed and survival. Monoliths Are Not the Enemy If you’re building some SaaS product, even a simple SQL database wrapper eventually may bring a lot of internal complexity in the way your business logic works; additionally, you can get to various integrations and backgro

                                                                  • Letter to Arc members 2025

                                                                    Untitled (to a man, George McGovern) 2, Dan Flavin. Dia Beacon, 2024. Dear Arc members, You’re probably wondering what happened. One day we were all-in on Arc. Then, seemingly out of nowhere, we started building something new: Dia. From the outside, this pivot might look abrupt. Arc had real momentum. People loved it. But inside, the decision was slower and more deliberate than it may seem. So I wan

                                                                    • Building agents with the Claude Agent SDK

                                                                      Published Sep 29, 2025 The Claude Agent SDK is a collection of tools that helps developers build powerful agents on top of Claude Code. In this article, we walk through how to get started and share our best practices. Last year, we shared lessons in building effective agents alongside our customers. Since then, we've released Claude Code, an agentic coding solution that we originally built to supp

                                                                      • Real-world gen AI use cases from the world's leading organizations | Google Cloud Blog

                                                                        AI is here, AI is everywhere: Top companies, governments, researchers, and startups are already enhancing their work with Google's AI solutions. Published April 12, 2024; last updated October 9, 2025. Automotive & Logistics Business & Professional Services Financial Services Healthcare & Life Sciences Hospitality & Travel Manufacturing, Industrial & Electronics Media, Marketing & Gaming Public Sec

                                                                        • Agents have their own computers with Sandboxes GA

                                                                          When we launched Cloudflare Sandboxes last June, the premise was simple: AI agents need to develop and run code, and they need to do it somewhere safe. If an agent is acting like a developer, this means cloning repositories, building code in many languages, running development servers, etc. To do these things effectively, they will often need a full computer (and if they don’t, they can reach for

                                                                          • The Next Two Years of Software Engineering

                                                                            January 5, 2026 The software industry sits at a strange inflection point. AI coding has evolved from autocomplete on steroids to agents that can autonomously execute development tasks. The economic boom that fueled tech’s hiring spree has given way to an efficiency mandate: companies now often favor profitability over growth, experienced hires over fresh graduates, and smaller teams armed with bet

                                                                            • GitHub - bytedance/UI-TARS-desktop: The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

                                                                              [2025-11-05] 🎉 We're excited to announce the release of Agent TARS CLI v0.3.0! This version brings streaming support for multiple tools (shell commands, multi-file structured display), runtime settings with timing statistics for tool calls and deep thinking, Event Stream Viewer for data flow tracking and debugging. Additionally, it features exclusive support for AIO agent Sandbox as isolated all-

                                                                              • How Claude Code is built

                                                                                Claude Code has taken the developer world by storm since being made generally available in May. The tool is currently generating more than $500M in annual run-rate revenue, and usage has exploded by more than 10x in the three months since that May release. I recently sat down with two of the founding engineers behind Claude Code: Boris Cherny (the engineer who came up with the original prototype,

                                                                                • Using Amazon Bedrock Agents to interactively generate infrastructure as code | Amazon Web Services

                                                                                  In the diverse toolkit available for deploying cloud infrastructure, Amazon Bedrock Agents offers a practical and innovative option for teams looking to enhance their infrastructure as code (IaC) processes. Amazon Bedrock Agents automates the prompt engineering and orchestration of user-requested
