[B! agent] arrowKato's bookmarks

Otto

Otto is the tool built for doing Work with AI. Skip the chat bot, and bring reasoning to your data. Define your table once, and automate thousands of tasks in minutes.

arrowKato 2024/07/18

A tool for extracting insights from tabular data.

agent


SWE-bench Lite: A Canonical Subset for Efficient Evaluation of Language Models as Software Engineers. Carlos E. Jimenez, John Yang, Jiayi Geng. March 19, 2024. SWE-bench was designed to provide a diverse set of codebase problems that were verifiable using in-repo unit tests. The full SWE-bench test split comprises 2,294 issue-commit pairs across 12 Python repositories. Since its release, we've found t

arrowKato 2024/07/18

A leaderboard for Devin-like agents.

Agent


Bedrock Claude Night (JAWS-UG AI/ML Chapter × Tokyo Chapter collaboration) - Materials list - connpass

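Ended. Mon, 2024/04/22, from 19:00. Bedrock Claude Night (JAWS-UG AI/ML Chapter × Tokyo Chapter collaboration). The Anthropic team presents by video! Have fun learning about the much-talked-about Claude on AWS. TakeshiFukae and others. Meguro Central Square 17F, 3-1-1 Kamiosaki, Shinagawa-ku, Tokyo.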

arrowKato 2024/05/08

A roundup of the materials from this event.


Come on, Claude 3! Model comparison, or rather tuning, for Agents for Amazon Bedrock

arrowKato 2024/05/08

I share the "Come on, Claude 3" sentiment.

Agent
AWS


AgentBench: Evaluating LLMs as Agents

Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Age

arrowKato 2024/04/26

A benchmark paper.

LLM
Agent


GitHub - GoogleCloudPlatform/genai-databases-retrieval-app

arrowKato 2024/04/18

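It goes as far as separating responsibilities between the app and the API server, so this architecture may be a useful reference if you are building a RAG app of medium scale or larger.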


LangGraph: Multi-Agent Workflows

Links: Python Examples, JS Examples, YouTube. Last week we highlighted LangGraph - a new package (available in both Python and JS) to better enable creation of LLM workflows containing cycles, which are a critical component of most agent runtimes. As a part of the launch, we highlighted two simple runtimes: one that is the equivalent of the AgentExecutor in langchain, and a second that was a version of t

arrowKato 2024/04/15

LangChain and CrewAI apparently have a very cozy relationship.


GitHub - joaomdmoura/crewAI-tools

arrowKato 2024/04/15

agent


Reflexion: Language Agents with Verbal Reinforcement Learning

Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains challenging for these language agents to quickly and efficiently learn from trial-and-error as traditional reinforcement learning methods require extensive training samples and expensive model fine-tuning. We propose Reflexion, a

arrowKato 2024/04/14

reflection pattern

agent


Answer It Yourself, Critique It Yourself! What Is a Reflection Agent?

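This is reportedly a technique that improves an agent by having it reflect on its past actions. Results are said to have been observed across a wide range of tasks, from program synthesis to multi-step reasoning. There are three patterns, from the simplest to the most complex: Simple Reflection, Reflexion (spelled differently from the above), and Language Agents Tree Search. Simple Reflection: the simplest reflection agent. It has two LLM calls, a generator and a reflector. The generator produces an answer. The reflector acts as a teacher and gives constructive criticism of that answer. This is repeated a fixed number of times, and only the final answer is output. Figure: the simplest example. The "reflector brain" evaluates the answer produced by the "generator brain" by listing criticisms, merits, and suggestions. Prompt: revise the previous answer using the new information. - The previous critique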

arrowKato 2024/04/14

reflection pattern

agent
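
The Simple Reflection loop described in the excerpt above (a generator answers, a reflector critiques, repeat a fixed number of times, output only the last answer) could be sketched roughly as follows. This is a minimal sketch, not the article's implementation: the function name `simple_reflection`, the callable signatures, and the stand-in `fake_generate` / `fake_reflect` functions are all assumptions made for illustration, and a real version would wrap LLM calls in their place.

```python
from typing import Callable, Optional

# Hypothetical signatures (assumptions): each callable would wrap one LLM call.
Generator = Callable[[str, Optional[str], Optional[str]], str]
Reflector = Callable[[str, str], str]


def simple_reflection(task: str, generate: Generator, reflect: Reflector,
                      rounds: int = 2) -> str:
    """Alternate generate -> reflect for a fixed number of rounds.

    Only the final answer is returned; intermediate drafts are discarded.
    """
    answer = generate(task, None, None)  # first draft, no critique yet
    for _ in range(rounds):
        critique = reflect(task, answer)           # teacher-style feedback
        answer = generate(task, answer, critique)  # revise using the critique
    return answer


# Stand-in functions so the sketch runs without any LLM API.
def fake_generate(task: str, previous: Optional[str], critique: Optional[str]) -> str:
    if previous is None:
        return f"draft answer to: {task}"
    return f"{previous} [revised after: {critique}]"


def fake_reflect(task: str, answer: str) -> str:
    return f"critique of a {len(answer)}-character answer"


if __name__ == "__main__":
    print(simple_reflection("Explain reflection agents", fake_generate, fake_reflect))
```

With `rounds=2` the generator is called three times and the reflector twice, which matches the "repeat a fixed number of times, then output only the last answer" description.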


Four AI Agent Strategies That Improve GPT-4 and GPT-3.5 Performance

Agentic Design Patterns Part 1 Four AI agent strategies that improve GPT-4 and GPT-3.5 performance Dear friends, I think AI agent workflows will drive massive AI progress this year — perhaps even more than the next generation of foundation models. This is an important trend, and I urge everyone who works in AI to pay attention to it. Today, we mostly use LLMs in zero-shot mode, prompting a model t

arrowKato 2024/04/12

Professor Andrew's discussion of agent workflows, part 1.

agent


GitHub - geekan/MetaGPT: 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

🚀 Mar. 29, 2024: v0.8.0 released. Now you can use Data Interpreter via pypi package import. Meanwhile, we integrated RAG module and supported multiple new LLMs. 🚀 Mar. 14, 2024: Our Data Interpreter paper is on arxiv. Check the example and code! 🚀 Feb. 08, 2024: v0.7.0 released, supporting assigning different LLMs to different Roles. We also introduced Data Interpreter, a powerful agent capable

arrowKato 2024/04/11

An agent library with a large number of stars.

agent


GitHub - e2b-dev/awesome-ai-agents: A list of AI autonomous agents


arrowKato 2024/04/11

A roundup of AI agent libraries.

agent


GitHub - princeton-nlp/SWE-agent: SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.29% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.

We accomplish these results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this an Agent-Computer Interface (ACI) and build the SWE-agent repository to make it easy to iterate on ACI design for repository-level coding agents. Just like how typical language models require good prompt eng

arrowKato 2024/04/05

Apparently an agent that writes code based on an issue. Its performance is close to Devin's.

agent


crewAI - Platform for Multi AI Agents Systems

AI Agents reimagined for real use cases Most AI Agent frameworks are complex, but powerful. We provide the power with simplicity. We provide a platform and hope you create wonders.

arrowKato 2024/04/02


Intro to LLM Agents with Langchain: When RAG is Not Enough

Hello everyone, this article is a written form of a tutorial I conducted two weeks ago with Neurons Lab. If you prefer a narrative walkthrough, you can find the YouTube video here: As always, you can find the code on GitHub, and here are separate Colab Notebooks: Planning and reasoning; Different types of memories; Various types of tools; Building complete agents. Introduction to the agents. Illustration b

arrowKato 2024/03/29

An introduction to LLM agents using LangChain: when RAG is not enough. It reads more as a guide to trying out adding an agent than as a way to substantially boost performance.

RAG
Agent


What's next for AI agentic workflows ft. Andrew Ng of AI Fund

arrowKato 2024/03/29

The slide at around 11:00 listing 1. Reflection, 2. Tool use, 3. Planning, 4. Multi-agent collaboration is a must-see. Patterns 1 and 2 are the well-established ones, while 3 and 4 are the ones developing right now.

LLM
agent


Learn It Now: An Introduction to LLM-Based AI Agents - Basic Mechanisms / Development Tools / Introduction to Well-Known OSS and Papers

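"AI agents" have become one of the hot topics as an application of large language models (LLMs). Given a goal, an AI agent autonomously decides what to do and acts on it. For example, it may search the web for information as needed to answer a question, or implement a program through trial and error. As of February 2024, OpenAI's Assistants API and GPTs, Agents for Amazon Bedrock, LangGraph, and others have been released, and the ecosystem for developing AI agents is growing rapidly. Against this backdrop, this study session, titled "Learn It Now: An Introduction to LLM-Based AI Agents," explains the basics of LLM-based AI agents. The basic mechanisms of LLM-based AI agents (such as MRKL and ReAct), various development tools, and AI agents implemented in well-known OSS and papers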

arrowKato 2024/03/01

Agents may be what comes after RAG. This looks like good material to browse when that happens. I personally like that it even includes survey papers.


GitHub - microsoft/autogen: A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

📚 Cite paper. 🔥 Mar 26: Andrew Ng gave a shoutout to AutoGen in What's next for AI agentic workflows at Sequoia Capital's AI Ascent. 🔥 Mar 3: What's new in AutoGen? 📰Blog; 📺Youtube. 🔥 Mar 1: the first AutoGen multi-agent experiment on the challenging GAIA benchmark achieved the No. 1 accuracy in all the three levels. 🎉 Jan 30: AutoGen is highlighted by Peter Lee in Microsoft Research Forum

arrowKato 2024/02/06

The one where you assign roles to LLMs and have them talk to each other.

LLM
agent

arrowKato's bookmarks on agent (19)
