並び順

ブックマーク数

期間指定

  • から
  • まで

1 - 1 件 / 1件

新着順 人気順

measuringの検索結果1 - 1 件 / 1件

  • GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

    Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning cap

    1