pascal256's Bookmarks - Hatena Bookmark

Stop “vibe testing” your LLMs. It's time for real evals. - Google Developers Blog
If you’re building with LLMs, you know the drill. You tweak a prompt, run it a few times, and... the output feels better. But is it actually better? You're not sure. So you keep tweaking, caught in a loop of “vibe testing” that feels more like art than engineering. This uncertainty exists for a simple reason: unlike traditional software, AI
pascal256 2025/08/28
How it’s Made: Interacting with Gemini through multimodal prompting - Google Developers Blog
Let’s try an experiment. We’ll show this picture to our multimodal model Gemini and ask it to describe what it sees: "Tell me what you see." Gemini: "I see a person's right hand. The hand is open with the fingers spread apart."
pascal256 2023/12/11