
Search results for "probability distribution function example": 1–40 of 63

  • GPT in 60 Lines of NumPy | Jay Mody

    January 30, 2023 In this post, we'll implement a GPT from scratch in just 60 lines of numpy. We'll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text. Note: This post assumes familiarity with Python, NumPy, and some basic experience with neural networks. This implementation is for educational purposes, so it's missing lots of features/improv
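
    For a taste of what those 60 lines contain, here is the attention block sketched independently in NumPy (a minimal sketch, not Jay Mody's actual code):

      import numpy as np

      def softmax(x):
          e = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
          return e / e.sum(axis=-1, keepdims=True)

      def attention(q, k, v):
          """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
          return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

      q = k = v = np.random.randn(5, 16)   # 5 tokens, 16-dim
      print(attention(q, k, v).shape)      # (5, 16)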

  • Why I no longer recommend Julia

    For many years I used the Julia programming language for transforming, cleaning, analyzing, and visualizing data, doing statistics, and performing simulations. I published a handful of open-source packages for things like signed distance fields, nearest-neighbor search, and Turing patterns (among others), made visual explanations of Julia concepts like broadcasting and arrays, and used Julia to ma

  • SARS-CoV-2 is associated with changes in brain structure in UK Biobank - Nature

    The global pandemic of SARS-CoV-2 has now claimed millions of lives across the world. There has been an increased focus by the scientific and medical community on the effects of mild-to-moderate COVID-19 in the longer term. There is strong evidence for brain-related pathologies, some of which could be a consequence of viral neurotropism [1,2,14] or virus-induced neuroinflammation [3,4,5,15], including t

  • What We Learned from a Year of Building with LLMs (Part I)

    It’s an exciting time to build with large language models (LLMs). Over the past year, LLMs have become “good enough” for real-world applications. The pace of improvements in LLMs, coupled with a parade of demos on social media, will fuel an estimated $200B investment in AI by 2025. LLMs are also broadly accessible, allowing everyone, not just ML engineers and scientists, to build intelligence into

  • Optimizing your LLM in production

    Note: This blog post is also available as a documentation page on Transformers. Large Language Models (LLMs) such as GPT3/4, Falcon, and Llama are rapidly advancing in their ability to tackle human-centric tasks, establishing themselves as essential tools in modern knowledge-based industries. Deploying these models in real-world tasks remains challenging, however: To exhibit near-human text unders

  • The Roadmap of Mathematics for Machine Learning

    Understanding math will make you a better engineer. So, I am writing the best and most comprehensive book about it. Knowing the mathematics behind machine learning algorithms is a superpower. If you have ever built a model for a real-life problem, you probably experienced that familiarity with the details goes a long way if you want to move beyond baseline performance. This is especi

  • Prompt Engineering

    Date: March 15, 2023 | Estimated Reading Time: 21 min | Author: Lilian Weng Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with an LLM to steer its behavior toward desired outcomes without updating the model weights. It is an empirical science, and the effect of prompt engineering methods can vary a lot among models, thus requiring heavy experimentation a
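
    As a concrete illustration of in-context prompting: a few-shot prompt steers the model purely through examples placed in the context, with no weight updates. A minimal sketch, where complete() is a hypothetical stand-in for any text-completion API:

      # The task (sentiment labeling) is demonstrated, never described.
      prompt = (
          "great -> positive\n"
          "awful -> negative\n"
          "fine -> positive\n"
          "dreadful -> "
      )
      answer = complete(prompt)  # complete() is hypothetical; expected output: "negative"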

  • microgpt

    This is a brief guide to my new art project microgpt, a single file of 200 lines of pure Python with no dependencies that trains and inferences a GPT. This file contains the full algorithmic content of what is needed: dataset of documents, tokenizer, autograd engine, a GPT-2-like neural network architecture, the Adam optimizer, training loop, and inference loop. Everything else is just efficiency.
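
    Of those ingredients, the autograd engine is the one that surprises people by how small it can be. A minimal sketch of the standard scalar-autograd pattern (in the spirit of Karpathy's earlier micrograd, not microgpt's actual code):

      class Value:
          """A scalar that remembers how it was computed, for backprop."""
          def __init__(self, data, children=(), grad_fn=None):
              self.data, self.grad = data, 0.0
              self._children, self._grad_fn = children, grad_fn

          def __mul__(self, other):
              out = Value(self.data * other.data, (self, other))
              def grad_fn():
                  self.grad  += other.data * out.grad   # d(ab)/da = b
                  other.grad += self.data  * out.grad   # d(ab)/db = a
              out._grad_fn = grad_fn
              return out

          def backward(self):
              # Build reverse topological order, then apply the chain rule.
              topo, seen = [], set()
              def build(v):
                  if v not in seen:
                      seen.add(v)
                      for c in v._children:
                          build(c)
                      topo.append(v)
              build(self)
              self.grad = 1.0
              for v in reversed(topo):
                  if v._grad_fn:
                      v._grad_fn()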

  • Patterns for Building LLM-based Systems & Products

    Patterns for Building LLM-based Systems & Products [ llm engineering production 🔥 ] · 66 min read Discussions on HackerNews, Twitter, and LinkedIn “There is a large class of problems that are easy to imagine and build demos for, but extremely hard to make products out of. For example, self-driving: It’s easy to demo a car self-driving around a block, but making it into a product takes a decade.”

  • Deep Learning for AI – Communications of the ACM

    How can neural networks learn the rich internal representations required for difficult tasks such as recognizing objects or understanding language? Yoshua Bengio, Yann LeCun, and Geoffrey Hinton are recipients of the 2018 ACM A.M. Turing Award for breakthroughs that have made deep neural networks a critical component of computing. Research on artificial neural networks was motivated by the observa

  • Illustrating Reinforcement Learning from Human Feedback (RLHF)

    This article has been translated to Chinese 简体中文 and Vietnamese đọc tiếng việt. Language models have shown impressive capabilities in the past few years by generating diverse and compelling text from human input prompts. However, what makes a "good" text is inherently hard to define as it is subjective and context dependent. There are many applications such as writing stories where you want creati
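
    The optimization at the heart of RLHF, stated here in the standard formulation (the article's own notation may differ): tune the policy to maximize a learned reward while a KL penalty keeps it close to the original model,

    \[ \max_{\pi_\theta} \; \mathbb{E}_{x \sim D,\; y \sim \pi_\theta(\cdot\mid x)}\big[r_\phi(x,y)\big] \;-\; \beta\, \mathrm{KL}\big(\pi_\theta(\cdot\mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot\mid x)\big) \]

    where r_phi is the reward model fit to human preference comparisons, pi_ref is the pre-RLHF model, and beta trades reward against drift.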

  • How a simple Linux kernel memory corruption bug can lead to complete system compromise

    In this case, reallocating the object as one of those three types didn't seem to me like a nice way forward (although it should be possible to exploit this somehow with some effort, e.g. by using count.counter to corrupt the buf field of seq_file). Also, some systems might be using the slab_nomerge kernel command line flag, which disables this merging behavior. Another approach that I didn't look

  • LLM Powered Autonomous Agents

    Date: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng Building agents with an LLM (large language model) as the core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potential of an LLM extends beyond generating well-written copy, stories, essays and programs; it can be framed as a powerfu

  • GitHub - diff-usion/Awesome-Diffusion-Models: A collection of resources and papers on Diffusion Models

    DiffEnc: Variational Diffusion with a Learned Encoder Beatrix M. G. Nielsen, Anders Christensen, Andrea Dittadi, Ole Winther arXiv 2023. [Paper] 30 Oct 2023 Upgrading VAE Training With Unlimited Data Plans Provided by Diffusion Models Tim Z. Xiao, Johannes Zenn, Robert Bamler arXiv 2023. [Paper] 30 Oct 2023 Successfully Applying Lottery Ticket Hypothesis to Diffusion Model Chao Jiang, Bo Hui, Boha

  • Solving Quantitative Reasoning Problems With Language Models

    Solving Quantitative Reasoning Problems with Language Models Aitor Lewkowycz∗, Anders Andreassen†, David Dohan†, Ethan Dyer†, Henryk Michalewski†, Vinay Ramasesh†, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur∗, Guy Gur-Ari∗, and Vedant Misra∗ Google Research Abstract Language models have achieved remarkable performance on a wide range of tasks that require

  • Attention Is Off By One

    By Evan Miller July 24, 2023 About which one cannot speak, one must pass over in silence. –Wittgenstein Do you see the off-by-one error in this formula? \[ \textrm{Attention}(Q, K, V) = \textrm{softmax}\left(\frac{QK^T}{\sqrt{d}}\right)V \] The attention formula is the central equation of modern AI, but there’s a bug in it that has been driving me nuts the last week. I tried writing a serious-look
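
    For reference, the fix Miller proposes in the post adds 1 to the softmax denominator, so an attention head can assign near-zero weight everywhere and effectively abstain:

    \[ \big(\textrm{softmax}_1(x)\big)_i = \frac{e^{x_i}}{1 + \sum_j e^{x_j}} \]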

  • Andrej Karpathy — AGI is still a decade away

    The Andrej Karpathy episode. Andrej explains why reinforcement learning is terrible (but everything else is much worse), why model collapse prevents LLMs from learning the way humans do, why AGI will just blend into the previous ~2.5 centuries of 2% GDP growth, why self driving took so long to crack, and what he sees as the future of education. Watch on YouTube; listen on Apple Podcasts or Spotify

  • Blog

    Hachi: An (Image) Search engine "Only the dead have seen the end of war." – George Santayana For quite some time now, I have been working on and off on a fully self-hosted search engine, in the hope of making it easier to search across personal data in an end-to-end manner. Even as individuals, we are hoarding and generating more and more data with no end in sight. Such "personal" data is being stored fro

  • Thinking Fast and Slow - Replicability-Index

    2011 was an important year in the history of psychology, especially social psychology. First, it became apparent that one social psychologist had faked results for dozens of publications (https://en.wikipedia.org/wiki/Diederik_Stapel). Second, a highly respected journal published an article with the incredible claim that humans can foresee random events in the future, if they are presented without

  • Generative Modeling by Estimating Gradients of the Data Distribution | Yang Song

    Introduction Existing generative modeling techniques can largely be grouped into two categories based on how they represent probability distributions: likelihood-based models, which directly learn the distribution’s probability density (or mass) function via (approximate) maximum likelihood. Typical likelihood-based models include autoregressive models, normalizing flow models, energy-based mode
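
    The other category the post builds toward models the score, the gradient of the log-density, rather than the density itself; samples are then drawn with Langevin dynamics. In the standard formulation:

    \[ s_\theta(x) \approx \nabla_x \log p(x), \qquad x_{t+1} = x_t + \frac{\epsilon}{2}\, s_\theta(x_t) + \sqrt{\epsilon}\, z_t, \quad z_t \sim \mathcal{N}(0, I) \]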

  • “Death of a Salesforce”: Why AI Will Transform the Next Generation of Sales Tech | Andreessen Horowitz

    “Death of a Salesforce”: Why AI Will Transform the Next Generation of Sales Tech The battle between every startup and incumbent comes down to whether the startup gets distribution before the incumbent gets innovation. In sales tech, it’s easy to assume incumbents like Salesforce and Hubspot have the edge. First, they are embedded as “systems of record,” so sales leaders are loath to rip them out a

  • RAPIDS Forest Inference Library: Prediction at 100 million rows per second

    Introduction Random forests (RF) and gradient-boosted decision trees (GBDTs) have become workhorse models of applied machine learning. XGBoost and LightGBM, popular packages implementing GBDT models, consistently rank among the most commonly used tools by data scientists on the Kaggle platform. We see similar interest in forest-based models in industry, where they are applied to problems ranging fr
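
    For context, FIL exposes this as a load-then-predict API in cuML. A hedged sketch of typical usage; the exact function and parameter names vary across cuML versions, so treat every name here as an assumption to check against the docs:

      # Assumed API, modeled on cuML's ForestInference; verify names per version.
      from cuml import ForestInference

      fil = ForestInference.load("xgb_model.bst",      # pre-trained XGBoost model file
                                 output_class=True,    # classification, not regression
                                 model_type="xgboost")
      preds = fil.predict(X)   # X: feature rows on the GPU, e.g. a cuDF DataFrame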

  • What We’ve Learned From A Year of Building with LLMs – Applied LLMs

    A practical guide to building successful LLM products, covering the tactical, operational, and strategic. It’s an exciting time to build with large language models (LLMs). Over the past year, LLMs have become “good enough” for real-world applications. And they’re getting better and cheaper every year. Coupled with a parade of demos on social media, there will be an estimated $200B investment in AI

  • A decade of major cache incidents at Twitter

    This was co-authored with Yao Yue This is a collection of information on severe (SEV-0 or SEV-1, the most severe incident classifications) incidents at Twitter that were at least partially attributed to cache from the time Twitter started using its current incident tracking JIRA (2012) to date (2022), with one bonus incident from before 2012. Not including the bonus incident, there were 6 SEV-0s a

  • AI Timelines via Cumulative Optimization Power: Less Long, More Short — LessWrong

    The general trend is clear: larger lifetime compute enables systems of greater generality and capability. Generality and performance are both independently expensive, as an efficient general system often ends up requiring combinations of many specialist subnetworks. BNNs and ANNs both implement effective approximations of Bayesian learning [29]. Net training compute then measures the total intra-li

  • 17 types of similarity and dissimilarity measures used in data science. | Towards Data Science

    The following article explains various methods for computing distances and showing their instances in our daily lives. Additionally, it… Various ML metrics. Inspired by Maarten Grootendorst. "There is no Royal Road to Geometry." – Euclid Quick note: Everything written and visualized has been created by the author unless otherwise specified. Illustrations and equations were generated using tools like
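
    Two of the most common measures from that family, sketched in plain Python with NumPy (names here are ours, not the article's):

      import numpy as np

      def euclidean(a, b):
          # Straight-line distance between two vectors.
          return np.sqrt(np.sum((a - b) ** 2))

      def cosine_similarity(a, b):
          # Angle-based similarity: 1 = same direction, 0 = orthogonal.
          return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

      a, b = np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])
      print(euclidean(a, b))            # 3.7416...
      print(cosine_similarity(a, b))    # 1.0 (parallel vectors)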

  • Migrating Critical Traffic At Scale with No Downtime — Part 2

    Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind these perfect moments of entertainment is a complex mechanism, with numerous gears and cogs working in harmony. But what happens when this machinery needs a transformation?

  • research!rsc: Transparent Telemetry for Open-Source Projects (Transparent Telemetry, Part 1)

    Russ Cox February 8, 2023 research.swtch.com/telemetry-intro How do software developers understand which parts of their software are being used and whether they are performing as expected? The modern answer is telemetry, which means software sending data to answer those questions back to a collection server. This post is about why I believe telemetry is important for open-source projects, and what

  • Understanding Convolutions on Graphs

    Many systems and interactions - social networks, molecules, organizations, citations, physical models, transactions - can be represented quite naturally as graphs. How can we reason about and make predictions within these systems? One idea is to look at tools that have worked well in other domains: neural networks have shown immense predictive power in a variety of learning tasks. However, neural
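
    The standard answer the post develops is the graph convolution; in its simplest (GCN) form, each layer mixes every node's features with its neighbors' through a normalized adjacency matrix. A minimal NumPy sketch of that textbook formulation (our names, not the post's code):

      import numpy as np

      def gcn_layer(A, H, W):
          """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
          A_hat = A + np.eye(A.shape[0])            # add self-loops
          d = A_hat.sum(axis=1)                     # node degrees
          D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
          A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
          return np.maximum(0, A_norm @ H @ W)      # aggregate, transform, ReLU

      # Tiny 3-node path graph, 2-d features, 2 hidden units.
      A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
      H = np.random.randn(3, 2)
      W = np.random.randn(2, 2)
      print(gcn_layer(A, H, W).shape)  # (3, 2)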

  • FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision – PyTorch

    Attention, as a core layer of the ubiquitous Transformer architecture, is a bottleneck for large language models and long-context applications. FlashAttention (and FlashAttention-2) pioneered an approach to speed up attention on GPUs by minimizing memory reads/writes, and is now used by most libraries to accelerat
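
    The enabling trick behind the FlashAttention line of work is the online softmax: a softmax-weighted sum can be accumulated tile by tile with just a running max and running sum, so the full attention matrix is never materialized. A minimal NumPy sketch of the accumulation (illustrative only, nothing like the actual GPU kernel):

      import numpy as np

      def online_softmax_weighted_sum(scores, values, tile=4):
          """Compute softmax(scores) @ values one tile at a time."""
          m = -np.inf                      # running max of scores seen so far
          s = 0.0                          # running sum of exp(score - m)
          acc = np.zeros(values.shape[1])  # running weighted sum of values
          for i in range(0, len(scores), tile):
              sc, v = scores[i:i+tile], values[i:i+tile]
              m_new = max(m, sc.max())
              scale = np.exp(m - m_new)    # rescale old accumulators
              p = np.exp(sc - m_new)
              s = s * scale + p.sum()
              acc = acc * scale + p @ v
              m = m_new
          return acc / s

      scores, values = np.random.randn(10), np.random.randn(10, 3)
      w = np.exp(scores - scores.max()); exact = (w / w.sum()) @ values
      print(np.allclose(online_softmax_weighted_sum(scores, values), exact))  # True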

  • The Annotated Diffusion Model

    In this blog post, we'll take a deeper look into Denoising Diffusion Probabilistic Models (also known as DDPMs, diffusion models, score-based generative models or simply autoencoders) as researchers have been able to achieve remarkable results with them for (un)conditional image/audio/video generation. Popular examples (at the time of writing) include GLIDE and DALL-E 2 by OpenAI, Latent Diffusion
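
    For orientation, the forward (noising) process that a DDPM learns to invert has a closed form, so a clean sample x_0 can be jumped to any noise level t in one step (standard DDPM notation, with \bar{\alpha}_t the cumulative product of the noise schedule):

    \[ q(x_t \mid x_0) = \mathcal{N}\!\big(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\big), \qquad x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon, \quad \epsilon \sim \mathcal{N}(0, I) \]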

  • Keenadu the tablet conqueror and the links between major Android botnets

    In April 2025, we reported on a then-new iteration of the Triada backdoor that had compromised the firmware of counterfeit Android devices sold across major marketplaces. The malware was deployed to the system partitions and hooked into Zygote – the parent process for all Android apps – to infect any app on the device. This allowed the Trojan to exfiltrate credentials from messaging apps and socia

  • Aman's AI Journal • Primers • Ilya Sutskever's Top 30

    Ilya Sutskever’s Top 30 Reading List The First Law of Complexodynamics The Unreasonable Effectiveness of Recurrent Neural Networks Understanding LSTM Networks Recurrent Neural Network Regularization Keeping Neural Networks Simple by Minimizing the Description Length of the Weights Pointer Networks ImageNet Classification with Deep Convolutional Neural Networks Order Matters: Sequence to Sequence f

  • Llama from scratch (or how to implement a paper without crying)

    Llama from scratch I want to provide some tips from my experience implementing a paper. I'm going to cover my tips so far from implementing a dramatically scaled-down version of Llama for training TinyShakespeare. This post is heavily inspired by Karpathy's Makemore series, which I highly recommend. I'm only going to loosely follow the layout of their paper; while the formatting and order of sectio
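
    One concrete Llama ingredient worth getting right when implementing the paper is RMSNorm, which replaces LayerNorm. A minimal NumPy sketch of the standard formulation (our names, not the post's code):

      import numpy as np

      def rmsnorm(x, g, eps=1e-6):
          """RMSNorm: scale by the root-mean-square instead of mean/variance."""
          rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
          return x / rms * g   # g: learned per-feature gain

      x = np.random.randn(4, 8)     # (tokens, features)
      g = np.ones(8)                # gain initialized to 1
      print(rmsnorm(x, g).shape)    # (4, 8)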

  • How the RWKV language model works

    In this post, I will explain the details of how RWKV generates text. For a high level overview of what RWKV is and what is so special about it, check out the other post about RWKV. To explain exactly how RWKV works, I think it is easiest to look at a simple implementation of it. The following ~100 line code (based on RWKV in 150 lines) is a minimal implementation of a relatively small (430m parame
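
    The recurrence at the heart of RWKV is the WKV operator, an exponentially decaying weighted average over past values, stated here as in the RWKV paper (w is a per-channel decay, u a bonus for the current token):

    \[ wkv_t = \frac{\sum_{i=1}^{t-1} e^{-(t-1-i)w + k_i}\, v_i \;+\; e^{u + k_t}\, v_t}{\sum_{i=1}^{t-1} e^{-(t-1-i)w + k_i} \;+\; e^{u + k_t}} \]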

  • https://deeplearningtheory.com/PDLT.pdf

    The Principles of Deep Learning Theory: An Effective Theory Approach to Understanding Neural Networks. Daniel A. Roberts and Sho Yaida, based on research in collaboration with Boris Hanin (drob@mit.edu, shoyaida@fb.com). Contents: Preface; 0 Initialization; 0.1 An Effective Theory Approach; 0.2 The Theoretical Minimum

  • Large Text Compression Benchmark

    Large Text Compression Benchmark Matt Mahoney Last update: Mar. 25, 2026. This competition ranks lossless data compression programs by the compressed size (including the size of the decompression program) of the first 10^9 bytes of the XML text dump of the English version of Wikipedia on Mar. 3, 2006. About the test data. The goal of this benchmark is not to find the best overall compress

  • What's the difference between Transfer Learning (TL) and Fine Tuning? - ts0818's blog

    xtech.nikkei.com The domains where artificial intelligence (AI) outperforms humans keep expanding in ever more advanced and complex directions. In late October 2019, AI from the UK's DeepMind achieved a major result in competitive play of Blizzard Entertainment's (US) online strategy game StarCraft II, which drew attention in the West. That is because beating humans at an online strategy game is considered more important for real-world AI applications than winning at Go. "Why Google's AI beating humans at a 'competitive game' is more groundbreaking than its Go victory" | Nikkei xTECH ⇧ So even if extraterrestrial life were to invade Earth, maybe AI would defend us? The possibilities are exciting. Hello, it's me. Anyway, this post gets into things like "multilayer neural networks"
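
    The distinction the post chases can be shown in a few lines. A minimal PyTorch-style sketch (the ResNet backbone, 10-class head, and choice of layer4 are illustrative assumptions, not the post's code): transfer learning freezes the pretrained backbone and trains only a new head; fine-tuning also unfreezes some or all of the backbone.

      import torch.nn as nn
      from torchvision import models

      model = models.resnet18(weights="DEFAULT")   # pretrained backbone

      # Transfer learning: freeze everything, train only a new head.
      for p in model.parameters():
          p.requires_grad = False
      model.fc = nn.Linear(model.fc.in_features, 10)   # new head (trainable by default)

      # Fine-tuning: additionally unfreeze (some of) the pretrained weights.
      for p in model.layer4.parameters():
          p.requires_grad = True   # let the last block adapt to the new task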

  • What We Learned from a Year of Building with LLMs (Part II)

    A possibly apocryphal quote attributed to many leaders reads: “Amateurs talk strategy and tactics. Professionals talk operations.” Where the tactical perspective sees a thicket of sui generis problems, the operational perspective sees a pattern of organizational dysfunction to repair. Where the strategic perspective sees an opportunity, the operational perspective sees a challenge worth rising to.

  • US10452978B2 - Attention-based sequence transduction neural networks - Google Patents

    Attention-based sequence transduction neural networks. Publication number: US10452978B2. Application: US16/021,971. Authority: US.