multithreaded.stitchfix.com
Notice the abundance of missing values in the potential outcomes columns. Sadly, you can only observe one potential outcome at a time. For example, if you wanted to fill in the missing value for \(Y_{d=\neg i}\) on day 1, you’d literally need to travel back in time and not take ibuprofen to observe it. Absent a time machine, how might you use this null-value ridden data to estimate the future valu
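The missing-value pattern described in this excerpt can be sketched in a few lines. This is only an illustration: the outcome values, the `days` list, and the `potential_outcomes` helper are hypothetical and do not come from the original post. Each day yields exactly one observed potential outcome, and the counterfactual stays empty.

```python
from typing import List, Optional, Tuple

# (took_ibuprofen, observed_outcome) per day.
# The numbers are made up purely for illustration.
days: List[Tuple[bool, float]] = [(True, 2.0), (False, 5.0), (True, 3.0)]


def potential_outcomes(took: bool, observed: float) -> Tuple[Optional[float], Optional[float]]:
    """Return (outcome with ibuprofen, outcome without ibuprofen).

    Only the outcome for the treatment actually taken is observed;
    the other potential outcome is unknown and represented as None.
    """
    return (observed, None) if took else (None, observed)


# One row per day; every row has exactly one missing entry.
table = [potential_outcomes(took, y) for took, y in days]

for day, (y_ibuprofen, y_no_ibuprofen) in enumerate(table, start=1):
    print(day, y_ibuprofen, y_no_ibuprofen)
```

On day 1 in this sketch ibuprofen was taken, so the "without ibuprofen" column is empty, which matches the situation the excerpt describes.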
In this installment of our “Patterns of Service-oriented Architecture” series, we’re going to talk about a complex concept called idempotency, and a technique you can apply to your service design to ensure that requested work is only performed once. Intent: Prevent duplicate requests by allowing the Consumer of a Service to send a value that represents the uniqueness of a request, so that no reques
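A minimal sketch of the idea in this excerpt, assuming the Consumer sends a unique key with each request and the Service remembers the result per key. The class name `IdempotentService`, the in-memory dictionary, and the `charge_customer` function are hypothetical illustrations, not the design from the original post; a real service would likely need durable, shared storage and handling for concurrent requests.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class IdempotentService:
    """Performs each unit of work at most once per consumer-supplied key."""

    # Hypothetical in-memory store mapping idempotency key -> saved result.
    _results: Dict[str, str] = field(default_factory=dict)

    def handle(self, idempotency_key: str, work: Callable[[], str]) -> str:
        # A repeated key means a duplicate request: return the saved result
        # instead of performing the work again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = work()
        self._results[idempotency_key] = result
        return result


calls = []


def charge_customer() -> str:
    """Stand-in for work with a side effect that must not be repeated."""
    calls.append("charged")
    return f"charge-{len(calls)}"


service = IdempotentService()
first = service.handle("order-42", charge_customer)
retry = service.handle("order-42", charge_customer)  # duplicate request, same key
other = service.handle("order-43", charge_customer)  # new key, new work
```

Here the retry with the same key returns the stored result, so the side effect runs once for `order-42` and once for `order-43`.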
Stop Using word2vec: When I started playing with word2vec four years ago, I needed (and luckily had) tons of supercomputer time. But because of advances in our understanding of word2vec, computing word vectors now takes fifteen minutes on a single run-of-the-mill computer with standard numerical libraries. Word vectors are awesome but you don’t need a neural network – and definitely don’t need deep
For those who attended my talk at Data Day Texas in Austin last weekend, you heard me talk about how Stitch Fix has reduced contention on access to data and access to ad-hoc compute resources, to help scale Data Science. As attendees requested, I have posted my slides here, which you can find a link to at the bottom. For those who weren’t at my talk, here’s a brief background to the slides; they shou
One of the greatest benefits of working among a diverse group of data scientists and data engineers at Stitch Fix is how much we can learn from our peers. Usually that means getting ad hoc help with specific questions from the resident expert(s). But it also means getting advice on how best to fill any gaps in our own skill sets or knowledge bases, or just what interesting data science materials t
The goal of lda2vec is to make volumes of text useful to humans (not machines!) while still keeping the model simple to modify. It learns the powerful word representations in word2vec while jointly constructing human-interpretable LDA document representations. We fed our hybrid lda2vec algorithm (docs, code and paper) every Hacker News comment through 2015. The results reveal what topics and tren
When people think of “data science” they probably think of algorithms that scan large datasets to predict a customer’s next move or interpret unstructured text. But what about models that utilize small, time-stamped datasets to forecast dry metrics such as demand and sales? Yes, I’m talking about good old time series analysis, an ancient discipline that hasn’t received the cool “data science” rebr
We’ve given up on “fat models, skinny controllers” as a design style for our Rails apps—in fact we abandoned it before we started. Instead, we factor our code into special-purpose classes, commonly called service objects. We’ve thrashed on exactly how these classes should be written, so this post is going to outline what I think is the most successful way to create a service object. Purpose of a S
Neural networks provide a vast array of functionality in the realm of statistical modeling, from data transformation to classification and regression. Unfortunately, due to the computational complexity and generally large magnitude of data involved, the training of so-called deep learning models has been historically relegated to only those with considerable computing resources. However, with the a
Justin Lee, Kurt Bollacker, Oz Raza, Ujjwal Sarin, and Alex Milowski on September 19, 2023: Our journey building our service deployment system and tools by leveraging Kubernetes and Knative.
Ariadne: building a custom observability UI for personalized search. Navigating and troubleshooting complex ML and engineering systems, especially those with numerous components, can often be an intricate and cha
Introduction: Imagine that you step into a room of data scientists; the dress code is casual and the scent of strong coffee is hanging in the air. You ask the data scientists if they regularly use generalized additive models (GAM) to do their work. Very few will say yes, if any at all. Now let’s replay the scenario, only this time we replace GAM with, say, random forest or support vector machines (
Pyxley: Web-based dashboards are the most straightforward way to share insights with clients and business partners. For R users, Shiny provides a framework that allows data scientists to create interactive web applications without having to write Javascript, HTML, or CSS. Despite Shiny’s utility and success as a dashboard framework, there is no equivalent in Python. There are packages in developmen
Standard natural language processing (NLP) is a messy and difficult affair. It requires teaching a computer about English-specific word ambiguities as well as the hierarchical, sparse nature of words in sentences. At Stitch Fix, word vectors help computers learn from the raw text in customer notes. Our systems, composed of machines and human experts, need to recommend the maternity line when she s