サクサク読めて、アプリ限定の機能も多数!
トップへ戻る
パリ五輪
www.unofficialgoogledatascience.com
by STEVEN L. SCOTT Time series data are everywhere, but time series modeling is a fairly specialized area within statistics and data science. This post describes the bsts software package, which makes it easy to fit some fairly sophisticated time series models with just a few lines of R code. Introduction Time series data appear in a surprising number of applications, ranging from business, to the
by ERIC TASSONE, FARZAN ROHANI We were part of a team of data scientists in Search Infrastructure at Google that took on the task of developing robust and automatic large-scale time series forecasting for our organization. In this post, we recount how we approached the task, describing initial stakeholder needs, the business and engineering contexts in which the challenge arose, and theoretical an
By OMKAR MURALIDHARAN, NIALL CARDIN, TODD PHILLIPS, AMIR NAJMI Given recent advances and interest in machine learning, those of us with traditional statistical training have had occasion to ponder the similarities and differences between the fields. Many of the distinctions are due to culture and tooling, but there are also differences in thinking which run deeper. Take, for instance, how each fie
By PATRICK RILEY For a number of years, I led the data science team for Google Search logs. We were often asked to make sense of confusing results, measure new phenomena from logged behavior, validate analyses done by others, and interpret metrics of user behavior. Some people seemed to be naturally good at doing this kind of high quality data analysis. These engineers and analysts were often desc
By DAVID ADAMS Since inception, this blog has defined “data science” as inference derived from data too big to fit on a single computer. Thus the ability to manipulate big data is essential to our notion of data science. While MapReduce remains a fundamental tool, many interesting analyses require more than it can offer. For instance, the well-known Mantel-Haenszel estimator cannot be implemented
By IVAN DIAZ & JOSEPH KELLY Determining the causal effects of an action—which we call treatment—on an outcome of interest is at the heart of many data analysis efforts. In an ideal world, experimentation through randomization of the treatment assignment allows the identification and consistent estimation of causal effects. In observational studies treatment is assigned by nature, therefore its mec
by CHRIS HAULK It is sometimes useful to think of a large-scale online system ( LSOS ) as an abstract system with parameters $X$ affecting responses $Y$. Here, $X$ is a vector of tuning parameters that control the system's operating characteristics (e.g. the weight given to Likes in our video recommendation algorithm) while $Y$ is a vector of outcome measures such as different metrics of user expe
Replacing Sawzall — a case study in domain-specific language migration In a previous post, we described how data scientists at Google used Sawzall to perform powerful, scalable analysis. However, over the last three years we’ve eliminated almost all our Sawzall code, and now the niche that Sawzall occupied in our software ecosystem is mostly filled by Go. In this post, we’ll describe Sawzall’s rol
Despite Google’s technical achievements with big data, it may come as a surprise that there is no official Google blog for data science. True, Google Research puts out many academic papers and has a blog describing matters of interest to researchers. But what has been missing to date is a conversation about the nuts-and-bolts, the day-to-day of large scale analytical systems Google builds to serve
このページを最初にブックマークしてみませんか?
『The Unofficial Google Data Science Blog』の新着エントリーを見る
j次のブックマーク
k前のブックマーク
lあとで読む
eコメント一覧を開く
oページを開く