サクサク読めて、アプリ限定の機能も多数!
トップへ戻る
衆院選
innovation.ebayinc.com
The Similarity Engine's use cases include item-to-item similarity for text and image modality and user-to-item personalized recommendations based on a user’s historical behavior data. Introduction Often, ecommerce marketplaces provide buyers with listings similar to those previously visited by the buyer, as well as a personalized shopping experience based on profiles, past shopping histories and b
eBay’s open source JavaScript UI framework modernizes universal web development. eBay was founded with the core belief that we should use technology to empower and connect people globally. In the technology world, we’re a core contributor to and believer in open source technology. Not only does a company culture of open source help us empower our developers, but it also enables our technologists t
In 2019, eBay prioritized a company-wide initiative, aptly called “Speed,” focused on improving the performance of critical eBay flows across all platforms — iOS, Android, and Web. This article explains the journey and outcomes. Death by a thousand cuts is a popular figure of speech that refers to a failure that occurs as a result of many small problems. It has a negative connotation to it and is
From the time it was announced, WebAssembly caused a huge buzz in the front-end world. The web community readily embraced the idea of taking code written in programming languages other than JavaScript and running that code in the browser. Above all WebAssembly consistently guarantees native speeds much faster than JavaScript. At eBay, we were no different. Our engineers were very excited about thi
Big Data Governance: Hive Metastore Listener for Apache Atlas Use Cases At eBay, we are obsessed with data quality and governance. Because eBay's Hadoop platform hosts 500 PB of data running over 15,000 nodes, the focus on governance is of utmost importance. This article discusses our experiences handling data governance at scale. Data governance helps ensure that high data quality exists througho
Elasticsearch is an open source search and analytic engine based on Apache Lucene that allows users to store, search, analyze data in near real time. While Elasticsearch is designed for fast queries, the performance depends largely on the scenarios that apply to your application, the volume of data you are indexing, and the rate at which applications and users query your data. This document summar
One of the top initiatives for eBay this year is to provide a compelling browse experience to our users. In a recent interview, Devin Wenig has given a good overview of why this matters to eBay. The idea is to leverage structured data and machine learning to allow users to shop across a whole spectrum of value, where some users might desire great savings, while others may want to focus on, say, be
Spot the difference….You can’t! To a sighted user it appears we have two identical button elements. A user interface control not only needs to look like a certain control, it must be described as that control too. Take for example a button, one of the simplest of controls. There are many ways you can create something that looks like a button, but unless you use the actual button tag (or button rol
Generally, data compression techniques are used to conserve space and network bandwidth. Widely used compression techniques include Gzip, bzip2, lzop, and 7-Zip. According to performance benchmarks, lzop is one of the fastest compression algorithms, while bzip2 has a high compression ratio but is very slow. Gzip offers the lowest level of compression. Gzip is based on the DEFLATE algorithm, which
Complex systems can fail in many ways and I find it useful to divide failures into two classes. The first consists of failures that can be anticipated (e.g. disk failures), will happen again, and can be directly checked for. The second class is failures that are unexpected. This post is about that second class. The tool for unexpected failures is statistics, hence the name for this post. A statis
Performance profiling of some of the eBay code base showed the logarithm (log) function to be consuming more CPU than expected. The implementation of log in modern math libraries is an ingenious wonder, efficiently computing a value that is correct down to the last bit. But for many applications, you’d be perfectly happy with an approximate logarithm, accurate to (say) 10 bits, especially if it we
We are happy to announce Pulsar – an open-source, real-time analytics platform and stream processing framework. Pulsar can be used to collect and process user and business events in real time, providing key insights and enabling systems to react to user activities within seconds. In addition to real-time sessionization and multi-dimensional metrics aggregation over time windows, Pulsar uses a SQL-
Empirical Bayes is a statistical technique that is both powerful and easy to use. My goal is to illustrate that statement via a case study using eBay data. Quoting the famous statistician Brad Efron, Empirical Bayes seems like the wave of the future to me, but it seemed that way 25 years ago and the wave still hasn’t washed in, despite the fact that it is an area of enormous potential importance
Async Fragments: Rediscovering Progressive HTML Rendering with Marko At eBay, we take site speed very seriously and are always looking for ways to allow developers to create faster-loading web apps. This involves fully understanding and controlling how web pages are delivered to web browsers. Progressive HTML rendering is a relatively old technique that can be used to improve the performance of we
This is a post about how to rank search results, specifically the list of items you get after doing a search on eBay. One simple approach to ranking is to give a score to each item. For concreteness, imagine the score to be an estimate of the probability that a user will buy the item. Each item is scored individually, and then they are presented in score order, with the item most likely to be purc
We are very excited to announce that eBay has released to the open-source community our distributed analytics engine: Kylin (http://kylin.io). Designed to accelerate analytics on Hadoop and allow the use of SQL-compatible tools, Kylin provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop to support extremely large datasets. Kylin is currently used in production by various busine
At eBay we run Hadoop clusters comprising thousands of nodes that are shared by thousands of users. We analyze data on these clusters to gain insights for improved customer experience. In this post, we look at distributing RPC resources fairly between heavy and light users, as well as mitigating denial of service attacks within Hadoop. By providing appropriate response times and increasing system
At eBay we want our customers to have the best experience possible. We use data analytics to improve user experiences, provide relevant offers, optimize performance, and create many, many other kinds of value. One way eBay supports this value creation is by utilizing data processing frameworks that enable, accelerate, or simplify data analytics. One such framework is Apache Spark. This post descri
In part I of this post we laid out in detail how to run a large Jenkins CI farm in Mesos. In this post we explore running the builds inside Docker containers and more: Overview Jenkins follows the master-slave model and is capable of launching tasks as remote Java processes on Mesos slave machines. Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed
Problem statement In eBay’s existing CI model, each developer gets a personal CI/Jenkins Master instance. This Jenkins instance runs within a dedicated VM, and over time the result has been VM sprawl and poor resource utilization. We started looking at solutions to maximize our resource utilization and reduce the VM footprint while still preserving the individual CI instance model. After much deli
REST Commander: Scalable Web Server Management and Monitoring In the era of cloud and XaaS (everything as a service), REST/SOAP-based web services have become ubiquitous within eBay’s platform. We dynamically monitor and manage a large and rapidly growing number of web servers deployed on our infrastructure and systems. However, existing tools present major challenges when making REST/SOAP calls w
The basic purpose of HTTP caching is to provide a mechanism for applications to scale better and perform faster. But HTTP caching is applicable only to idempotent requests, which makes a lot of sense; only idempotent and nullipotent requests yield the same result when run multiple times. In the HTTP world, this fact means that GET requests can be cached but POST requests cannot. However, there can
In the first part, we covered a few fundamental practices and walked through a detailed example to help you get started with Cassandra data model design. You can follow Part 2 without reading Part 1, but I recommend glancing over the terms and conventions I’m using. If you’re new to Cassandra, I urge you to read Part 1. September 2014 Update: Readers should note that this article describes data mo
This is the first in a series of posts on Cassandra data modeling, implementation, operations, and related practices that guide our Cassandra utilization at eBay. Some of these best practices we’ve learned from public forums, many are new to us, and a few still are arguable and could benefit from further experience. September 2014 Update: Readers should note that this article describes data modeli
We are happy to announce ql.io – a declarative, evented, data-retrieval and aggregation gateway for HTTP APIs. Through ql.io, we want to help application developers increase engineering clock speed and improve end user experience. ql.io can reduce the number of lines of code required to call multiple HTTP APIs while simultaneously bringing down network latency and bandwidth usage in certain use ca
Historical user click patterns on search result pages are considered a great resource for algorithms that attempt to learn to rank search results. This ranking method is a well-studied problem in the field of Information Retrieval (IR). See Mike Mathiesonʼs excellent blog post, Using Behavioral Data to Improve Search, for more background on this subject. If learn-to-rank algorithms could explicitl
In the inaugural topic of this blog on site speed, Hugh Williams spoke briefly about image spriting as a technique to speed up the rendering of a page. CSS Image Sprites are now commonly used to remove the performance problems created by making separate HTTP requests for every image on a web page. Unfortunately the technique cannot be used on a web page with a dynamic image set such as the eBay se
Groot: eBay’s Event-graph-based Approach for Root Cause Analysis The framework achieves great coverage and performance across different incident triaging scenarios, and also outperforms other state-of-the-art root cause analysis methodologies.
このページを最初にブックマークしてみませんか?
『innovation.ebayinc.com』の新着エントリーを見る
j次のブックマーク
k前のブックマーク
lあとで読む
eコメント一覧を開く
oページを開く