Designing a Production-Ready Kappa Architecture for Timely Data Stream Processing. At Uber, we use robust data processing systems such as Apache Flink and Apache Spark to power the streaming applications that help us calculate up-to-date pricing, enhance driver dispatching, and fight fraud on our platform. Such solutions can process data at a massive scale in real time with exactly-once semantics.
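To make the shape of such a job concrete, here is a minimal sketch of a Flink DataStream program with exactly-once checkpointing enabled; the fare values, the surge multiplier, and the job name are invented for illustration and are not Uber's actual code.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Minimal sketch of a Flink streaming job with exactly-once checkpointing.
public class PricingJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpoint every 10 seconds so state can be restored with exactly-once semantics.
        env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);

        env.fromElements(12.50, 8.75, 23.10)   // stand-in for a Kafka source of trip fares
           .map(fare -> fare * 1.2)            // stand-in for a surge-pricing adjustment
           .returns(Types.DOUBLE)              // give Flink the lambda's output type
           .print();                           // stand-in for a downstream sink

        env.execute("pricing-sketch");
    }
}
```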
Kafka serialisation schemes — playing with AVRO, Protobuf, JSON Schema in Confluent Streaming Platform. The code for these examples is available at https://github.com/saubury/kafka-serialization. Apache Avro has been the default Kafka serialisation mechanism for a long time. Confluent recently updated their Kafka streaming platform with additional support for serialising data with Protocol Buffers (or Protobuf) and JSON Schema.
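For a quick taste before the repo, a minimal Java producer using Confluent's KafkaAvroSerializer might look like the following; the topic name, toy schema, and localhost addresses are assumptions for the sketch.

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's Avro serialiser registers the schema and prefixes each record with its schema ID.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        // A toy record schema; the linked repo uses richer examples.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}");
        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "alice");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("users-avro", "alice", user));
        }
    }
}
```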
ETL (Extract Transform Load). What not to expect from this blog: managed ETL solutions like AWS Glue, AWS Data Migration Service, or Apache Airflow; such cloud-based techniques are managed but not free, and they are not covered in this article. Table of contents: What is an ETL pipeline? What are the various use cases of an ETL pipeline? ETL prerequisites — Docker + Debezium + Kafka + Kafka Connect — a bird's-eye view.
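As a preview of the Kafka Connect step, registering a Debezium source connector amounts to POSTing a JSON config to Connect's REST API (port 8083 by default). In this sketch the database coordinates, connector name, and topic prefix are placeholders, and the config keys follow Debezium 2.x naming.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterDebeziumConnector {
    public static void main(String[] args) throws Exception {
        // Placeholder MySQL connection details; adjust for your environment.
        String config = """
            {
              "name": "inventory-connector",
              "config": {
                "connector.class": "io.debezium.connector.mysql.MySqlConnector",
                "database.hostname": "mysql",
                "database.port": "3306",
                "database.user": "debezium",
                "database.password": "dbz",
                "database.server.id": "184054",
                "topic.prefix": "inventory",
                "database.include.list": "inventory",
                "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
                "schema.history.internal.kafka.topic": "schema-changes.inventory"
              }
            }""";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8083/connectors"))  // Kafka Connect REST API
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(config))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```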
SE Radio 393: Jay Kreps on Enterprise Integration Architecture with a Kafka Event Log. Jay Kreps, CEO of Confluent, discusses an enterprise integration architecture organized around an event log. Robert Blumen spoke with Jay about the N-squared problem of data integration; how LinkedIn tried and failed to solve the integration problem; the nature of events; the enterprise event schema; and schema definition.
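A back-of-the-envelope rendering of the N-squared point (not taken from the episode itself): with point-to-point integration every pair of systems needs its own bespoke link, while a central event log needs only one producer or consumer connection per system.

```latex
% Point-to-point integration: one bespoke link per pair of N systems.
\text{links} = \binom{N}{2} = \frac{N(N-1)}{2}, \qquad N = 20 \;\Rightarrow\; 190 \text{ links}
% Central event log: each system connects only to the log.
\text{connections} = 2N, \qquad N = 20 \;\Rightarrow\; 40 \text{ connections}
```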
Yahoo Japan Corporation became LINEヤフー株式会社 (LY Corporation) on October 1, 2023; the new company blog is the LINEヤフー Tech Blog. Hello, I'm Tachibana (@moja_0316) from Yahoo. I joined the Data Management Division as a new graduate in 2018 and have been working as an engineer in the data pipeline domain. Today I would like to introduce the role of Yahoo's data pipelines and the work we did to improve their reliability. Yahoo's data pipelines: Yahoo operates many services, including search, e-commerce, and news. The data those services hold is both very large in volume and highly valuable. In recent years in particular, starting with our data solution services, we have launched many initiatives that make everyday life more convenient by appropriately using data across our various services.
Real-time data analysis and management are increasingly critical for today's businesses. SQL is the de facto lingua franca for these endeavors, yet support for robust streaming analysis and management with SQL remains limited. Many approaches restrict semantics to a reduced subset of features and/or require a suite of non-standard constructs. Additionally, use of event timestamps to provide native support for analyzing events according to when they actually occurred is not pervasive.
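As one concrete rendering of that argument, the sketch below runs a standard-looking, event-time windowed aggregation as SQL on Flink's TableEnvironment; the bids table, its columns, and the datagen source are invented for the example and are not from the paper.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class StreamingSqlSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

        // An unbounded table with an event-time column and a watermark,
        // backed by Flink's built-in datagen connector for demo purposes.
        tEnv.executeSql(
            "CREATE TABLE bids (" +
            "  price DOUBLE," +
            "  bid_time TIMESTAMP(3)," +
            "  WATERMARK FOR bid_time AS bid_time - INTERVAL '5' SECOND" +
            ") WITH ('connector' = 'datagen')");

        // A plain-looking aggregation over 10-minute event-time windows;
        // as a streaming query this prints results until it is cancelled.
        tEnv.executeSql(
            "SELECT TUMBLE_START(bid_time, INTERVAL '10' MINUTE) AS window_start," +
            "       SUM(price) AS total" +
            " FROM bids" +
            " GROUP BY TUMBLE(bid_time, INTERVAL '10' MINUTE)").print();
    }
}
```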
In previous posts, I discussed how Kafka can be used for both stream-relational processing and graph processing. I will now show how Kafka can also be used to process Datalog programs. The Datalog Language: Datalog [1] is a declarative logic programming language that is used as the query language for databases such as Datomic and LogicBlox [2]. Both relational queries and graph-based queries can be expressed in Datalog.
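As a warm-up, the classic ancestor program shows what Datalog looks like; the plain-Java sketch below naively evaluates those two rules to a fixpoint and is only an illustration, not the Kafka-based evaluation the post develops.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashSet;
import java.util.Map.Entry;
import java.util.Set;

public class DatalogSketch {
    public static void main(String[] args) {
        // Facts: parent(alice, bob), parent(bob, carol).
        Set<Entry<String, String>> parent = new HashSet<>();
        parent.add(new SimpleEntry<>("alice", "bob"));
        parent.add(new SimpleEntry<>("bob", "carol"));

        // Rules:
        //   ancestor(X, Y) :- parent(X, Y).
        //   ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
        // Naive evaluation: apply the rules until no new facts appear.
        Set<Entry<String, String>> ancestor = new HashSet<>(parent);
        boolean changed = true;
        while (changed) {
            Set<Entry<String, String>> derived = new HashSet<>();
            for (Entry<String, String> p : parent) {
                for (Entry<String, String> a : ancestor) {
                    if (p.getValue().equals(a.getKey())) {
                        derived.add(new SimpleEntry<>(p.getKey(), a.getValue()));
                    }
                }
            }
            changed = ancestor.addAll(derived);  // true only if a new fact was added
        }
        // Prints all three pairs, including the derived ancestor(alice, carol).
        System.out.println(ancestor);
    }
}
```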