Full-text search for "apache spark source code github"

Search results for apache spark source code github: 1 - 28 of 28

GitHub - modelcontextprotocol/servers: Model Context Protocol Servers
- 60 users
- github.com/modelcontextprotocol
- Technology
- 2024/11/28
Official integrations are maintained by companies building production ready MCP servers for their platforms. 21st.dev Magic - Create crafted UI components inspired by the best 21st.dev design engineers. 2slides - An MCP server that provides tools to convert content into slides/PPT/presentation or generate slides/PPT/presentation with user intention. ActionKit by Paragon - Connect to 130+ SaaS inte
- MCP
- AI
- LLM
- Anthropic
- server
- protocol
- github
- Programming
awesome-scalability
- 50 users
- binhnguyennus.github.io
- Technology
- 2025/10/17
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to
Things we learned about LLMs in 2024
- 28 users
- simonwillison.net
- Technology
- 2025/01/01
31st December 2024 A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past twelve months, plus my attempt at identifying key themes and pivotal moments. This is a sequel to my review of 2023. In this article: The GPT-4 barrier was comprehensively broken Some of those GPT-4 models run on my laptop LLM pri
- LLM
- Read later
Update for Apache Log4j2 Issue (CVE-2021-44228)
- 15 users
- aws.amazon.com
- Technology
- 2021/12/12
AWS is aware of the recently disclosed issues relating to the open-source Apache “Log4j2” utility (CVE-2021-44228 and CVE-2021-45046). Responding to security issues such as this one shows the value of having multiple layers of defensive technologies, which is so important to maintaining the security of our customers’ data and workloads. We've taken this issue very seriously, and our world-class te
- log4j
- aws
- security
- tech
- web
The inside story on Mountpoint for Amazon S3, a high-performance open source file client | Amazon Web Services
- 9 users
- aws.amazon.com
- Technology
- 2023/03/15
AWS Storage Blog The inside story on Mountpoint for Amazon S3, a high-performance open source file client UPDATE (8/9/2023): Mountpoint for Amazon S3 is now generally available. For details, please read the What’s New post. Amazon S3 is the best place to build data lakes because of its durability, availability, scalability, and security. Hundreds of thousands of data lakes are built on S3, storing
- S3
- performance
- aws
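Thinking About Vendor Lock-in - Qiita
- 8 users
- qiita.com/dahatake
- Technology
- 2021/06/11
Update history 2021/6/17 - Added material on "cloud-style vendor lock-in" and "Cloud Native DB" 2021/6/16 - Added material on "OSS" 2021/6/14 - Added a note on "industry standards" Introduction IT company = a mass of vendor lock-in. Platform operator = a mass of vendor lock-in. Unfortunately, many people hold that view. Back when software itself did not work as correctly as expected, or, to put it more bluntly, was not yet mature, that may have been the only option. It has long been said that the IT industry runs on dog years, and now cloud is at its peak. Where vendors make their money has changed significantly. Vendor lock-in is a "customer-capture strategy", and vendors know that its downsides outweigh its benefits. Definition Let's take the definition of vendor lock-in from Wikipedia. W
- Read later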
Introduction to Zig
- 7 users
- pedropark99.github.io
- Technology
- 2024/10/03
Welcome Welcome! This is the initial page for the “Open Access” HTML version of the book “Introduction to Zig: a project-based book”, written by Pedro Duarte Faria. This is an open book that provides an introduction to the Zig programming language, which is a new general-purpose, and low-level language for building robust and optimal software. Support the project! If you like this project, and you
- Zig
- Book
Netflix System Design- Backend Architecture
- 6 users
- dev.to/gbengelebs
- Technology
- 2021/06/24
Cover Photo by Alexander Shatov on Unsplash Netflix accounts for about 15% of the world's internet bandwidth traffic, serving over 6 billion hours of content per month to nearly every country in the world. Building a robust, highly scalable, reliable, and efficient backend system is no small engineering feat, but the ambitious team at Netflix has proven that problems exist to be solved. This artic
Tech Solvency: The Story So Far: CVE-2021-44228 (Log4Shell log4j vulnerability).
- 6 users
- www.techsolvency.com
- Technology
- 2021/12/14
Log4Shell log4j vulnerability (CVE-2021-44228 / CVE-2021-45046) - cheat-sheet reference guide Last updated: $Date: 2022/02/08 23:26:16 $ UTC - best effort, validate all for your environment/model before use, unofficial sources may be wrong by @TychoTithonus (Royce Williams), standing on the shoulders of many giants Send updates or suggestions (please include category / context / public (or support
- Security
Why We Use Julia, 10 Years Later
- 5 users
- julialang.org
- Technology
- 2022/02/15
Exactly ten years ago today, we published "Why We Created Julia", introducing the Julia project to the world. At this point, we have moved well past the ambitious goals set out in the original blog post. Julia is now used by hundreds of thousands of people. It is taught at hundreds of universities and entire companies are being formed that build their software stacks on Julia. From personalized me
- Julia
Databases in 2021: A Year in Review
- 5 users
- www.cs.cmu.edu/~pavlo
- Technology
- 2021/12/29
It was a wild year for the database industry, with newcomers overtaking the old guard, vendors fighting over benchmark numbers, and eye-popping funding rounds. We also had to say goodbye to some of our database friends through acquisitions, bankruptcies, or retractions. As the end of the year draws near, it’s worth reflecting and taking stock as we move into 2022. Here are some of the highlights a
Databricks Article Index Page (Part 1) - Qiita
- 5 users
- qiita.com/taka_yayoi
- Technology
- 2021/10/20
Databricks events: Databricks seminar and hands-on index page; Introducing the Databricks Data + AI Summit 2024 virtual sessions; Announcing the Databricks annual event "DATA + AI WORLD TOUR JAPAN 2022"; Announcing Data + AI Summit 2022; What happens at Data + AI Summit: open source, technical keynotes, and more!; New Databricks features announced at Data + AI Summit 2021; Top 10 important news items announced at Data + AI Summit; A look back at the Databricks Lakehouse Platform announcements at Data & AI Summit 2022; Databricks SQL highlights at Data & AI Summit; JEDAI study group, session 2: end-to-end reco
Applied-ML Papers
- 5 users
- applyingml.com
- Technology
- 2022/08/16
Curated papers, articles, and blogs on machine learning in production. Designing your ML system? Learn how other organizations did it. Table of Contents: Data Quality, Data Engineering, Data Discovery, Feature Stores, Classification, Regression, Forecasting, Recommendation, Search & Ranking, Embeddings, Natural Language Processing, Sequence Modelling, Computer Vision, Reinforcement Learning, Anomaly Detection, Graph, Optimiz
- Read later
Apache Airflow : 10 rules to make it work ( scale ) | Towards Data Science
- 5 users
- towardsdatascience.com
- Technology
- 2022/02/18
Airflow is by default very permissive, and without strict rules you are likely to create a chaotic code base that is impossible to scale and administer. If you are not careful, your shortcuts will cost you a lot afterwards. Airflow's permissive approach will let you schedule any custom code (jobs), but you will create a spaghetti stack if you do not follow very strict SEPARATION OF CONCERN design betw
- Airflow
Speeding up Rust semver-checking by over 2000x
- 4 users
- predr.ag
- Technology
- 2023/02/08
This post describes work in progress: how cargo-semver-checks will benefit from the upcoming query optimization API in the Trustfall query engine. Read on to learn how a modern linter works under the hood, and how ideas from the world of databases can improve its performance. Today, cargo-semver-checks is good enough to prevent real-world semver violations, and fast enough to earn a spot in the re
- Rust
Don’t call it a comeback: Why Java is still champ
- 4 users
- github.com/readme
- Technology
- 2022/08/10
No matter what ranking system you look at, whether the TIOBE Index, the Popularity of Programming Language Index, RedMonk’s bi-annual language rankings, or GitHub’s yearly State of the Octoverse, Java has been sitting among the top three languages since shortly after its launch in 1995. To listen to the general scuttlebutt of the developer crowd over time, however, you might think that Java was in
- Read later
Introduction - PyO3 user guide
- 3 users
- pyo3.rs
- Technology
- 2024/07/22
The PyO3 user guide Welcome to the PyO3 user guide! This book is a companion to PyO3's API docs. It contains examples and documentation to explain all of PyO3's use cases in detail. The rough order of material in this user guide is as follows: Getting started Wrapping
Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless | Amazon Web Services
- 3 users
- aws.amazon.com
- Technology
- 2023/03/04
AWS Big Data Blog Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. A solution to this problem is to use AWS Database Migration Service (AWS DMS) for migrating hi
- Read later
Dive deep into AWS Glue 4.0 for Apache Spark | Amazon Web Services
- 3 users
- aws.amazon.com
- Technology
- 2023/05/19
AWS Big Data Blog Dive deep into AWS Glue 4.0 for Apache Spark Jul 2023: This post was reviewed and updated with Glue 4.0 support in AWS Glue Studio notebook and interactive sessions. Deriving insight from data is hard. It’s even harder when your organization is dealing with silos that impede data access across different data stores. Seamless data integration is a key requirement in a modern data
Code versioning using AWS Glue Studio and GitHub | Amazon Web Services
- 3 users
- aws.amazon.com
- Technology
- 2022/10/14
AWS Big Data Blog Code versioning using AWS Glue Studio and GitHub AWS Glue now offers integration with Git, an open-source version control system widely used across the developer community. Thanks to this integration, you can incorporate your existing DevOps practices on AWS Glue jobs. AWS Glue is a serverless data integration service that helps you create jobs based on Apache Spark or Python to
- github
Argo Workflows - The workflow engine for Kubernetes
- 3 users
- argo-workflows.readthedocs.io
- Technology
- 2022/12/26
What is Argo Workflows? Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD (Custom Resource Definition). Define workflows where each step is a container. Model multi-ste
Track Awesome List Updates Daily
- 3 users
- www.trackawesomelist.com
- Technology
- 2022/01/09
Track Awesome List Updates Daily We track over 500 awesome list updates, and you can also subscribe to daily or weekly updates via RSS or News Letter. This repo is generated by trackawesomelist-source, visit it Online or with Github. 📅 Weekly · 🔍 Search · 🔥 Feed · 📮 Subscribe · ❤️ Sponsor · 😺 Github · 🌐 Website · 📝 07/29 · ✅ 07/29 Table of Contents Recently Updated Top 50 Awesome List All Tr
Migrating data from Google BigQuery to Amazon S3 using AWS Glue custom connectors | Amazon Web Services
- 3 users
- aws.amazon.com
- Technology
- 2021/01/21
AWS Big Data Blog Migrating data from Google BigQuery to Amazon S3 using AWS Glue custom connectors July, 2022: This post was reviewed and updated to include a new data point on the effective runtime with the latest version, explaining Glue 3.0 and autoscaling. October, 2024: In Glue 4.0 we have introduced a native and managed connector for Google BigQuery. You can follow the instruction in the bl
Rill | The Open Table Format Revolution: Why Hyperscalers Are Betting on Managed Iceberg
- 3 users
- www.rilldata.com
- Technology
- 2025/05/18
Wondering why open table formats are suddenly booming? Why is AWS investing heavily in making Iceberg tables on S3, and why did Databricks pay a reported $2B to acquire Tabular? The answers might change how we think about data architecture. Historically, object storage like Amazon S3 or R2 was used as inexpensive, scalable storage for unstructured files, while structured data typically went to dat
The Easiest Way to Find CVEs at the Moment? GitHub Dorks!
- 3 users
- medium.com/@dub-flow
- Technology
- 2024/02/18
In this article, I will demonstrate how I used GitHub dorks to find 24 vulnerabilities in popular open-source projects in just a few weeks while only spending time in the evenings and the weekends (see https://github.com/dub-flow/vulnerability-research for information on all my CVEs). Before starting this journey, I had already found one CVE: A stored XSS vulnerability in Apache Spark. Around last
Azure Updates (2022.10.13 / Microsoft Ignite 2022)
- 3 users
- blog.azure.moe
- Technology
- 2022/10/13
Ignite-related updates and various other items. Ignite-related articles: when it comes to the official document, this is the one. Microsoft Ignite 2022 Book of News How Microsoft Azure helps drive agility and optimization for your business Microsoft Ignite: A showcase of products to help customers be more efficient and productive 5 cybersecurity capabilities announced at Microsoft Ignite 2022 What’s new in Azure Network Security at Microsoft Ignite 2022 Modernize with Microsoft Clo
A non-beginner Data Engineering Roadmap — 2025 Edition
- 3 users
- blog.det.life
- Technology
- 2025/02/25
Me after years using python. Before starting this post, I want to acknowledge that soft and hard skills are equally important. Data people exist to deliver business value, or more broadly read facts from a pool of ever-growing data. But, even with a bunch of posts talking about soft skills, at the end of the day, we're being paid for the technical skills we have, and the ability we have to deliver
- Read later
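Contributions to Distributed Processing OSS in 2023 - okumin Official Blog
- 3 users
- blog.okumin.com
- Technology
- 2023/12/25
Contributions to Apache Hive This is a summary of my contributions to distributed processing OSS in 2023. The Apache Hive community became more active this year, so contributions to Hive and Tez make up a large share. I wrote this article as day 24 of "Distributed computing (Apache Spark, Hadoop, Kafka, ...) Calendar | Advent Calendar 2023 - Qiita". My apologies for the slight delay. Resolving data inconsistencies: the problem where materializing a nested CTE causes data loss; a bug fix for LIMIT OFFSET pushdown. Performance improvements: improvements to Auto Reduce Parallelism; development of Fair Routing. Adding generic AM- or Task-level hooks to Tez. On the output of UDTFs
- Read later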