[B! compression] manboubird's Bookmarks

manboubird id:manboubird

manboubird's bookmarks on compression (39)

A Comprehensive Survey of Compression Algorithms for Language Models
manboubird 2024/02/02
llm

generativeAi

compression

model

paper

survey
Link
Zstandard - Real-time data compression algorithm
Zstandard is a fast compression algorithm, providing high compression ratios. It also offers a special mode for small data, called dictionary compression. The reference library offers a very wide range of speed / compression trade-off, and is backed by an extremely fast decoder (see benchmarks below). Zstandard library is provided as open source software using a BSD license. Its format is stable a
manboubird 2023/08/28
zstd

compression

meta

algorithm
Link
Firefox's Optimized Zip Format: Reading Zip Files Really Quickly
This post is about minimizing the amount of disk IO and CPU overhead when reading Zip files. I recently saw an article about a new format that was faster than zip. This is quite surprising as, to my mind, zip is one of the most flexible and low-overhead formats I’ve encountered. Some googling showed me that over the past 11 years people have
manboubird 2021/11/25
compression

zip

spec
Link
Read the contents of a zipped file without extraction?
manboubird 2019/12/04
unzip

zip

linux

zcat

compression

cli
Link
Zstandard - Real-time data compression algorithm
Zstandard is a fast compression algorithm, providing high compression ratios. It also offers a special mode for small data, called dictionary compression. The reference library offers a very wide range of speed / compression trade-off, and is backed by an extremely fast decoder (see benchmarks below). Zstandard library is provided as open source software using a BSD license. Its format is stable a
manboubird 2018/06/27
zstd

compression

streaming

facebook
Link
Zarr-Python — zarr 2.16.1 documentation
manboubird 2018/06/25
zarr

python

serde

compression
Link
GitHub - smihica/pyminizip: To create a password encrypted zip file in python.
manboubird 2018/03/30
python

zip

compression
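Link
If You Want to Handle Big Data "Cheaply", MessagePack + LZ4 Works Nicely [Showdown with Databases] - Qiita
What format do you store big data in when you work with it? By big data I mean JSON of a few GB to a few tens of GB (lol). Use a NoSQL database like MongoDB? Wonderful. Use JSON in PostgreSQL? Very good. Here, we step outside the framework of databases and consider handling big data easily and cheaply (this is the point), centered on the "file system". So this method is not the fastest; it is a "cheap" approach that an individual can casually try when they want to play around a bit. If you are doing this at a company, you should use a proper database. Please read on with that premise (it is a bit long). The file system here means plain files such as text files and Zip archives. Because they are plain files, the indexes that databases are good at are not available, search and joins are weak, and concurrent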
manboubird 2018/02/25
messagePack

lz4

compression

comparizon

serde
Link
How Uber Engineering Evaluated JSON Encoding and Compression Algorithms to Put the Squeeze on Trip Data
For compression, we put three lossless and widely accepted libraries to the test: Snappy, zlib, and Bzip2 (BZ2). Snappy aims to provide high speeds and reasonable compression. BZ2 trades speed for better compression, and zlib falls somewhere between them. Testing: Our goal was to find the combination of encoding protocol and compression algorithm with the most compact result at the highest speed. We teste
manboubird 2018/02/25
uber

json

compression

messagePack

serde

comparizon

zlib
Link
GZinga: Seekable and Splittable Gzip
Generally, data compression techniques are used to conserve space and network bandwidth. Widely used compression techniques include Gzip, bzip2, lzop, and 7-Zip. According to performance benchmarks, lzop is one of the fastest compression algorithms, while bzip2 has a high compression ratio but is very slow. Gzip offers the lowest level of compression. Gzip is based on the DEFLATE algorithm, which
manboubird 2017/06/02
gzinga

compression

ebay

gzip
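Link
Model Compression - nico-opendata
nico-opendata: niconico publishes data from its various services for researchers, with the aim of contributing to technical progress in academia. Niconico Douga comment dataset: a dataset provided to researchers by Dwango Co., Ltd. and Mirai Kensaku Brazil, Ltd. in cooperation with the National Institute of Informatics. Data such as Niconico Douga comments is available. Application form (links to the National Institute of Informatics). Niconico Daihyakka data: a dataset provided to researchers by Dwango Co., Ltd. and Mirai Kensaku Brazil, Ltd. in cooperation with the National Institute of Informatics. Niconico Daihyakka data is available. Application form (links to the National Institute of Informatics). Nico-Illust dataset. Comicolorization: Semi-Automatic Manga Colorization. Chie Furusawa*, Kazuyu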
manboubird 2017/05/21
keras

model

compression

deeplearning
Link
Introducing Brotli: a new compression algorithm for the internet
The latest news from Google on open source releases, major projects, events, and student outreach programs. At Google, we think that internet users’ time is valuable, and that they shouldn’t have to wait long for a web page to load. Because fast is better than slow, two years ago we published the Zopfli compression algorithm. This received such positive feedback in the industry that it has been in
manboubird 2017/02/08
comparison

compression

algorithm

brotli

google

bzip
Link
lzop vs compress vs gzip vs bzip2 vs lzma vs lzma2/xz benchmark, reloaded
manboubird 2014/11/05
bzip

xz

compression

comparizon
Link
XZ Utils
XZ Utils is a complete C99 implementation of the .xz file format. XZ Utils were originally written for POSIX systems, but have been ported to a few non-POSIX systems over the years. XZ Utils consist of several components: liblzma is a compression library with an API similar to that of zlib. xz is a command line tool with syntax similar to that of gzip. xzdec is a decompression-only tool smaller tha
manboubird 2014/09/11
compression

xz
Link
Parallel xz compression
manboubird 2014/06/19
gnuParallel

bzip

pbzip2

compression
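Link
Analyzing MySQL's New InnoDB Page I/O Compression Feature
InnoDB has a data compression feature, but it is not widely used because of its poor performance. At this year's Percona Live, however, Oracle MySQL, MariaDB, and Percona Server each presented a new InnoDB Compression. It is based on a patch that Fusion-io's R&D team developed as part of its work on speeding up MySQL for flash storage. Incidentally, I am a Fusion-io employee and had been eagerly awaiting this announcement, and now that the code has been released publicly, I decided to read through the source code and examine how it works. As references I used the source code of MySQL with InnoDB PageIO Compression from MySQL Server Snapshots (labs.mysql.com), and M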
manboubird 2014/04/05
innodb

mysql

compression
Link
Hadoop splittable-lzo-compression
Data-Intensive Text Processing with MapReduce (Ch1, Ch2), Sho Shimauchi
manboubird 2013/07/03
lzo

compression

cloudera

hadoop
Link
Best splittable compression for Hadoop input = bz2?
We've realized a bit too late that archiving our files in GZip format for Hadoop processing isn't such a great idea. GZip isn't splittable, and for reference, here are the problems which I won't repeat: Very basic question about Hadoop and compressed input files; Hadoop gzip compressed files; Hadoop gzip input file using only one mapper; Why can't hadoop split up a large text file and then compress t
manboubird 2013/06/18
hadoop

compression

gzip
Link
Cloudera Blog
Riding the wave of the generative AI revolution, third party large language model (LLM) services like ChatGPT and Bard have swiftly emerged as the talk of the town, converting AI skeptics to evangelists and transforming the way we interact with technology. For proof of this megatrend look no further than the instant success of ChatGPT, […]
manboubird 2013/06/17
lzo

compression

cloudera

hadoop
Link
HDFS and Hive storage - comparing file formats and compression methods
A few days ago, we conducted a test in order to compare various Hive file formats and compression methods. Among those file formats, some are native to HDFS and apply to all Hadoop users. The test suite is composed of similar Hive queries which create a ta
manboubird 2013/04/16
hive

compression
Link