[B! pdf] xef's bookmarks

You can deeplink to a specific PDF page

xef 2024/07/12

You can link to a specific page by appending #page=X to the URL

PDF

Link
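
The #page=X trick noted in the comment above can be sketched in a few lines of Python. This is only an illustration: `pdf_page_link` is a hypothetical helper name, the example URL is made up, and whether the fragment is honored depends on the PDF viewer that opens the link.

```python
from urllib.parse import urlsplit, urlunsplit


def pdf_page_link(url: str, page: int) -> str:
    """Return `url` with its fragment replaced by `page=<page>` (1-based)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    parts = urlsplit(url)
    # Any existing fragment is discarded; the query string is kept as-is.
    return urlunsplit(parts._replace(fragment=f"page={page}"))


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(pdf_page_link("https://example.com/docs/manual.pdf", 12))
```

Replacing the fragment rather than appending to it avoids producing a URL with two `#` parts when the original link already pointed at an anchor.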

GitHub - vslavik/diff-pdf: A simple tool for visually comparing two PDF files

xef 2024/07/07

pdf
diff

Link

CVE-2024-4367 - Arbitrary JavaScript execution in PDF.js — Codean Labs

This post details CVE-2024-4367, a vulnerability in PDF.js found by Codean Labs. PDF.js is a JavaScript-based PDF viewer maintained by Mozilla. This bug allows an attacker to execute arbitrary JavaScript code as soon as a malicious PDF file is opened. This affects all Firefox users (<126) because PDF.js is used by Firefox to show PDF files, but also seriously impacts many web- and Electron-based a

xef 2024/05/21

Security
PDF

Link

Insecure Features in PDFs

In 2019, we published attacks on PDF Signatures and PDF Encryption. During our research and studying the related work, we discovered a lot of blog posts, talks, and papers focusing on malicious PDFs causing some damage. However, there was no systematic analysis of all possible dangerous features supported by PDFs, but only isolated exploits and attack concepts. We decided to fill this gap and syst

xef 2024/02/26

PDF
Security

Link

1.5+ million PDFs in 25 minutes - Zerodha Tech Blog

xef 2024/02/16

Link

Making a PDF that’s larger than Germany

Posted 31 January 2024. Tagged with code-crimes, drawing-things. I was browsing social media this morning, and I saw a claim I’ve seen go past a few times now – that there’s a maximum size for a PDF document: Some version of this has been floating around the Internet since 2007, probably earlier. This tweet is pretty em bl

xef 2024/02/01

PDF

Link

How we replaced the PDF rendering library in the document distribution feature - SmartHR Tech Blog

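Hello! I'm aanzai, and I work on the document distribution feature at SmartHR. From the end of 2022 through February 2023 we replaced the PDF rendering library used by the document distribution feature, so this post describes how we carried out the migration. About the document distribution feature: the document distribution feature (formerly: Employment Contracts) was developed as SmartHR's first optional feature. Starting from document templates created in advance, it inserts employee information stored in SmartHR to generate document PDFs, which can then be distributed to employees or used to obtain agreement as contracts. The template layout is created by the user in a WYSIWYG editor and saved as HTML. When a document is distributed, employee information is inserted into this layout HTML, which is then converted to PDF. Why we migrated the PDF rendering library: in the document distribution feature,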

xef 2023/07/04

PDF

Link

GitHub - ahrm/sioyek: Sioyek is a PDF viewer with a focus on textbooks and research papers

xef 2022/09/04

PDF

Link

Let's write a PDF file

A simple walk-through to learn the basics of the PDF format (at your rhythm) 2017/07/23 : first release 2015/07/28 : r2 - typos, improvements, stream filters

xef 2022/05/21

PDF

Link
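
To give a feel for the basics of the PDF format mentioned in the entry above, here is a minimal sketch that assembles a one-page, blank PDF by hand: a header, three indirect objects, a cross-reference table with byte offsets, and a trailer. The object layout, the US Letter MediaBox, and the function name `build_minimal_pdf` are choices made for this illustration; they are not taken from the linked walk-through.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page, blank PDF from raw bytes."""
    # Object bodies, numbered 1..3 in order (illustrative layout).
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one 20-byte entry per object, plus entry 0.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # Trailer points at the catalog and at the xref table's offset.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


if __name__ == "__main__":
    with open("minimal.pdf", "wb") as f:
        f.write(build_minimal_pdf())
```

Computing the offsets while writing, instead of hard-coding them, keeps the cross-reference table correct if an object body is edited.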

Extracting information from PDF data using Python (original English title: Extraction data from PDF using Python)

■ Event: 54th 情報科学若手の会 https://wakate.connpass.com/event/222829/ ■ Talk overview. Title: Extracting information from PDF data using Python / Extraction data from PDF using Python. Presenter: 技術…

xef 2021/09/30

PDF
Python

Link

Why does copy-pasting from a PDF produce garbled text (mojibake)? The cause was the "ToUnicode CMap" mapping table

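NTT Tech Conference is a conference where engineers from across the NTT Group gather and exchange technical knowledge with engineers inside and outside the group. Here, Mr. Hosoda speaks on the theme "Why does copy-pasting from a PDF produce garbled text? CID/GID and the Harano Aji fonts". First, the causes of garbled text. The creator of the Harano Aji fonts. Masamichi Hosoda (hereafter, Hosoda): I'm Hosoda. I usually do DX-related work somewhere in the NTT Group, but today I'd like to talk about open source and other things I do personally, entirely unrelated to my job. Thank you for having me. A brief self-introduction: I'm a committer on LilyPond, a program for creating sheet music, and a committer on Texinfo, GNU's official documentation format. I'll come back to this later, but I make the Harano Aji fonts (原ノ味フォント), and there is something with a very similar name, "Haranomachi" (原ノ町), which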

xef 2021/05/27

PDF

Link
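
The role the entry above assigns to the ToUnicode CMap can be illustrated with a toy model. A PDF page stores character codes rather than Unicode text, and copy-paste relies on a code-to-Unicode table. The codes, tables, and the `extract_text` helper below are all hypothetical, and the fallback to U+FFFD is just one way to model what a viewer might do when an entry is missing.

```python
# Hypothetical character codes as they might appear in a page's text stream.
page_codes = [1, 2, 3]

# Hypothetical ToUnicode-style tables: character code -> Unicode text.
good_to_unicode = {1: "P", 2: "D", 3: "F"}
incomplete_to_unicode = {1: "P", 2: "D"}         # entry for code 3 is missing
wrong_to_unicode = {1: "P", 2: "D", 3: "\u2f92"}  # code 3 maps to an unintended character


def extract_text(codes, to_unicode):
    """Map each code through the table; unmapped codes become U+FFFD."""
    return "".join(to_unicode.get(code, "\ufffd") for code in codes)


if __name__ == "__main__":
    for table in (good_to_unicode, incomplete_to_unicode, wrong_to_unicode):
        print(repr(extract_text(page_codes, table)))
```

In all three cases the page would render identically, since rendering uses the glyphs the codes select; only the copied text differs.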

GitHub - trueroad/tr-NTTtech05: NTT Tech Conference #5 Presentation "Why does copy-pasting from a PDF produce garbled text? CID/GID and the Harano Aji fonts": related materials

xef 2021/02/28

PDF

Link

PDF files from a programmer's perspective - Antenna House PDF Resource Room

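Last updated: August 14, 2020. Purpose of this page: when programmers try to fulfill a client's request using a PDF file the client has provided, which parts of the PDF do they look at? This page examines PDF files from a somewhat unusual perspective. If you want to write your own program to extract text data from PDF files, please read on. Introduction: when you click a PDF file, text and images appear on the display with much the same look on any machine, as if printed on paper. This simple fact may seem ordinary if you use PDF files every day. On reflection, though, it is remarkable. What kind of mechanism makes it possible to achieve the goal of "reproducing the same page, with its appearance unchanged, on the many kinds of computers from the past to the present"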

xef 2020/08/15

PDF

Link

GitHub - dsanson/termpdf.py: A graphical pdf and epub reader that works inside the kitty terminal

xef 2020/08/07

PDF
Python

Link

Reading table data from PDF files with Python - Qiita

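PDF data: people seem to love PDF, and however much we say we dislike it, we have no choice but to deal with it. Still, nobody wants to spend hours on it. Sometimes the only data available is a table in a PDF, and for such cases I found a very convenient library called tabula-py, so I'm noting it here. https://github.com/chezou/tabula-py About tabula: tabula is a Java library for extracting tables from PDFs, and tabula-py is a wrapper around it. Java must therefore be installed to use it. After installing Java, the Python library can be used as follows.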

xef 2020/04/06

Python
PDF

Link

GitHub - J-F-Liu/lopdf: A Rust library for PDF document manipulation.

use lopdf::dictionary; use lopdf::{Document, Object, Stream}; use lopdf::content::{Content, Operation}; // with_version specifies the PDF version this document complies with. let mut doc = Document::with_version("1.5"); // Object IDs are used for cross referencing in PDF documents. `lopdf` helps keep track of them // for us. They are simple integers. // Calls to `doc.new_object_id` and `doc.add_obj

xef 2019/11/18

Rust
PDF

Link

GitHub - caradoc-org/caradoc: A PDF parser and validator

xef 2019/11/16

Link

GitHub - oxplot/pdftilecut: pdftilecut lets you sub-divide a PDF page(s) into smaller pages so you can print them on small form printers.

xef 2018/12/01

PDF
golang

Link

Announcing Camelot, a Python Library to Extract Tabular Data from PDFs - Atlan | Humans of Data

By Vinayak Mehta, October 3, 2018. We use GitHub issues to keep track of all issues. Please do not report bugs or issues in this blog’s comments. Instead, post them on GitHub as an issue. Before submitting a comment with an issue, please use GitHub search to look for existing issues (both open and closed) that may be similar. The