use.perl.org/~miyagawa
Is anyone using DBD::SQLite with the $dbh->{unicode} attribute set? It has a long-standing bug: it assumes that the internal encoding of passed strings is UTF-8 when inserting values into the database. http://svn.ali.as/cpan/trunk/DBD-SQLite/t/rt_25371_asymmetric_unicode.t is a failing test by Juerd, and http://gist.github.com/90590 is my patch to fix that. The patch still passes all tests, including 12_unicode.t.
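To make the failure mode above concrete, here is a minimal sketch using only Perl's built-in utf8:: functions. It is not taken from the patch or from DBD::SQLite; it only illustrates, under that assumption, why code that reads a string's internal buffer directly can go wrong: two strings with identical characters may be stored differently inside perl.

```perl
use strict;
use warnings;

# "\xE9" is e-acute (U+00E9). Both copies hold the same four characters.
my $native   = "caf\xE9";   # stored internally as Latin-1 bytes, UTF-8 flag off
my $upgraded = "caf\xE9";
utf8::upgrade($upgraded);   # same characters, now stored internally as UTF-8

# As far as Perl code is concerned, the two strings are equal...
print "equal: ", ($native eq $upgraded ? "yes" : "no"), "\n";

# ...but only one of them carries the UTF-8 flag.
print "native flag:   ", (utf8::is_utf8($native)   ? 1 : 0), "\n";
print "upgraded flag: ", (utf8::is_utf8($upgraded) ? 1 : 0), "\n";

# Encoding explicitly yields the same UTF-8 bytes for both copies,
# regardless of how each string happened to be stored internally.
my $bytes_native = $native;
utf8::encode($bytes_native);
my $bytes_upgraded = $upgraded;
utf8::encode($bytes_upgraded);
print "encoded equal: ", ($bytes_native eq $bytes_upgraded ? "yes" : "no"), "\n";
```

A layer that treats the raw buffer of $native as UTF-8 would see the single byte 0xE9, which is not valid UTF-8, while an explicit encode produces the same bytes for both strings.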
Shibuya.pm will hold its 9th technical meeting, and the topic of the meeting is XS. No, I'm not joking: all the talks are in some way about XSUB stuff. Let me quote some talks: 1. My First XS (hirose31) 2. Welcome to Perl5 Internals (Daisuke Maki) 3. Inside Ruby.pm (Goro Fuji) 4. PerlMachine (wakapon). PerlMachine is a crazy project: a minimal Linux kernel designed solely to run perl. He
YAPC::Asia has become huge. This year we've got 550 registrations, and I think this is one of the biggest YAPCs ever. However, our organizing team has been getting smaller year by year, maybe because we knew we could do this. I live in San Francisco, USA and have organized the conference remotely for the past two years, just as any project manager does for a project. That means, the real tough work has b
The YAPC::Asia 2008 organizers would like to thank Eric Cholet, the author of ACT, for the great conference organizing software that powers most YAPCs and Perl Workshops. To show our appreciation the hacker's way, I'm flying to Paris, France next weekend (April 25-28), funded by YAPC::Asia's possible profit, to work on Act feature enhancements. We plan to work on these things because we want them for Y
(Editorial: Don't frontpage this post, editors. I'm writing it down here to summarize my thoughts, wanting feedback from my trusted readers and NOT flame wars or another giant thread of UTF-8 flag woes.) Only recently can I finally say that I fully grok Unicode, the UTF-8 flag and all that stuff in Perl. Here is some analysis of how Perl programmers understand Unicode and the UTF-8 flag. (This post might
UPDATE: The module was originally written using constant overloading, but that is a dangerous and gross hack, so I changed it to use the autobox framework instead (wondering why I didn't try that at first!). I updated the post accordingly. Rails has ActiveSupport, which adds funky methods to Ruby core objects so that you can do fancy things like 2.months.ago to get a Time duration object etc. I found it pretty i
Today I had an interesting report from a Web::Scraper user, saying that he has a script that runs really quickly (less than 1 sec) on a MacBook but very slowly (50 secs) on a dual-CPU AMD machine. Here's the dprof report:
Total Elapsed Time = 47.32165 Seconds
User+System Time = 31.07165 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
51.6 16.03 16.033 6922 0.0023 0.0023 XML::XPathEn
#!/usr/bin/perl
use strict;
use Web::Scraper;
use URI;

my $uri = URI->new("http://wikisubtitles.net/ajax_loadShow.php?show=65&season=3");
my $scraper = scraper {
    process '//td[@class="idioma"][text()=~"English \(US\)"]/..//a', 'links[]' => '@href';
};
my $result = $scraper->scrape($uri);
Question: Is it possible to annotate/tag each CPAN module update so that we can figure out whether the update contains a "security fix", "minor bug fix" or "major API change" etc.? Context: At work we have a repository of third-party CPAN modules that we use on Vox or TypePad. Once a module is added to the list, we manually follow the changes of each module to figure out if we need to upgrade (ala fix for
Some websites require you to log in with your credentials to view the content. That's easily scriptable with WWW::Mechanize, but if you visit the site frequently with your browser, why not reuse the browser's cookies so that you don't need to script the login process? Web::Scraper allows you to call methods on, or entirely swap, its UserAgent object when it scrapes the website. Here's how to
search.cpan.org has an RSS feed for recently uploaded modules, but there's one minor problem: the feed doesn't have rich metadata. Daisuke Murase (aka typester on CPAN and IRC) created a site called CPAN Recent Changes a while ago, and it's been really useful for people tracking activity on CPAN. The feature the site provides is very simple: "a better recent change log for CPAN". The site track
Lazyweb, is there a module to debug your regular expression by comparing the target string and an input regular expression one byte at a time? It'd be useful if you have existing code that does a pattern match against a big chunk of string and you don't know why it doesn't match. use Regexp::Debug; my $string = "abcdefg"; my $regexp = qr/abcefg/; # Notice 'd' is missing my $result = Regexp::Debug->compare
Web::Scraper with filters, and thought about Text filters. A developer release of Web::Scraper has been pushed to CPAN, with "filters" support. Let me explain for a bit how this filters stuff is useful. Since an early version, Web::Scraper has had a pretty neat callback mechanism, so you can extract "data" out of HTML, not limited to strings. For instance, if you have an HTML
This Monday (Japan time) we had Shibuya.pm Tech Talks #8, and we live-streamed and recorded most talks on ustream.tv under the shibuya.pm tag. This ustream.tv listing works great, but we want to make the Flash video files available for download, possibly as a videocast (RSS 2.0) feed, so you can subscribe to it using an offline video player like Miro or iTunes. So we were chatting on #plagger-ja IRC chann
Shibuya Perl Mongers tech talks #8 was very successful, and there were lots of fun talks like gugod's JavaScript::Writer, takesako's image hack to detect browsers' img tag bugs etc. My slides are available on slideshare.net as always, along with other Shibuya.pm slides. Videos were recorded on ustream and are available via the shibuya.pm tag, thanks to Yappo and otsune.
While I'm in Japan for two weeks, the Shibuya Perl Mongers folks have kindly set up another technical meeting on the night of 10/1. See the details of the meeting; we're going to have lots of interesting talks, including a rerun of my well-received YAPC::EU Web::Scraper talk and gugod's talk about JavaScript::Writer, since he's visiting Japan from Taiwan at the same time as I'm here. Shibuya.pm tech talk event gets
Web::Scraper 0.15 has been pushed and is on its way to CPAN. The major enhancement in this release is the ability to work with the UserAgent object used to retrieve the content. Now you can use $Web::Scraper::UserAgent or the $scraper->user_agent accessor to call various methods such as agent() or env_proxy(), or change the object itself to something else, like LWPx::ParanoidAgent. Wait for CPAN
Web::Scraper 0.14 is released with a couple of neat features. First of all, I incorporated HTML::Tagset's linkElements hash into the '@attr' accessor of elements, so if you do this: $s = scraper { process "a", "links[]" => '@href' }; $s->scrape(URI->new("http://www.example.com/")); because a@href is known to be a link attribute, the values are automatically converted to absolute URIs using http://www.exampl
This is inspired by an email from Renée Bäcker asking how to get the content inside a javascript tag. Because Web::Scraper's 'TEXT' mapping calls the as_text method of HTML::Element, it doesn't get the content inside script and style tags. Here's the code that works. It's kinda clumsy, and it'd be nice if there were a much cleaner way to do this: #!/usr/bin/perl # extract Javascript code into 'code' use strict; u
I'm trying to put some neat cookbook things using Web::Scraper on this journal. They'll eventually be incorporated into the module documentation as something like Web::Scraper::Cookbook, but I'll post them here for now since it's easy to update and give a permalink to. The easiest way to keep up with these hacks would be to subscribe to the RSS feed of this journal, or look at my del.icio.us links tagged 'webscraper' (w
Modules like HTML::TreeBuilder::XPath and HTML::Selector::XPath are very useful for extracting content from an HTML DOM tree using XPath expressions or CSS selectors. These modules do the following: HTML DOM Tree + XPath expression => The element you want. Is there a way to do this the other way round? I mean: HTML DOM Tree + The element you want => XPath expression. I know Mozilla extension allows to do this with GUI
My employer Six Apart (http://www.sixapart.com/), which is a really cool company and a fun place to work, is now looking for a full-time, on-site (San Francisco, CA) software engineer for our newest blogging platform Vox (http://www.vox.com/). Required skills are: excellent Perl, database (MySQL) and preferably some UI skills like JavaScript and HTML. If you're familiar with webapp frameworks like C
Web::Scraper is released, the Perl port of Scrapi.rb. Today I've been thinking about what to talk about at YAPC::EU (and OSCON if they're short of Perl talks, I'm not sure), and ended up spending a few hours hacking on a web-content scraping module that uses a domain-specific language. With help from the guys on the IRC channel and obra, who gave a nice talk about DSLs in Perl at YAPC::Asia, I whipped up a really small We
Investigating the UTF-8 on/off performance issues for work. Template::Stash::ForceUTF8 makes utf8_on calls on all stash variables, which could be critical to performance for a large-scale site like ours. Looking through the TT source code, I came up with a simple hack: add 'use encoding "utf-8"' to the code generated by TT. This solves the utf-8 bytes and string concatenation (auto-upgrades from lati
I want a name for my new module, which automatically detects the best conservative encoding to use in email messages, based on the strings. It'll be useful for encoding an email message in iso-2022-jp if all the content is in Japanese, iso-2022-kr for Korean etc. Gmail does it by default: http://mail.google.com/support/bin/answer.py?ctx=gmail&hl=en&answer=22841 I'm thinking of Encode::Email::Best and Encode
Data::ObjectDriver, yet another ORM that we use at Six Apart, now uses Class::Trigger, and our developer teams on Vox and Movable Type have reported a bug in the module where triggers added to the superclass *after* triggers are added to the child class are ignored in the child classes. I fixed the module by applying the patch from Brad Choate and uploaded the new version 0.10_01 to CPAN. Meanwhi
Here's my latest module, a spin-off from the Plagger project: XML::OPML::LibXML. It's now on its way to CPAN and is available at my SVN repo http://svn.bulknews.net/repos/public/XML-OPML-LibXML/trunk/ There's already a similar module, XML::OPML, on CPAN, but we don't like it because it doesn't have a decent API to access parsed OPML data (while it has a good API to create new OPML data programaticall
http://svn.bulknews.net/repos/public/HTTP-Response-Charset/trunk/ So I created a module, HTTP::Response::Charset, which detects the charset of an HTTP response using various techniques (Content-Type, META tag, BOM, XML declaration and Encode::Detect). The motivation is to get a correctly decoded Unicode string from any HTTP response, especially text/html, text/plain, XHTML and RSS/Atom. The POD document has
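The detection techniques listed in the HTTP::Response::Charset entry above can be sketched roughly as follows, in core Perl only. This is not the module's API: guess_charset is a hypothetical helper, the order in which the checks run is an assumption, the regular expressions are deliberately simplified, and the Encode::Detect fallback is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::Response::Charset): guess a charset
# from a Content-Type header value and the raw bytes of the response body.
sub guess_charset {
    my ($content_type, $body) = @_;

    # 1. charset parameter of the Content-Type header
    if (defined $content_type
        && $content_type =~ /charset\s*=\s*"?([\w.:-]+)/i) {
        return $1;
    }

    # 2. META tag, either http-equiv style or <meta charset="...">
    if ($body =~ /<meta\b[^>]*\bcharset\s*=\s*["']?([\w.:-]+)/i) {
        return $1;
    }

    # 3. Byte order mark (simplified: UTF-32 BOMs are not distinguished)
    return 'UTF-8'    if $body =~ /^\xEF\xBB\xBF/;
    return 'UTF-16BE' if $body =~ /^\xFE\xFF/;
    return 'UTF-16LE' if $body =~ /^\xFF\xFE/;

    # 4. encoding attribute of an XML declaration
    if ($body =~ /^\s*<\?xml[^>]*\bencoding\s*=\s*["']([\w.:-]+)["']/i) {
        return $1;
    }

    return;   # nothing found; a real implementation might fall back to a detector
}

print guess_charset('text/html; charset=Shift_JIS', ''), "\n";
print guess_charset('text/html', '<meta charset="utf-8">'), "\n";
```

The returned name can then be handed to Encode::decode to turn the body bytes into a Perl Unicode string, provided Encode recognizes that name.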