Swimming upstream on the technology tide, one technology at a time. A collection of articles, tips, and random musings on application development and system design.

I've been using Jericho to parse HTML for a while now, mainly to extract pieces of text from specific locations in the HTML. To do this, I use the Jericho API; I have factored out the boilerplate code associated with XML/HTML