There have been many Python libraries developed for interacting with the Hadoop File System, HDFS, via its WebHDFS gateway as well as its native Protocol Buffers-based RPC interface. I'll give you an overview of what's out there and show some engineering I've been doing to offer a high performance HDFS interface within the developing Arrow ecosystem. This blog is a follow up to my 2017 Roadmap pos