jakevdp.github.io
from IPython.display import Image Image('http://jakevdp.github.com/figures/xkcd_version.png') Sometimes when showing schematic plots, this is the type of figure I want to display. But drawing it by hand is a pain: I'd rather just use matplotlib. The problem is, matplotlib is a bit too precise. Attempting to duplicate this figure in matplotlib leads to something like this: It just doesn't have the
This is an excerpt from the Python Data Science Handbook by Jake VanderPlas; Jupyter notebooks are available on GitHub. The text is released under the CC-BY-NC-ND license, and code is released under the MIT license. If you find this content useful, please consider supporting the work by buying the book! This chapter has explored a number of the central concepts and algorithms of machine learning.
PdVega: Interactive Vega-Lite Plots for Pandas pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas dataframes, using an API that is nearly identical to Pandas’ built-in plotting API, and designed for easy use within the Jupyter notebook. import pandas as pd import numpy as np data = pd.DataFrame({'x': np.random.randn(200), 'y': np.random.randn(200)}) impo
Edit 12/19/2017: added a new subsection on analyzing Chutes & Ladders as an Absorbing Markov Chain. This weekend I found myself in a particularly drawn-out game of Chutes and Ladders with my four-year-old. If you've not had the pleasure of playing it, Chutes and Ladders (also sometimes known as Snakes and Ladders) is a classic kids board game wherein players roll a six-sided die to advance forward
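As a rough illustration of the absorbing-Markov-chain view mentioned in this entry, here is a minimal sketch that computes the expected number of turns to finish a game from its transition matrix. The five-square board and the two-sided "die" below are made up for illustration and are not the real Chutes and Ladders board; the helper name expected_turns is likewise hypothetical.

```python
import numpy as np

def expected_turns(P, absorbing):
    """Expected number of steps to absorption from each transient state.

    P is a row-stochastic transition matrix; `absorbing` is the set of
    absorbing state indices. Solves (I - Q) t = 1, where Q is the
    transient-to-transient block of P.
    """
    P = np.asarray(P, dtype=float)
    transient = [i for i in range(P.shape[0]) if i not in absorbing]
    Q = P[np.ix_(transient, transient)]
    ones = np.ones(len(transient))
    t = np.linalg.solve(np.eye(len(transient)) - Q, ones)
    return dict(zip(transient, t))

# Toy board (an assumption, not the real game): squares 0..4, each turn
# moves 1 or 2 squares with equal probability, overshoot lands on square 4.
n = 5
P = np.zeros((n, n))
for square in range(n - 1):
    for step in (1, 2):
        P[square, min(square + step, n - 1)] += 0.5
P[n - 1, n - 1] = 1.0  # the final square is absorbing

turns = expected_turns(P, absorbing={n - 1})
print(turns[0])  # expected turns when starting from square 0
```

A real board would only change how P is filled in: each chute or ladder redirects the probability mass of its starting square to its destination square.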
I've found one of the best ways to grow in my scientific coding is to spend time comparing the efficiency of various approaches to implementing particular algorithms that I find useful, in order to build an intuition of the performance of the building blocks of the scientific Python ecosystem. In this vein, today I want to take a look at an operation that is in many ways fundamental to data-driven
This website contains the full text of the Python Data Science Handbook by Jake VanderPlas; the content is available on GitHub in the form of Jupyter notebooks. The text is released under the CC-BY-NC-ND license, and code is released under the MIT license. If you find this content useful, please consider supporting the work by buying the book!
I just got home from my sixth PyCon, and it was wonderful as usual. If you weren't able to attend—or even if you were—you'll find a wealth of entertaining and informative talks on the PyCon 2017 YouTube channel. Two of my favorites this year were a complementary pair of talks on Python dictionaries by two PyCon regulars: Raymond Hettinger's Modern Python Dictionaries A confluence of a dozen great
I've spent much of the last decade using Python for my research, teaching Python tools to other scientists and developers, and developing Python tools for efficient data manipulation, scientific and statistical computation, and visualization. The Python-for-data landscape has changed immensely since I first installed NumPy and SciPy via a flickering CRT display. Among the new developments sin
For a more up-to-date comparison of Numba and Cython, see the newer post on this subject. Often I'll tell people that I use python for computational analysis, and they look at me inquisitively. "Isn't python pretty slow?" They have a point. Python is an interpreted language, and as such cannot natively perform many operations as quickly as a compiled language such as C or Fortran. There is also th
Last week Michael Lerner posted a nice explanation of the relationship between histograms and kernel density estimation (KDE). I've made some attempts in this direction before (both in the scikit-learn documentation and in our upcoming textbook), but Michael's use of interactive javascript widgets makes the relationship extremely intuitive. I had been planning to write a similar post on the theory
Optimizing Python in the Real World: NumPy, Numba, and the NUFFT Donald Knuth famously quipped that "premature optimization is the root of all evil." The reasons are straightforward: optimized code tends to be much more difficult to read and debug than simpler implementations of the same algorithm, and optimizing too early leads to greater costs down the road. In the Python world, there is another
Matplotlib version 1.1 added some tools for creating animations which are really slick. You can find some good example animations on the matplotlib examples page. I thought I'd share here some of the things I've learned when playing around with these tools. Basic Animation The animation tools center around the matplotlib.animation.Animation base class, which provides a framework around which the a
I made a little code snippet that I find helpful, and you might too: def grayify_cmap(cmap): """Return a grayscale version of the colormap""" cmap = plt.cm.get_cmap(cmap) colors = cmap(np.arange(cmap.N)) # convert RGBA to perceived greyscale luminance # cf. http://alienryderflex.com/hsp.html RGB_weight = [0.299, 0.587, 0.114] luminance = np.sqrt(np.dot(colors[:, :3] ** 2, RGB_weight)) colors[:, :3
I've been spending a lot of time recently writing about frequentism and Bayesianism. In Frequentism and Bayesianism I: a Practical Introduction I gave an introduction to the main philosophical differences between frequentism and Bayesianism, and showed that for many common problems the two methods give basically the same point estimates. In Frequentism and Bayesianism II: When Results Differ I wen
This post is part of a 5-part series: Part I Part II Part III Part IV Part V See also Frequentism and Bayesianism: A Python-driven Primer, a peer-reviewed article partially based on this content. In Douglas Adams' classic Hitchhiker's Guide to the Galaxy, hyper-intelligent pan-dimensional beings build a computer named Deep Thought in order to calculate "the Answer to the Ultimate Question of Life,
This post is part of a 5-part series: Part I Part II Part III Part IV Part V See also Frequentism and Bayesianism: A Python-driven Primer, a peer-reviewed article partially based on this content. In a previous post I gave a brief practical introduction to frequentism and Bayesianism as they relate to the analysis of scientific data. In it, I discussed the fundamental philosophical difference betwe
We've all heard it before: Python is slow. When I teach courses on Python for scientific computing, I make this point very early in the course, and tell the students why: it boils down to Python being a dynamically typed, interpreted language, where values are stored not in dense buffers but in scattered objects. And then I talk about how to get around this by using NumPy, SciPy, and related tools
This is a bit of a niche topic, but I figured there might be one or two people out there who would find this useful (including my future self)... today I managed to implement a simple Python object which exposes the buffer protocol. If that means nothing to you, you may want to stop reading and instead browse this gallery of puppy gifs. But if you're the kind of person who becomes mildly excited a
This post is part of a 5-part series: Part I Part II Part III Part IV Part V See also Frequentism and Bayesianism: A Python-driven Primer, a peer-reviewed article partially based on this content. One of the first things a scientist hears about statistics is that there are two different approaches: frequentism and Bayesianism. Despite their importance, many scientific researchers never have oppo
Update, March 2014: there are some major changes and refactorings in mpld3 version 0.1. Because of this, some of the code below will not work with the current release: please see the mpld3 documentation for more information. It's been a few weeks since I introduced mpld3, a toolkit for visualizing matplotlib graphics in-browser via d3, and a lot of progress has been made. I've added a lot of featu
I've spent a lot of time recently attempting to push the boundaries of tools for interactive data exploration within the IPython notebook. I have worked on animations, including an HTML5 embedding and a Javascript Viewer. I have worked on javascript/python kernel interaction and static javascript widgets. But I would say that the holy grail of interactive data visualization in the IPython notebook
The Fast Fourier Transform (FFT) is one of the most important algorithms in signal processing and data analysis. I've used it for years, but having no formal computer science background, it occurred to me this week that I've never thought to ask how the FFT computes the discrete Fourier transform so quickly. I dusted off an old algorithms book and looked into it, and enjoyed reading about the dece
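To make the question in this entry concrete, here is a minimal sketch of one standard answer: the radix-2 Cooley-Tukey recursion, which splits a length-N transform into two length-N/2 transforms over the even- and odd-indexed samples. This is an illustrative implementation under the assumption that the input length is a power of two; it is not taken from the post itself, and the function names dft_slow and fft_recursive are chosen here for the example.

```python
import numpy as np

def dft_slow(x):
    """Direct O(N^2) discrete Fourier transform via a matrix product."""
    x = np.asarray(x, dtype=complex)
    N = x.shape[0]
    n = np.arange(N)
    k = n.reshape((N, 1))
    M = np.exp(-2j * np.pi * k * n / N)
    return M @ x

def fft_recursive(x):
    """Radix-2 Cooley-Tukey FFT; input length must be a power of two."""
    x = np.asarray(x, dtype=complex)
    N = x.shape[0]
    if N < 1 or N & (N - 1):
        raise ValueError("length of x must be a power of 2")
    if N <= 2:
        # Small base case: fall back to the direct transform.
        return dft_slow(x)
    even = fft_recursive(x[::2])   # transform of even-indexed samples
    odd = fft_recursive(x[1::2])   # transform of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # Combine the half-size results into the full-size transform.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.arange(16, dtype=float)
print(np.allclose(fft_recursive(x), np.fft.fft(x)))
```

Because each level of the recursion does O(N) work and there are log2(N) levels, the total cost is O(N log N) rather than the O(N^2) of the direct matrix product.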
One of the most consistently popular posts on this blog has been my XKCDify post, where I followed in the footsteps of others to write a little hack for xkcd-style plotting in matplotlib. In it, I mentioned the Sketch Path Filter pull request that would eventually supersede my ugly little hack. Well, "eventually" has finally come. Observe: inline Welcome to pylab, a matplotlib-based Python environ
Last summer I wrote a post comparing the performance of Numba and Cython for optimizing array-based computation. Since posting, the page has received thousands of hits, and resulted in a number of interesting discussions. But in the meantime, the Numba package has come a long way both in its interface and its performance. Here I want to revisit those timing comparisons with a more recent Numba rel
Last week, Fernando Perez visited UW to give a talk for the eScience institute. Over lunch we were discussing the possibility of building a Javascript-based animation viewer which could be embedded in IPython notebooks. I had written a short hack to embed mp4 movies in IPython, which works quite well: Michael Kuhlen of Berkeley ran with the idea and made this notebook, which embeds a 3D rendering
I recently submitted a scikit-learn pull request containing a brand new ball tree and kd-tree for fast nearest neighbor searches in python. In this post I want to highlight some of the features of the new ball tree and kd-tree code that's part of this pull request, compare it to what's available in the scipy.spatial.cKDTree implementation, and run a few benchmarks showing the performance of these