A Bloom filter is a probabalistic data structure which provides an extremely space-efficient method of representing large sets. It can have false positives but never false negatives which means a query returns either "possibly in set" or "definitely not in set". You can tune a Bloom filter to the desired error rate, it's basically a tradeoff between size and accuracy (See example: http://hur.st/bl

