In this post I recover a Bitcoin passphrase by performing a Breadth-First search on typos of increasing Damerau-Levenshtein distances from an initial guess. This order was chosen because I show that Damerau-Levenshtein distances in typos follow a Zipf distribution, so the search should converge faster. This improves on (but does not replace) BTCRecover, which has a limited definition of a typo. The code is simple to read and modify, and is available on GitHub.

## Succinct de Bruijn Graphs

This post will give a brief explanation of a Succinct implementation for storing de Bruijn graphs, which is recent (and continuing) work I have been doing with Sadakane. Using our new structure, we have squeezed a graph for a human genome (which took around 300 GB of memory if using previous representations) down into 2.5 […]

## Failing at Google Interviews

I’ve participated in about four sets of Google interviews (of about 3 interviews each) for various positions. I’m still not a Googler though, which I guess indicates that I’m not the best person to give this advice. However, I think it’s about time I put in my 0.002372757892 bitcoins. I recently did exactly this to […]

## FM-Indexes and Backwards Search

Last time (way back in June! I have got to start blogging consistently again) I discussed a gorgeous data structure called the Wavelet Tree. When a Wavelet Tree is stored using RRR sequences, it can answer rank and select operations in $\mathcal{O}(\log{A})$ time, where A is the size of the alphabet. If the size of […]

## Wavelet Trees – an Introduction

Today I will talk about an elegant way of answering rank queries on sequences over larger alphabets – a structure called the Wavelet Tree. In my last post I introduced a data structure called RRR, which is used to quickly answer rank queries on binary sequences, and provide implicit compression. A Wavelet Tree organises a […]

## RRR – A Succinct Rank/Select Index for Bit Vectors

This blog post will give an overview of a static bitsequence data structure known as RRR, which answers arbitrary length rank queries in $\mathcal{O}(1)$ time, and provides implicit compression. As my blog is informal, I give an introduction to this structure from a birds eye view. If you want, read my thesis for a version […]