So I wrote a program to guess how much is paper:
and how much is drawing:
Wow! I'm surprised it worked! I barely had to change my initial guess at an algorithm; mainly, I just converted what had been integer arithmetic to floating point so I could mess around with tiny timesteps. Speaking of which, yeah, how does it work?
This way: to find out what the image looks like with the drawing erased, we physically simulate a sheet of cloth, which "falls down" onto the graph of the image, where white is high and black is low. Tension in the fabric will prevent it from filling black valleys, so we'll get a surface that fits neatly on the highest parts of the image but stays far above the low parts. In other words, we will have whitened out the black areas in a smooth way consistent with the bad lighting of the white areas. The fabric is then a graph of the paper beneath the drawing!
In particular, we want to minimize the energy $V:\text{bitmaps} \to \mathbb{R}$: $$\text{to do} = \text{fall down} + \text{catch on original} + \text{don't warp}$$ $$V(p) = m p + k \min\{0, \, (p-\text{original})\}^2 + s (\nabla^2 p)^2$$ (each term summed over the pixels of $p$; the middle term is nonzero only where the cloth has sunk below the original image) and so descend along the gradient $$\text{paper} \leftarrow \text{paper} - \epsilon \nabla V(\text{paper})$$ as the following slides show:
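For the curious, here is a minimal NumPy sketch of that descent. It is not my actual program, just one way to write the update rule down. The function names, the parameter values, the wrap-around edges, and the choice to start the cloth flat at the brightest pixel are all assumptions made for illustration. With wrap-around edges the discrete Laplacian is symmetric, so the gradient of the "don't warp" term is simply the Laplacian applied twice.

```python
import numpy as np

def laplacian(p):
    """5-point discrete Laplacian with wrap-around (periodic) edges (an assumption)."""
    return (np.roll(p, 1, 0) + np.roll(p, -1, 0)
            + np.roll(p, 1, 1) + np.roll(p, -1, 1) - 4.0 * p)

def energy(p, original, m, k, s):
    """V(p) summed over pixels: fall down + catch on original + don't warp."""
    below = np.minimum(0.0, p - original)  # nonzero only where cloth is under the image
    return float(np.sum(m * p + k * below**2 + s * laplacian(p)**2))

def grad_energy(p, original, m, k, s):
    """Gradient of V with respect to each pixel of p."""
    below = np.minimum(0.0, p - original)
    return m + 2.0 * k * below + 2.0 * s * laplacian(laplacian(p))

def estimate_paper(original, m=0.01, k=1.0, s=1.0, eps=0.01, steps=2000):
    """Drop a flat cloth from the brightest level and descend the gradient.

    The defaults are illustrative guesses; eps is kept below 2 / (2k + 128s)
    so that the explicit update stays stable.
    """
    paper = np.full_like(original, original.max(), dtype=float)
    for _ in range(steps):
        paper -= eps * grad_energy(paper, original, m, k, s)
    return paper

# Toy image: light grey paper (0.8) with one dark vertical stroke (0.1).
original = np.full((16, 16), 0.8)
original[:, 8] = 0.1
paper = estimate_paper(original)
```

On the toy image the cloth should settle just under the paper level and bridge over the stroke instead of sinking into it. The tiny timestep matters here: the "don't warp" term is stiff, so a much larger `eps` would make this explicit update blow up.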