© 2006-2015 Andrew Cooke (site) / post authors (content).

Why and How Writing Crypto is Hard

From: andrew cooke <andrew@...>

Date: Tue, 25 Dec 2012 18:56:30 -0300

Over the last few days I wrote a simple library to encrypt data in Python.
This blog post describes my experience writing that code.  I focus on the
various mistakes I made, and try to understand the underlying causes.

But first a little context.  I'm aware of the phrase (exhortation? slogan?)
"Typing The Letters A-E-S Into Your Code? You’re Doing It Wrong"
but I couldn't find a Python 3 library that let me encrypt a string using a
simple password.

So I decided to go ahead, write the code, and then solicit feedback.  If I
had made any mistakes then perhaps someone else would correct me, and the
result would be something other people could use.

To be honest, when I started, I thought I could do a pretty good job.  I've
worked with security-related code several times (a JNI wrapper for OpenSSL
back in the day; more recently, for example, making OpenSSH talk to hardware
key stores) and I thought a fair amount of crypto knowledge had "rubbed off" -
I can explain what CTR mode is, for example, and why you should never use the
same key+IV twice.  And also, I am not so dumb; how hard can this stuff be?

Even so, I searched around for some guidance on best practice.  And I was
lucky enough to stumble across a set of guidelines, which I decided to follow.

My first attempt was broken (although I eventually found the mistake myself).
It had exactly the vulnerability I said I could explain above: messages with
the same key used the same counter sequence.  This was because the "iv"
parameter in the pycrypto Cipher API is ignored in CTR mode.  Instead, you
need to provide the data to the Counter object.
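
To see why a repeated counter sequence is so damaging, here is a toy sketch
using only the standard library.  It is not pycrypto and not AES: the
keystream is built from HMAC-SHA256, and `keystream` and `xor` are
illustrative names.  The only point is that two messages encrypted with the
same key and the same starting counter leak the XOR of their plaintexts.

```python
import hashlib
import hmac
import os


def keystream(key, nonce, length):
    """Toy CTR-style keystream: HMAC-SHA256(key, nonce || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    """XOR two byte strings, stopping at the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


key = os.urandom(32)
nonce = b"\x00" * 8          # the bug: same starting point for every message

p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = xor(p1, keystream(key, nonce, len(p1)))
c2 = xor(p2, keystream(key, nonce, len(p2)))

# The keystream cancels out: an eavesdropper learns p1 XOR p2 without the key.
leaked = xor(c1, c2)
assert leaked == xor(p1, p2)

# With a fresh random nonce per message the keystreams differ.
c3 = xor(p2, keystream(key, os.urandom(8), len(p2)))
```

In a real cipher the same reasoning applies to whatever value seeds the
counter, whether the API calls it an IV, a nonce, or a counter prefix.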

I don't know if I am being muddle-headed in thinking of the initial counter
value as an IV, but I was a little annoyed with pycrypto.  Couldn't it throw
an error if it's given an IV in CTR mode, instead of simply ignoring it?  On
the other hand it doesn't seem fair to expect a library of crypto primitives
to educate users - it's intended for experts, who should know what they are
doing.

Anyway, that was my first mistake.  The root cause being, I think, that crypto
APIs are complex because they provide access to powerful primitives that can
be combined in many ways, but which, at the same time, must also be efficient
(the need for efficiency affects the design of Counter, for example, which is
why the IV is ignored).  A box of sharp tools.

Next, I started to worry about the API for *my* users.  I couldn't really
expect them to provide a 256 bit key; this was a library for "anyone".  So
it had to take something more like a password.

Unfortunately, although I knew about key derivation functions, which is what
you need to go from password to key, I thought they were used only for
storing passwords.  I have no idea why I thought this, but as a consequence
I started to cobble together my own hand-rolled attempt at key strengthening.

Thankfully, as my code got more complex, I realised I must be reinventing an
already-existing technique.  Once I was convinced of that, finding PBKDF2 (it
was mentioned in the link I said I would follow - although nowhere near the
paragraph on symmetric ciphers) was easy.

So mistake 2 (which I eventually avoided) was not knowing about an existing
solution to a common problem.  Or rather, not knowing that it could be
applied in a more general sense than I had understood.
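
As a rough illustration of going from password to key, the sketch below uses
`hashlib.pbkdf2_hmac` from the Python standard library.  The iteration count,
salt size and the `derive_key` name are assumptions made for the example, not
values taken from simple-crypt.

```python
import hashlib
import os

ITERATIONS = 100000    # assumed work factor; tune for your hardware
SALT_BYTES = 16        # assumed salt size


def derive_key(password, salt=None):
    """Return (salt, 32-byte key) derived with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)   # fresh random salt when encrypting
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                              salt, ITERATIONS, dklen=32)
    return salt, key


salt, key = derive_key("correct horse battery staple")
# Decryption repeats the derivation with the salt stored next to the message.
_, same_key = derive_key("correct horse battery staple", salt)
assert key == same_key
```

The salt is not secret, so it can travel with the ciphertext; the caller only
ever supplies the password.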

At this point I believed my code was pretty solid so I posted it to HN.

It took a while to get useful feedback, but when I did, it was awesome.  So
awesome it identified FIVE more problems.  Ouch.

  1.  Don't expose salt in the API.
  2.  Use separate keys for cipher and HMAC.
  3.  Avoid a possible timing attack when comparing HMACs.
  4.  Manage the counter in a standard (NIST) way.
  5.  PBKDF was using a weaker hash than expected.

The first (user gives salt) is plain embarrassing - it's just bad API design.
If I can blame anything other than incompetence, salt appeared in the original
API because it "seemed odd" to generate data and then append it to the
message.  I felt that even though it is how you handle the IV (and, in fact,
the final code uses the same data for both salt and IV).  So it's not a
particularly logical explanation for my mistake, but it's all I have.

The second (separate keys) was an open question - I just didn't know what
best practice was.  So lack of experience there.
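
One common way to get separate keys, sketched here under assumptions, is to
derive both from a single master key using HMAC with different labels.  The
labels `b"cipher"` and `b"mac"` and the `split_keys` name are hypothetical,
and this is not necessarily how simple-crypt does it.

```python
import hashlib
import hmac


def split_keys(master_key):
    """Derive independent cipher and MAC keys from one master key."""
    cipher_key = hmac.new(master_key, b"cipher", hashlib.sha256).digest()
    mac_key = hmac.new(master_key, b"mac", hashlib.sha256).digest()
    return cipher_key, mac_key


# Example master key; in practice this comes from the key derivation step.
master = hashlib.pbkdf2_hmac("sha256", b"password", b"0123456789abcdef",
                             100000)
cipher_key, mac_key = split_keys(master)
assert cipher_key != mac_key
```

With distinct keys, a weakness in how one primitive uses its key does not
automatically carry over to the other.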

The third (timing attack) was a subtle implementation detail I would never
have noticed.  A lack of knowledge of the current literature.
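
A minimal sketch of the usual fix, assuming HMAC-SHA256 tags: compare with
`hmac.compare_digest` (in the standard library since Python 3.3) rather than
`==`.  The `verify` helper is an illustrative name.

```python
import hashlib
import hmac


def verify(mac_key, message, tag):
    """Check an HMAC-SHA256 tag without an early-exit comparison."""
    expected = hmac.new(mac_key, message, hashlib.sha256).digest()
    # A plain `expected == tag` may return as soon as one byte differs,
    # which can leak how many leading bytes were correct.
    return hmac.compare_digest(expected, tag)


mac_key = b"k" * 32
message = b"ciphertext bytes"
good_tag = hmac.new(mac_key, message, hashlib.sha256).digest()
bad_tag = bytes([good_tag[0] ^ 1]) + good_tag[1:]

assert verify(mac_key, message, good_tag)
assert not verify(mac_key, message, bad_tag)
```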

Fourth (counter management) was more damning.  I already knew the
normal way to handle counters, from using CTR mode to generate a stream
of random data in another project.  I thought I was being smart and improving
things by using a different approach (yes, I know that sounds like the kind
of thing a newbie would say, but I thought it *despite knowing that*).
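
For reference, one layout along the lines of NIST SP 800-38A puts a
per-message nonce in the leading half of each counter block and an
incrementing counter in the trailing half.  The sketch below assumes a 16
byte block split 8/8; `counter_blocks` is an illustrative helper, not a
library function.

```python
import os

BLOCK_BYTES = 16           # AES block size
NONCE_BYTES = 8            # assumed split: half nonce, half counter


def counter_blocks(nonce, n_blocks):
    """Yield n_blocks counter blocks of the form nonce || big-endian counter."""
    if len(nonce) != NONCE_BYTES:
        raise ValueError("nonce must be %d bytes" % NONCE_BYTES)
    counter_bytes = BLOCK_BYTES - NONCE_BYTES
    if n_blocks > 2 ** (8 * counter_bytes):
        raise ValueError("message too long: counter would wrap")
    for i in range(n_blocks):
        yield nonce + i.to_bytes(counter_bytes, "big")


nonce = os.urandom(NONCE_BYTES)   # must be unique per message under a given key
blocks = list(counter_blocks(nonce, 3))
```

Each block is encrypted under the key to produce one block of keystream, so
no counter block may ever repeat for the same key.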

Fifth (weaker hash) I blame partly on the pycrypto API (again) (the way that
the hash is exposed is rather obscure), but also on a lack of familiarity
with key derivation standards - I didn't know that the MAC was a likely

So, in one simple piece of crypto code I had a total of seven errors (so far).
The sources of error were:

  *  Being unaware of existing solutions to common problems.
  *  Being unaware of existing best practices.
  *  Misunderstanding the complex API of a crypto toolkit.
  *  Bad API design.
  *  Ignoring existing solutions and "improving" things.

The last of these I can't do much about.  In theory I should be smart enough
to not do that.  I guess the lesson there is that sometimes you make even
dumber mistakes than you expect.

The rest divide nicely into two groups: experience and API design.

I was surprised how important experience was.  Despite having some experience
with security-related code.  Despite having a good set of guidelines on what
to do.  Despite being able to search the Internet.  Despite all that, I still
made mistakes that only experience could spot.

As for API design.  Well, I think that just confirms how important (and hard,
and overlooked) API design is.

So, what are the conclusions?  Experience and API design matter.  And even
when you are aware of the kind of pitfalls that face people that write crypto
code, you can still make dumb mistakes.


PS The current library is at https://github.com/andrewcooke/simple-crypt

I can relate to that ...

From: Michiel Buddingh' <michiel@...>

Date: Thu, 27 Dec 2012 07:16:34 +0100

. . . I recently wrote some cryptographic code that encrypted some
very short (10-20 byte) messages.  There was a requirement that we'd
be able to decrypt any of these messages individually, without having
access to the other messages.

And so, I recycled the iv, and I didn't even bother with key
strengthening, knowing well that whoever reads this code in ten years
is going to think me an idiot.  But of course, 1) I really couldn't
justify the time to do it properly 2) we were just trying to
discourage onlookers, not thwart the NSA.

What still bothers me about that situation, though, is that, for all I
know, recycling the iv is the worst compromise to make; there might be
cleverer ways to accomplish what I was trying to do.

 . . . the thing is, the cryptography sector doesn't "do" trade-offs;
your security is either resilient to a government agency running a
chosen-plaintext attack on their FPGA cluster, or it's considered
embarrassingly broken.

The very people who do have the capability to write high-level APIs,
to make sensible trade offs in designing algorithms and approaches to
security problems also have a, seemingly cultural, inhibition against


Re: I can relate to that ...

From: andrew cooke <andrew@...>

Date: Thu, 27 Dec 2012 08:55:14 -0300

Space constraints are difficult.  At work they were trying to encrypt the body
of SMS.  I am not sure what happened in the end, but it wasn't looking good.

When it comes to "make it hard, but don't worry if it's not impossible" I feel
like there should be some kind of standard.  Perhaps there is, and it is
ROT13.  And maybe just suggesting that can help, because when people start to
object to ROT13 the same arguments typically apply to anything else that isn't
"proof against government".

Anyway, I just want to emphasise that I fixed all the bugs I discussed, and
simple-crypt, which is now on PyPI http://pypi.python.org/pypi/simple-crypt is
supposed to be able to "thwart the NSA".  Of course, it may still contain bugs
(which is why it is (1) in beta and (2) includes a header in the encrypted
data that will allow a fixed version to be deployed and work even when people
have used a previous, buggy version, should it be needed).
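
The idea of a version header can be sketched as follows.  The two-byte
marker, the version numbers and the function names are hypothetical, chosen
only for illustration; they are not simple-crypt's actual format.

```python
MAGIC = b"ex"              # hypothetical two-byte marker
CURRENT_VERSION = 2        # hypothetical format version


def add_header(payload, version=CURRENT_VERSION):
    """Prefix the encrypted payload with a marker and a one-byte version."""
    return MAGIC + bytes([version]) + payload


def split_header(blob):
    """Return (version, payload), rejecting unknown or newer formats."""
    if len(blob) < 3 or blob[:2] != MAGIC:
        raise ValueError("not a recognised message")
    version = blob[2]
    if version > CURRENT_VERSION:
        raise ValueError("message written by a newer version")
    return version, blob[3:]


version, payload = split_header(add_header(b"encrypted bytes"))
assert (version, payload) == (2, b"encrypted bytes")
```

A decrypting library can then dispatch on `version`, so data written by an
older release stays readable after a fix ships.  Including the header in the
data covered by the MAC would stop an attacker from rewriting it.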


Fixing this

From: Laurens Van Houtven <_@...>

Date: Sun, 11 Aug 2013 10:32:51 +0200

Hi Andrew,

Excellent points, and I agree wholeheartedly.

For the library situation, I've joined some people in writing a library:

Right now, it's mostly just primitives, but the end goal is an API that you
simply couldn't get wrong, which sounds to me like what you wanted in the
first place.

Additionally, I agree that education is lacking. Hence, I'm busy turning my
talk from last year, Crypto 101 (http://pyvideo.org/video/1778/crypto-101)
into a book. Hopefully this will make the journey for future programmers a
little easier :)

I elucidated further in a HN comment:


Re: Why and How Writing Crypto is Hard

From: Teddy Hogeborn <teddy@...>

Date: Sun, 11 Aug 2013 16:24:52 +0200

> but I couldn't find a Python 3 library that let me encrypt a string
> using a simple password.

Well, use GPG for data at rest. You could just simply call GPG on the
command line.  Here's a class I wrote to do just that:

import binascii
import os
import subprocess
import tempfile

class PGPError(Exception):
    """Exception if encryption/decryption fails"""

class PGPEngine(object):
    """A simple class for OpenPGP symmetric encryption & decryption

    Both data and password are byte strings.

    with PGPEngine() as pgp:
        password = b"password"
        data = b"plaintext data"
        crypto = pgp.encrypt(data, password)
        decrypted = pgp.decrypt(crypto, password)
    """
    def __init__(self):
        # Use a throwaway GnuPG home so no user keyring or config is touched.
        self.tempdir = tempfile.mkdtemp()
        # Further options may be needed; GnuPG 2.1+, for example, wants
        # '--pinentry-mode', 'loopback' before it honours --passphrase-file.
        self.gnupgargs = ['--batch',
                          '--home', self.tempdir]
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self._cleanup()
        return False
    def __del__(self):
        self._cleanup()
    def _cleanup(self):
        if self.tempdir is not None:
            # Delete contents of tempdir
            for root, dirs, files in os.walk(self.tempdir,
                                             topdown = False):
                for filename in files:
                    os.remove(os.path.join(root, filename))
                for dirname in dirs:
                    os.rmdir(os.path.join(root, dirname))
            # Remove tempdir
            os.rmdir(self.tempdir)
            self.tempdir = None
    def password_encode(self, password):
        # Passphrase can not be empty and can not contain newlines or
        # NUL bytes.  So we prefix it and hex encode it.
        return b"foo" + binascii.hexlify(password)
    def encrypt(self, data, password):
        passphrase = self.password_encode(password)
        with tempfile.NamedTemporaryFile(dir=self.tempdir
                                         ) as passfile:
            # Hand the passphrase to gpg in a file, not on the command line
            passfile.write(passphrase)
            passfile.flush()
            proc = subprocess.Popen(['gpg', '--symmetric',
                                     '--passphrase-file',
                                     passfile.name]
                                    + self.gnupgargs,
                                    stdin = subprocess.PIPE,
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.PIPE)
            ciphertext, err = proc.communicate(input = data)
        if proc.returncode != 0:
            raise PGPError(err)
        return ciphertext
    def decrypt(self, data, password):
        passphrase = self.password_encode(password)
        with tempfile.NamedTemporaryFile(dir = self.tempdir
                                         ) as passfile:
            # Same passphrase file mechanism as in encrypt()
            passfile.write(passphrase)
            passfile.flush()
            proc = subprocess.Popen(['gpg', '--decrypt',
                                     '--passphrase-file',
                                     passfile.name]
                                    + self.gnupgargs,
                                    stdin = subprocess.PIPE,
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.PIPE)
            decrypted_plaintext, err = proc.communicate(input
                                                        = data)
        if proc.returncode != 0:
            raise PGPError(err)
        return decrypted_plaintext

/Teddy Hogeborn

The Mandos Project
