Wednesday, November 11, 2015

Using Let's Encrypt to Secure an Elastic Beanstalk Website

Since I've been pushing the library and academic publishing community to implement HTTPS on all their information services, I was really curious to see how the new Let's Encrypt (LE) certificate authority actually works, now that its "general availability" date is imminent. My conclusion is that "general availability" will not mean "general usability" right away; its huge impact will take six months to a year to arrive. For now, it's really important for the community to put our developers to work on integrating Let's Encrypt into our digital infrastructure.

I decided to secure the GITenberg website as my test example. It's still being developed, and it's not quite ready for use, so if I screwed up it would be no disaster. The site is hosted using Elastic Beanstalk (EB) on Amazon Web Services (AWS), which is a popular and modern way to build scalable web services. The servers that Elastic Beanstalk spins up have to be completely configured in advance; you can't just log in and write some files. And EB does its best to keep servers serving. It's no small matter to shut down a server and run some temporary server in its place, because EB will spin up another server to handle the rerouted traffic. These characteristics of Elastic Beanstalk exposed some of the present shortcomings and future strengths of the Let's Encrypt project.

Here's the mission statement of the project:
Let’s Encrypt is a free, automated, and open certificate authority (CA), run for the public’s benefit.
While most of us focus on the word "free", the more significant word here is "automated":
Automatic: Software running on a web server can interact with Let’s Encrypt to painlessly obtain a certificate, securely configure it for use, and automatically take care of renewal.
Note that the objective is not to make it painless for website administrators to obtain a certificate, but to enable software to get certificates. If the former is what you want, in the near term, then I strongly recommend that you spend some money with one of the established certificate authorities. You'll get a certificate that isn't limited to 90 days, as the LE certificates are, you can get a wildcard certificate, and you'll be following the manual procedure that your existing web server software expects you to be following.

The real payoff for Let's Encrypt will come when your web server applications start expecting you to use the LE methods of obtaining security certificates. Then, the chore of maintaining certificates for secure web servers will disappear, and things will just work. That's an outcome worth waiting for, and worth working towards today.

So here's how I got Let's Encrypt working with Elastic Beanstalk for the GITenberg site.

The key thing to understand here is that before Let's Encrypt can issue me a certificate, I have to prove to them that I really control the hostname that I'm requesting a certificate for. So the Let's Encrypt client has to be given access to a "privileged" port on the host machine designated by DNS for that hostname. Typically, that means I have to have root access to the server in question.

In the future, Amazon should integrate a Let's Encrypt client with their Beanstalk Apache server software so all this is automatic, but for now we have to use the Let's Encrypt "manual mode". In manual mode, the Let's Encrypt client generates a cryptographic "challenge/response", which then needs to be served from the root directory of the web server.

Even running Let's Encrypt in manual mode required some jumping through hoops. It won't run on Mac OS X. It doesn't yet support the flavor of Linux used by Elastic Beanstalk, so it does no good configuring Elastic Beanstalk to install it there. Instead, I used the Let's Encrypt Docker container, which works nicely; I ran it in a Docker machine inside VirtualBox on my Mac.
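(For anyone who hasn't used Docker Machine before, the Mac-side setup looks roughly like this. This is a sketch, not a transcript of exactly what I ran: the machine name "letsencrypt-vm" is made up, and it assumes VirtualBox and the Docker tools are already installed.)

# create a VirtualBox-backed Docker host on the Mac
docker-machine create --driver virtualbox letsencrypt-vm
# point the local docker client at that machine
eval "$(docker-machine env letsencrypt-vm)"
# any "docker run" commands now execute inside the VirtualBox VM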

Having configured Docker, I ran
docker run -it --rm -p 443:443 -p 80:80 --name letsencrypt \
-v "/etc/letsencrypt:/etc/letsencrypt" \
-v "/var/lib/letsencrypt:/var/lib/letsencrypt" \
<letsencrypt-image> -a manual -d <hostname> \
--server <letsencrypt-server-url> auth
(the --server option requires your domain to be whitelisted during the beta period.) After paging through some screens asking for my email address and permission to log my IP address, the client responded with
Make sure your web server displays the following content at http://<hostname>/.well-known/acme-challenge/8wBDbWQIvFi2bmbBScuxg4aZcVbH9e3uNrkC4CutqVQ before continuing:
To do this, I configured a virtual directory "/.well-known/acme-challenge/" in the Elastic Beanstalk console, mapped to a "letsencrypt/" directory in my application. I then made a file named "8wBDbWQIvFi2bmbBScuxg4aZcVbH9e3uNrkC4CutqVQ" with the specified content in my letsencrypt directory, committed the change with git, and deployed the application with the Elastic Beanstalk command line interface. After waiting for the deployment to succeed, I checked that the challenge URL responded correctly, and then hit <enter>. (Though the LE client tells you that the MIME type "text/plain" MUST be sent, Elastic Beanstalk sets no MIME header, which is allowed.)
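(For the record, the deploy-and-check step looked roughly like the following, run from the application's working directory. This is a sketch: it assumes the EB command line tool ("eb") is already set up for the application, and the file content is whatever the LE client told you to serve.)

mkdir -p letsencrypt
# put the exact content specified by the LE client into the challenge file
echo "<content-specified-by-the-client>" > letsencrypt/8wBDbWQIvFi2bmbBScuxg4aZcVbH9e3uNrkC4CutqVQ
git add letsencrypt
git commit -m "add Let's Encrypt challenge response"
eb deploy
# confirm the challenge is reachable before hitting <enter> in the LE client
curl http://<hostname>/.well-known/acme-challenge/8wBDbWQIvFi2bmbBScuxg4aZcVbH9e3uNrkC4CutqVQ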

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at
   /etc/letsencrypt/live/<hostname>/. Your cert will expire on 2016-02-08.
   To obtain a new version of the certificate in the future, simply run
   Let's Encrypt again.
...except since I was running Docker inside VirtualBox on my Mac, I had to log into the Docker machine and copy three files out of that directory (cert.pem, privkey.pem, and chain.pem). I put them in my local <.elasticbeanstalk> directory. (See this note for a better way to do this.)
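(One way to do that copy, sketched here: run "cat" on the Docker machine over ssh and redirect the output to local files. The machine name "letsencrypt-vm" and the <hostname> directory are placeholders for my setup; sudo is used because the certificate files are owned by root.)

docker-machine ssh letsencrypt-vm sudo cat /etc/letsencrypt/live/<hostname>/cert.pem > .elasticbeanstalk/cert.pem
docker-machine ssh letsencrypt-vm sudo cat /etc/letsencrypt/live/<hostname>/privkey.pem > .elasticbeanstalk/privkey.pem
docker-machine ssh letsencrypt-vm sudo cat /etc/letsencrypt/live/<hostname>/chain.pem > .elasticbeanstalk/chain.pem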

The final step was to turn on HTTPS in Elastic Beanstalk. But before doing that, I had to upload the three files to AWS Identity and Access Management (IAM). To do this, I needed to use the AWS command line interface, configured with admin privileges. The command was
aws iam upload-server-certificate \
--server-certificate-name gitenberg-le \
--certificate-body file://<.elasticbeanstalk>/cert.pem \
--private-key file://<.elasticbeanstalk>/privkey.pem \
--certificate-chain file://<.elasticbeanstalk>/chain.pem
One more trip to the Elastic Beanstalk configuration console (network/load balancer section), and the site was on HTTPS.
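(If you'd rather script that last step than click through the console, something like this should work. It's a sketch, not exactly what I did: it assumes a classic load balancer, an environment name of <my-environment>, and the certificate ARN that "aws iam upload-server-certificate" printed back.)

aws elasticbeanstalk update-environment --environment-name <my-environment> \
--option-settings \
Namespace=aws:elb:listener:443,OptionName=ListenerProtocol,Value=HTTPS \
Namespace=aws:elb:listener:443,OptionName=InstancePort,Value=80 \
Namespace=aws:elb:listener:443,OptionName=SSLCertificateId,Value=<certificate-arn>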

Given that my sys-admin skills are rudimentary, the fact that I was able to get Let's Encrypt to work suggests that they've done a pretty good job of making the whole process simple. However, the documentation I needed was non-existent, apparently because the LE developers want to discourage the use of manual mode. Figuring things out required a lot of error-message googling. I hope this post makes it easier for people to get involved, improve that documentation, or build support for Let's Encrypt into more server platforms.

(Also, given that my sys-admin skills are rudimentary, there are probably better ways to do what I did, so beware.)

If you use web server software developed by others, NOW is the time to register a feature request. If you are contracting for software or services that include web services, NOW is the time to add a Let's Encrypt requirement into your specifications and contracts. Let's Encrypt is ready for developers today, even if it's not quite ready for rank and file IT administrators.

Update (11/12/2015):
I was alerted to the fact that while one of the site's hostnames was working, the other was failing authentication. So I went back and did it again, this time specifying both hostnames. I had to guess at the correct syntax. I also tested out the suggestion from the support forum to get the certificates saved in my Mac's filesystem. (It's worth noting here that the community support forum is an essential and excellent resource for implementers.)

To get the multi-host certificate generated, I used the command:
docker run -it --rm -p 443:443 -p 80:80 --name letsencrypt \
-v "/Users/<my-mac-login>/letsencrypt/etc/letsencrypt:/etc/letsencrypt" \
-v "/Users/<my-mac-login>/letsencrypt/var/lib/letsencrypt:/var/lib/letsencrypt" \
-v "/Users/<my-mac-login>/letsencrypt/var/log/letsencrypt:/var/log/letsencrypt" \
<letsencrypt-image> -a manual \
-d <first-hostname> -d <second-hostname> \
--server <letsencrypt-server-url> auth
This time, I had to go through the challenge/response procedure twice, once for each hostname.

With the certs saved to my filesystem, the upload to AWS was easier:
aws iam upload-server-certificate \
--server-certificate-name gitenberg-both \
--certificate-body file:///Users/<my-mac-login>/letsencrypt/etc/letsencrypt/live/<hostname>/cert.pem \
--private-key file:///Users/<my-mac-login>/letsencrypt/etc/letsencrypt/live/<hostname>/privkey.pem \
--certificate-chain file:///Users/<my-mac-login>/letsencrypt/etc/letsencrypt/live/<hostname>/chain.pem
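(It's worth sanity-checking that the new certificate really covers both hostnames; one way is to list its subject alternative names with openssl, using the same live directory as in the upload command above.)

openssl x509 -noout -text \
-in /Users/<my-mac-login>/letsencrypt/etc/letsencrypt/live/<hostname>/cert.pem | grep -A 1 "Subject Alternative Name"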

Thursday, October 22, 2015

This is NOT a Portrait of Mary Astell

Not Mary Astell, by Sir Joshua Reynolds
Ten years ago, the University of Calgary Press published a very fine book by Christine Mason Sutherland called The Eloquence of Mary Astell, which focused on the proto-feminist's contributions as a rhetorician. The cover for the book featured a compelling image using a painted sketch from 1760-1765 by the master English portraitist Sir Joshua Reynolds, currently in Vienna's Kunsthistorisches Museum and known as Bildnisstudie einer jungen Dame (Study for the portrait of a young woman).

Cover images from books circulate widely on the internet. They are featured in online bookstores, they get picked up by search engines. Inevitably, they get re-used and separated from their context. Today (2015) "teh Internetz" firmly believe that the cover image is a portrait of Mary Astell.

For example:

If you look carefully, you'll see that the image most frequently used is the book cover with the title inexpertly removed.

But the painting doesn't depict Mary Astell. It was done 30 years after her death. In her book, Sutherland notes (page xii):
No portrait of her remains, but such evidence as we have suggests that she was not particularly attractive. Lady Mary Wortley Montagu’s granddaughter records her as having been “in outward form [...] rather ill-favoured and forbidding,” though Astell was long past her youth when this observation was made

Wikipedia has successfully resisted the misattribution.

A contributing factor to the confusion about Mary Astell's image is the book's failure to attribute the cover art. Typically, a cover description is included in the front matter of the book. According to the Director of the University of Calgary Press, Brian Scrivener, proper attribution would certainly be done in a book produced today. Publishers now recognize that metadata is increasingly the cement that makes books part of the digital environment. Small presses often struggle to bring their backlists up to date, and publishers both large and small have "metadata debt" from past oversights, mergers, reorganizations and lack of resources.

Managing cover art and permissions for included graphics is often an expensive headache for digital books, particularly for Open Access works. I've previously written about the importance of clear licensing statements and front matter in ebooks. It's unfortunate when public domain art is not recognized as such, as in Eloquence, but nobody's perfect.

The good news is that University of Calgary Press has embraced Open Access ebooks in a big way. The Eloquence of Mary Astell and 64 other books are already available, making Calgary one of the world's leading publishers of Open Access ebooks. Twelve more are in the works.

You can find Eloquence at the University of Calgary Press website (including the print edition), Unglue.it, DOAB, and the Internet Archive. Mary Astell's 1706 pamphlet Reflections Upon Marriage can be found at the Internet Archive and at the University of Pennsylvania's Celebration of Women Writers.

And maybe in 2025, teh internetz will know all about Sir Joshua Reynolds's famous painting, Not Mary Astell. Happy Open Access Week!

Saturday, September 26, 2015

Weaponization of Library Resources

This post needs a trigger warning. You probably think the title indicates that I've gone off the deep end, or that this is one of my satirical posts. But read on, and I think you'll agree with me, we need to make sure that library resources are not turned into weapons. I'll admit that sounds ludicrous, but it won't after you learn about "The Great Cannon" and "QUANTUM".

But first, some background. Most of China's internet connects to the rest of the world through what's known in the rest of the world as "the Great Firewall of China". Similar to network firewalls used for most corporate intranets, the Great Firewall is used as a tool to control and monitor internet communications in and out of China. Websites that are deemed politically sensitive are blocked from view inside China. This blocking has been used against obscure and prominent websites alike. The New York Times, Google, Facebook and Twitter have all been blocked by the firewall.

When web content is unencrypted, it can be scanned at the firewall for politically sensitive terms such as "June 4th", a reference to the Tiananmen Square protests, and blocked at the webpage level. China is certainly not the only entity that does this; many school systems in the US do the same sort of thing to filter content that's considered inappropriate for children. Part of my motivation for working on the "Library Digital Privacy Pledge" is that I don't think libraries and publishers who provide online content to them should be complicit in government censorship of any kind.

Last March, however, China's Great Firewall was associated with an offensive attack. To put it more accurately, software co-located with China's Great Firewall turned innocent users of unencrypted websites into attack weapons. The targets of the attack were GreatFire.org, a website that works to provide Chinese netizens a way to evade the surveillance of the Great Firewall, and GitHub, the website that hosts code for hundreds of thousands of programmers, including those supporting GreatFire.org.

Here's how the Great Cannon operated. In August, Bill Marczak and co-workers from Berkeley, Princeton, and Citizen Lab presented their findings on the Great Cannon at the 5th USENIX Workshop on Free and Open Communications on the Internet.
The Great Cannon acted as a "man-in-the-middle"[*] to intercept the communications of users outside China with servers inside China. Javascripts that collected advertising and usage data for Baidu, the "Chinese Google", were replaced with weaponized javascripts. These javascripts, running in the browsers of internet users outside China, then mounted the denial-of-service attack on GreatFire.org and GitHub.
China was not the first to weaponize unencrypted internet traffic. Marczak et al. write:
Our findings in China add another documented case to at least two other known instances of governments tampering with unencrypted Internet traffic to control information or launch attacks—the other two being the use of QUANTUM by the US NSA and UK’s GCHQ.[reference] In addition, product literature from two companies, FinFisher and Hacking Team, indicate that they sell similar “attack from the Internet” tools to governments around the world [reference]. These latest findings emphasize the urgency of replacing legacy web protocols like HTTP with their cryptographically strong counterparts, such as HTTPS.
It's worth thinking about how libraries and the resources they offer might be exploited by a man-in-the-middle attacker. Science journals might be extremely useful in targeting espionage scripts at military facilities, for example. A saboteur might alter reference technical information used by a chemical or pharmaceutical company with potentially disastrous consequences. It's easy to see why any publisher that wants its information to be perceived as reliable has no choice but to start encrypting its services now.

The unencrypted services of public libraries are attractive targets for other sorts of mischief, ironically because of their users' trust in them and because they have a reputation for protecting privacy. Think about how many users would enter their names, phone numbers, and last four digits of their social security numbers if a library website seemed to ask for it. When a website is unencrypted, it's possible for "man-in-the-middle" attacks to insert content into an unencrypted web page coming from a library or other trusted website. An easy way for an attacker to get into position to execute such an attack is to spoof a wifi network, for example in a cafe or other public space, such as a library. It doesn't help if only a website's login is encrypted if an attacker can easily insert content into the unencrypted parts of the website.

To be clear, we don't know that libraries and the type of digital resources they offer are being targeted for weaponization, espionage or other sorts of mischief. Unfortunately, the internet offers a target-rich environment of unencrypted websites.

I believe that libraries and their suppliers need to move swiftly to take the possibility off the table and help lead the way to a more secure digital environment for us all.

Tuesday, September 8, 2015

Hey, Google! Move Blogspot to HTTPS now!

Since I've been supporting a Library Privacy Pledge to implement HTTPS, I've made an inventory of the services I use myself, to make sure that they will all be using HTTPS by the end of 2016. The main outlier: THIS BLOG!

This is odd, because Google, the owner of Blogger and Blogspot, has made noise about moving its services to HTTPS and marking HTTP pages as non-secure, and is even giving extra search engine weight to webpages that use HTTPS.

I'd like to nudge Google, now that it's remade its logo and everything, to get its act together on providing secure service for Blogger. So I set the "description" of my blog to "Move Blogspot to HTTPS NOW." If you have a blog on Blogspot, you can do the same. Go to your control panel and click Settings; "Description" is the second setting at the top. Depending on the design of your page, it will look like this:

So Google, if you want to avoid a devastating loss of traffic when I move Go-To-Hellman to another platform on January 1, 2017, you better get cracking. Consider yourself warned.

Update 10/26/2015. The merciless pressure from the Go-To-Hellman blog worked. Blogger now supports HTTPS.

Sunday, August 30, 2015

Update on the Library Privacy Pledge

The Library Privacy Pledge of 2015, which I wrote about previously, has been finalized. We got a lot of good feedback, and the big changes have focused on the schedule.

Now, any library, organization, or company that signs the pledge will have six months from the effective date of its signature to implement HTTPS. This should give everyone plenty of margin to do a good job on the implementation.

We pushed back our launch date from the first week of November to the first week of December. That's when we'll announce the list of "charter signatories". If you want your library, company or organization to be included in the charter signatory list, please send us an e-mail.

The Let's Encrypt project will be launching soon. They are just one certificate authority that can help with HTTPS implementation.

I think this is a very important step for the library information community to take, together. Let's make it happen.

Here's the finalized pledge:

The Library Freedom Project is inviting the library community - libraries, vendors that serve libraries, and membership organizations - to sign the "Library Digital Privacy Pledge of 2015". For this first pledge, we're focusing on the use of HTTPS to deliver library services and the information resources offered by libraries. It’s just a first step: HTTPS is a privacy prerequisite, not a privacy solution. Building a culture of library digital privacy will not end with this 2015 pledge, but committing to this first modest step together will begin a process that won't turn back.  We aim to gather momentum and raise awareness with this pledge; and will develop similar pledges in the future as appropriate to advance digital privacy practices for library patrons.

We focus on HTTPS as a first step because of its timeliness. The Let's Encrypt initiative of the Electronic Frontier Foundation will soon launch a new certificate infrastructure that will remove much of the cost and technical difficulty involved in the implementation of HTTPS, with general availability scheduled for September. Due to a heightened concern about digital surveillance, many prominent internet companies, such as Google, Twitter, and Facebook, have moved their services exclusively to HTTPS rather than relying on unencrypted HTTP connections. The White House has issued a directive that all government websites must move their services to HTTPS by the end of 2016. We believe that libraries must also make this change, lest they be viewed as technology and privacy laggards, and dishonor their proud history of protecting reader privacy.

The 3rd article of the American Library Association Code of Ethics sets a broad objective:

We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.
It's not always clear how to interpret this broad mandate, especially when everything is done on the internet. However, one principle of implementation should be clear and uncontroversial:
Library services and resources should be delivered, whenever practical, over channels that are immune to eavesdropping.

The current best practice dictated by this principle is as follows:
Libraries and vendors that serve libraries and library patrons should require HTTPS for all services and resources delivered via the web.

The Pledge for Libraries:

1. We will make every effort to ensure that web services and information resources under direct control of our library will use HTTPS within six months. [ dated______ ]

2. Starting in 2016, our library will assure that any new or renewed contracts for web services or information resources will require support for HTTPS by the end of 2016.

The Pledge for Service Providers (Publishers and Vendors):

1. We will make every effort to ensure that all web services that we (the signatories) offer to libraries will enable HTTPS within six months. [ dated______ ]

2. All web services that we (the signatories) offer to libraries will default to HTTPS by the end of 2016.

The Pledge for Membership Organizations:

1. We will make every effort to ensure that all web services that our organization directly controls will use HTTPS within six months. [ dated______ ]

2. We encourage our members to support and sign the appropriate version of the pledge.

There's a FAQ available, too. The pledge is now posted on the Library Freedom Project website. (updated 9/14/2015)