Tuesday, April 1, 2014

Library Group to Acquire Readmill Assets

TechCrunch is reporting that a group of librarians known as the "ALA ThinkTank" has acquired the assets of shuttered startup Readmill. The new owners will turn the website and apps into a "library books in the cloud" site for library patrons.

Over the weekend, Readmill announced that it had been "acqui-hired" by cloud storage startup Dropbox, and that its app and website would cease functioning on July 1. "Many challenges in the world of ebooks remain unsolved, and we failed to create a sustainable platform for reading" said Readmill founder Henrik Berggren, in his farewell message to site members. "Failure to have a sustainable platform for reading really resonates with librarians" responded ThinkTank co-founder J.P. Porcaro. "It's a match made in heaven - devoted users, quixotic economics, and lots of books to distract the staff." Porcaro will serve as CEO of the new incarnation of Readmill.

New Readmill CEO J. P. Porcaro
The acquisition also solves a problem Porcaro had been wrestling with- how to spend the group's Bitcoin millions. Far from its present incarnation as a Social Enterprise/Facebook Group hybrid, ALA ThinkTank originated as a solution for housing destitute librarians from New Jersey during the bi-annual conventions of the American Library Association. The group figured that by renting a house instead of renting hotel rooms, they could save money, learn from peers and throw great parties. The accompanying off-the-grid commerce in "assets" was never intended- it just sort of happened.

One of the librarians was friends with Penn State grad student Ross Ulbricht, who convinced the group to use Bitcoin for the purchase and sale of beer, pizza and "ebooks". "He kept talking about  piracy and medieval trade routes" reported Porcaro, "We thought he was normal ... though in retrospect it was kinda weird when he asked about using hitmen to collect overdue book fines."

The 10,000-fold increase in the value of ThinkTank's Bitcoin account over the past four years caught almost everyone completely off guard. The parties, which in past years were low-rent, jeans-and-cardigan affairs, have morphed into multi-story "party hearty" extravaganzas packed with hipster librarians body-pierced with bitcoin-encrusted baubles and wearing precious-metal badge ribbons.

Porcaro expects that Readmill's usage will skyrocket with the new management. He thinks that ALA ThinkTank's heady mix of critical pedagogy, "weeding" advice, gaming makerspaces, drink-porn, management theory, gender angst and a whiff of scandal is sure to "make it happen" for the moribund social reading site, which has suffered from the general boringness of books.

ThinkTank members are already hard at work planning the transition. A 13-step procedure that will allow Readmill users to keep their books exactly as they are has been spec-ed out by one library vendor. "If you like your ebooks you can keep them" Porcaro assured me. "If you don't like them, we can send them to India for you. Or Lafourche, Louisiana, your choice."

The backlash against the new Readmill has already begun. "Library books in the cloud is the dumbest thing I've ever heard of. How will people know which bits are theirs, and which need to be returned? How will we do inter-library loan? What will happen if it rains?" complained one senior library director who declined to be identified. "How will we get our books returned then?" she asked. "I don't even know HOW to hire a hitman."

In a press release, Scott Turow, past president of the Authors Guild, expressed his horror at the idea of "library books in the cloud." "Once again, librarians are scheming to take food out of the mouths of authors emaciated by hunger. These poor authors are dying miserable deaths, knowing that their copyrighted works are being misused and unread in this way. Library books in a cloud of nerve gas, more like!"

The American Library Association, which is completely unaffiliated with ALA ThinkTank, has formed a committee to study the cloud library ebook phenomenon.


Thursday, March 27, 2014

The Asterisk behind NYPL and Bookish

Joe Regal grew up in a family that moved around. Granger, Indiana.  Lewiston, New York. Towanda, Pennsylvania. In every town there was a library, which young Joe would seek out as a haven of virtual stability. Regal remembers that in Fairfield, Connecticut, he picked up Breakfast of Champions, because the cover looked like a cereal box. He opened it and was thrilled to discover, right there on page 5, the "drawing" of an anus/asterisk. And the text of Kurt Vonnegut's novel was even more subversive than the drawing.
"Vonnegut was one of those writers who made me feel less alone.  He also made me understand that it was OK to break the rules, because often the rules were insane.  That message - captured even in the asterisk/anus drawing, though of course more deeply, richly, and powerfully in the actual writing! - meant so much to me at 13, it's hard to convey or even fully remember the totality of it.  The freedom, the sense that you could explore without fear of punishment or retribution - that's a lot of what the library meant to me as a kid.  It's easy for us to forget as adults that a book can literally save your life. Or even on a more prosaic level, if there was literally no cost to taking out a book, I could take out anything without worrying whether it was right for me. I could browse, read a bit, take it out, get bored, return it."
As an adult, Joe Regal translated his passion for books into a successful career as a literary agent. He believed so deeply in Audrey Niffenegger's The Time Traveler's Wife that he ignored countless rejections until he found a publisher for it. ("I do not publish science-fiction." was the complete text of one rejection.)

As an agent, Regal could see first-hand what ebooks and Amazon were doing to the ability of authors, publishers and bookstores to sustain their livelihoods. He thought about what a seller of ebooks could and should be. There should be space for curation and community. Authors should be able to connect with readers. As he talked with others about his ideas, the concept of a new kind of website for ebooks began to take shape. (I got to know Regal and his family around this time.)

A few years later, Zola Books is a reality. Initially funded by friends of Regal (including Niffenegger), Zola has recently closed a $5.1 million seed round. The round includes a variety of authors and prominent individual investors led by Charles Dolan, founder of Cablevision and HBO. Even considering the funding, Zola's ambition is breathtaking. They've built a commerce platform like BN.com, a social platform like GoodReads, an HTML epub reader with proprietary DRM (not yet launched), and partner curation tools that are (stretching a bit) sort of like a TripAdvisor for books. Not to mention a solid catalog of ebooks.

A recommendation engine has been a big space on the Zola development roadmap from the beginning. It's not easy technology, so when the recommendation engine built by Bookish became available (along with the Bookish website) at a fraction of its development cost, Zola, newly funded and in a hurry, snapped it up at a bargain-basement price.

The Bookish recommendation engine uses "fingerprints" of books in its algorithm. In other words, it works more like Pandora than like Netflix. The fingerprints are not just metadata and are not just text analysis, but use elements of both along with human-powered analysis.
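
The post gives no detail on how Bookish actually builds or compares its fingerprints, so what follows is only a rough Python sketch of the general Pandora-style idea: each book is described by a vector of content attributes, and the recommendations are the books whose vectors sit closest to the one you started from. The attribute names, the scores, and the choice of cosine similarity are all assumptions invented for illustration. A Netflix-style system would instead compare who read or rated what.

```python
import math

# Hypothetical fingerprints. The attribute names and scores are invented
# for illustration; they are not Bookish data.
FINGERPRINTS = {
    "Breakfast of Champions": {"satire": 0.9, "science_fiction": 0.4, "metafiction": 0.8},
    "Long Dark Tea-Time of the Soul": {"satire": 0.8, "science_fiction": 0.6, "fantasy": 0.5},
    "A Hypothetical Regency Romance": {"romance": 0.9, "historical": 0.7},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse attribute vectors (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, fingerprints, max_items=15):
    """Return other titles ranked by fingerprint similarity, best first."""
    target = fingerprints[title]
    scored = [
        (cosine_similarity(target, fingerprint), other)
        for other, fingerprint in fingerprints.items()
        if other != title
    ]
    scored.sort(reverse=True)
    # Drop books that share no attributes at all with the target.
    return [other for score, other in scored[:max_items] if score > 0]


if __name__ == "__main__":
    print(recommend("Breakfast of Champions", FINGERPRINTS))
```

Note that nothing in this sketch depends on sales or borrowing history, which is the property NYPL is described below as finding attractive.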

Recommendations for Breakfast of Champions
On Monday, New York Public Library announced that it had integrated the Bookish-powered recommendation engine into its BiblioCommons-powered web catalog, fulfilling Regal's dream of being able to give back to the libraries he loved growing up, opening up unexpected books like Breakfast of Champions to new generations of readers. The recommendations are live on the NYPL website, so you can decide for yourself if the recommendations are good or not. I found them to be intriguing, at least.

Apparently NYPL has been looking to add a recommendation feature to its website for a few years. They tracked potential partners along with Bookish to determine the best option, and had the benefit of seeing some advance demos before "Bookish Recommends" launched online. NYPL was impressed by Bookish's "big data back-end" and that it was not driven by sales; the number of titles it covered at the outset was impressive. NYPL will be assessing performance over the first year to ensure that the recommendations are valuable to readers.

According to Patrick Kennedy, Co-founder and President at BiblioCommons,
"The background to this story is the interest a number of libraries have shared with us in broadening their role as a source of book recommendations in their communities.  The initiative will allow for better visibility and sharing of librarian recommendations and reviews, the integration of other third-party recommendations databases such as LibraryThing and NoveList.  Our goal is provide a neutral platform that allows libraries to integrate the sources of their choice.  In all cases the integration API is made available by the third parties to BiblioCommons with the understanding that any library on the BiblioCommons platform may license the content."
Zola is hoping to make the Bookish API widely available to libraries and is considering a variety of licensing models. As Kennedy points out, there are recommendation services already available to libraries. The LibraryThing service (marketed by Bowker) is based on activities in the LibraryThing social network and is incredibly deep; the NoveList service from EBSCO takes a more traditional reader's advisory approach. The Bookish recommendation engine may not be based on sales the way Amazon's is, but if it doesn't help Zola sell ebooks, it will die. Can the mission of a library be advanced by using a tool whose ultimate purpose is to sell books? Or does it depend on the sort of bookseller behind the tool?

This conflict is probably why booksellers and libraries haven't been sharing as much book information infrastructure as you might expect. A library has different goals for a recommendation system than does a bookseller. Libraries need to steer users toward books in their collection that are less used, while booksellers need to present the user with books that the customer is most likely to buy. Which might ALWAYS be 50 Shades or Hunger Games.

But bookselling and libraries are both changing rapidly. With the big-box bookstore dying before their eyes, publishers are scrambling to find ways to continue putting books in front of readers. One possibility is that libraries will respond to this need and evolve a closer connection to commerce, and that booksellers will figure out how to tighten their connections to communities and their libraries. The alternative is that libraries and ebookstores grow apart to serve very different populations and needs – Amazon Prime and library subprime, if you will.

My guess is that libraries sharing infrastructure with booksellers will become the norm rather than the exception it is now. Monday's announcement by NYPL and Zola is more than just a website usability widget; it's about a vision of what libraries and booksellers can become. Zola has sent a love letter to the library world.

Notes

  1. Bookish.com started out as a joint venture of Penguin, Hachette, and Simon & Schuster. Bookish spent a vast amount of money developing the site.
  2. Competition between LibraryThing and Bookish might well lead to some changes. Bookish uses some content from LibraryThing, such as reviews, on its website. When Bookish launched, LibraryThing founder Tim Spalding wrote:
    "Besides reviews, Bookish has access to some other LibraryThing data, including edition disambiguation and recommendations. A glance at their recommendations, however, will show you that they're not using them "cold," but as some sort of factor."
  3. I wrote about BiblioCommons when they came out of stealth a few years ago. They've won the business of some very high-profile public libraries, NYPL and Seattle Public Library included. They have the big benefit of starting from scratch with current web technology, and as a result have been innovating quickly.
  4. I took a look at how the integration was done. The Bookish API is straightforward REST and JSON with access keys. ISBN-based queries such as

    http://api.bookish.com/recapi/api/v1/recommendations?maxItems=15&token=<token>&apiKey=<key>&isbn13s=9780670024902

    return JSON like:
    [{
      "basic": {
        "isbn13": "9780671742515",
        "bookUrl": "http://www.bookish.com/books/long-dark-tea-time-of-the-soul-douglas-adams-9780671742515/<token>",
        "imageUrl": "http://images.bookish.com/covers/m/9780671742515.jpg",
        "title": "Long Dark Tea-Time of the Soul",
        "subtitle": "",
        "authors": ["Douglas Adams"]
      }
    }]


    The library-side integration done by BiblioCommons is Ajax-style and JavaScript-based: a script in the browser calls the API, pulls out the ISBNs and sends them back to BiblioCommons, which checks for the recommended ISBNs in the catalog. A list of holdings is sent back to the browser for rendering. It looks like BiblioCommons itself does not call the Bookish API, which could lend itself to easier integration with other recommender APIs.
  5. Another interesting recommender system in the library world is bX from ExLibris. It's a usage-based system focused on article links, rather than books. Currently, bX will return book recommendations based on articles, but doesn't provide recommendations based on books.
  6. Don't confuse Bookish.com, the company acquired by Zola Books, with booki.sh, the company acquired by Overdrive.
  7. Not that we haven't had this problem at unglue.it, but why does NYPL list Robert Egan as the author of the ebook version of Breakfast of Champions? (Update: Answer from Amy Geduldig at NYPL- "The catalog entry here refers to the play Breakfast of Champions by Robert Egan, which is based on the novel by Vonnegut, but in and of itself is a different work, which is why Egan is listed as the author. ")
  8. All the book links in this post point at the NYPL BiblioCommons catalog so you can try out Bookish Recommends for yourself.
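
The flow described in note 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BiblioCommons code: the real integration runs as JavaScript in the browser, the function names here are hypothetical, and the catalog is a made-up dictionary standing in for the library's holdings lookup. Only the endpoint, the query parameter names, and the response shape come from the note itself.

```python
import json
from urllib.parse import urlencode

# Endpoint and parameter names are taken from note 4; the token and key
# values passed in are placeholders, not working credentials.
API_BASE = "http://api.bookish.com/recapi/api/v1/recommendations"


def build_query(isbn13, token, api_key, max_items=15):
    """Build an ISBN-based recommendation query URL."""
    params = {
        "maxItems": max_items,
        "token": token,
        "apiKey": api_key,
        "isbn13s": isbn13,
    }
    return API_BASE + "?" + urlencode(params)


def recommended_isbns(response_text):
    """Pull the recommended ISBNs out of a JSON response body."""
    records = json.loads(response_text)
    return [
        record["basic"]["isbn13"]
        for record in records
        if "isbn13" in record.get("basic", {})
    ]


def holdings_for(isbns, catalog):
    """Keep only recommendations the library holds (hypothetical lookup)."""
    return [catalog[isbn] for isbn in isbns if isbn in catalog]


# Response trimmed from the example in note 4.
SAMPLE_RESPONSE = """[{
  "basic": {
    "isbn13": "9780671742515",
    "title": "Long Dark Tea-Time of the Soul",
    "subtitle": "",
    "authors": ["Douglas Adams"]
  }
}]"""

# Made-up stand-in for the catalog check done on the BiblioCommons side.
HYPOTHETICAL_CATALOG = {"9780671742515": "holding record (made up)"}

if __name__ == "__main__":
    print(build_query("9780670024902", "TOKEN", "KEY"))
    isbns = recommended_isbns(SAMPLE_RESPONSE)
    print(holdings_for(isbns, HYPOTHETICAL_CATALOG))
```

Because the catalog check only ever sees a list of ISBNs, any other recommender that can return ISBNs could be swapped in without touching that step, which is the point made at the end of note 4.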

Saturday, March 22, 2014

eBook ILL is silly. The reason why will bore you.

When we try to think about digital things as if they are still the real things they used to be, we can lose touch with the parts of reality that are important. It's silly.

If you're not of the library world, let me explain what ebook ILL is and why it's not silly per se. ILL stands for Inter-Library Loan. In the print world, libraries have finite collections and they depend on other libraries to make sure that even if they don't have a book that a user needs, another library will step in and fill the gap. For the user, it means that their small library can provide them with books from a huge virtual collection. A book might take a few days to arrive.

There are significant costs involved in ILL. Most libraries charge the borrowing library a fee to cover the expenses of packaging the book and sending it to the recipient library. The fee might be 10 or 20 dollars, and it might be waived for closely cooperating libraries. At the same time, libraries pay the same fees to other libraries, so in the end, it all evens out. But many libraries run a significant surplus, rewarding them for smart acquisition policies of the past.

Library lending cooperatives have figured out that the combination of Amazon and modern warehouse logistics has partly upset the economics of ILL; a library can often purchase a used copy of a needed book on Amazon for less than ILL transaction costs (especially with Amazon Prime!). But ILL is still an important part of the library ecosystem.

For digital content, the buy vs. borrow equation shifts back a bit. In principle, there's no shipping cost and modern databases can retrieve a digital item in milliseconds. But if a library can do digital ILL, what is to prevent libraries from sharing a resource so widely that only one library in the world needs to buy the item?

The solution that e-journal publishers typically use is the "print-and-ship" solution. In other words, a library is allowed to send articles from a subscribed journal only if they print it out first. The transaction is thus identical to what it was back in the dark ages of ink and paper and xerox machines. For publishers, the friction of print-and-ship discourages libraries from canceling subscriptions; besides, the big-deal model of bundling many subscriptions into one has been much more advantageous for publishers than the document-delivery model that ILL competes with. (Also, when they first went digital, journal publishers were poorly equipped to do article-by-article e-commerce.)

Printing article PDFs and mailing them is a stretch, but mapping this model into ebooks is a farther stretch. The book ecosystem has never included libraries creating copies of in-print printed books. And why should library A ever acquire a book if the copy owned by library B works just as well? Since most ebooks never really go "out-of-print", the inter-library loan system will be competing directly with publisher sales.

To see why it still makes sense for publishers to allow ebook ILL, consider what it is competing against: "patron-driven acquisition" (PDA). The core idea behind PDA is that a library doesn't buy an ebook until a patron shows up that wants to use it. For many books that libraries buy, this means that they don't buy the book at all, and for the rest, there might be no purchase until many years after publication.

It's often better for the publisher to encourage "just-in-case" acquisition, because the resulting revenue can be put to work immediately to publish more books. For books with low demand, inter-library loan encourages just-in-case acquisition by increasing the likelihood that somewhere, sometime, someone will need the library's copy of even the most obscure book.

eBook licensing with ILL has economic characteristics very similar to those of licensing to a library with many users. The larger the library, the more demand can be aggregated, and thus books can remain economically viable even at very low levels of user demand. In the limit of large user bases, ILL looks very much like the Open Access collective funding model demonstrated by Knowledge Unlatched.

"Just-in-case" acquisition has benefits for libraries, too. Coupled with an effective archiving strategy, the library can make sure a resource doesn't disappear if a publisher has to withdraw an ebook title. Or perhaps the publisher goes out of business, or decides to change their business model.

But ebook ILL is still silly. Admittedly, the one-user-at-a-time licensing model has proven to be a useful conceit for selling ebooks. People are used to paying for a copy of a book, so it seems natural to buy a copy of an ebook. But stretching that model to inter-library lending turns the conceit into an outright lie. Just because one library has bought an ebook copy doesn't mean that they should be able to lend it instantly to anyone in the world.

Clinging to the pretend-it's-print conceit when developing licensing models for "just-in-case" acquisition results in harmful misunderstandings for both publishers and for libraries. Publishers focus on sales substitution, and libraries misunderstand what they're paying for. Pricing and terms for such licenses will better benefit both libraries and publishers if the license is seen for what it really is rather than something it's pretending to be. There's no sense in locking in the negative attributes of the old when developing the new.

So maybe we need a new acronym. How about "Interacting Libraries License"?

ILL is dead, long live ILL.


Saturday, March 1, 2014

The DMCA Takedown of a Feynman Lectures eBook Converter


The Feynman Lectures on Physics was one of my favorite textbooks in college. It wasn't the assigned textbook, it was recommended reading. I think the reason it doesn't work as a textbook is that every chapter is so deep that students would get sucked so far into every topic that they would never finish the course. It's the sort of book that transforms your life and way of thinking about the physical world. When I started Unglue.it, The Feynman Lectures was one of the first books I investigated for ungluing.

My friends at Caltech informed me that the rights situation with the Feynman Lectures was exceedingly complicated, and it would be a cold day in hell before the Feynman Lectures would be free to the world in digital form. It seems that Caltech and the book publishing world had made an awful hash of the rights, with print rights being owned by Pearson, and the audiovisual rights being owned by competing publisher Perseus. Heroic efforts by Caltech lawyer Adam Cochrane and some dedicated physicists and educators resulted in the untangling of rights, leading to a revised edition available through Perseus imprint Basic Books.

And last year, a miracle happened. An authorized free digital version of the lectures appeared on the web! There is sanity in the world! The Feynman Lectures had been unglued!

Vikram Verma, a software developer in Singapore, wanted to be able to read the lectures on his Kindle. Although PDF versions can be purchased at $40 per volume, no versions are yet available in Kindle or EPUB formats. Since the digital format used by Kindle is just a simplified version of HTML, the transformation of web pages to an ebook file is purely mechanical. So Verma proceeded to write a script to do the mechanical transformation – he accomplished the transformation in only 136 lines of Ruby code, and published the script as a repository on GitHub.

Despite the fact that nothing remotely belonging to Perseus or Caltech had been published in Verma's repository, it seems that Perseus and/or Caltech was not happy that people could use Verma's code to easily make ebook files from the website. So they hauled out the favorite weapon of copyright trolls everywhere: a DMCA takedown.

I am not a lawyer, but I think that this use of a DMCA takedown was improper and possibly illegal. I'm pretty certain that use of Verma's script for personal use would be protected fair use in the United States, under Betamax. There are no terms of use at the Feynman Lectures website for Verma's script to violate; there wasn't even a robots exclusion. So even a legal theory that Verma's code was inducing others to violate website terms falls flat on its face. But alas, there's no penalty for abusive DMCA takedowns, so Perseus' main downside is having to read annoying blog posts like this one. And Perseus does need to look out for their authors' rights – they probably aren't in a position to assess what some Ruby code does.

Luckily, GitHub has a policy of publishing every DMCA takedown notice it receives, which is how I found out about Perseus' action, and Verma's counter-notice. Perseus had 10 days to respond to the counter-notice and since they failed to do so, GitHub has re-opened the repository.

In the meantime, the Feynman Lectures website has taken some steps to break Verma's script. For example, instead of a link to http://www.feynmanlectures.caltech.edu/II_28.html (my favorite chapter), the table of contents now has a link to javascript:Goto(2,18). This will take about 10 minutes for Verma to work around. In addition, the website now has a robot exclusion (except for Googlebot).
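
For readers who haven't met a robots exclusion before, here is a small Python sketch of how a polite script would consult one, using the standard library's `urllib.robotparser`. The robots.txt text below is an assumption: it is a minimal file that matches the description above (everything excluded except for Googlebot), not a copy of what the Feynman Lectures site actually serves, and the `flp-converter` user agent name is made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt matching the description in the post:
# Googlebot may fetch anything, every other robot is excluded.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    chapter = "http://www.feynmanlectures.caltech.edu/II_28.html"
    for agent in ("Googlebot", "flp-converter"):
        print(agent, allowed(agent, chapter))
```

A robots exclusion is a request rather than a technical barrier; a script has to choose to check it.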

Michael Gottlieb, the editor of The Feynman Lectures on Physics New Millennium Edition, added this issue to the repo:
The online edition of The Feynman Lectures Website posted at www.feynmanlectures.caltech.edu and www.feynmanlectures.info is free-to-read online. However, it is under copyright. The copyright notice can be found on every page: it is in the footer that your script strips out! The online edition of FLP can not be downloaded, copied or transferred for any purpose (other than reading online) without the written consent of the copyright holders (The California Institute of Technology, Michael A. Gottlieb, and Rudolf Pfeiffer), or their licensees (Basic Books). Every one of you is violating my copyright by running the flp.mobi script. Furthermore Github is committing contributory infringement by hosting your activities on their website. A lot of hard work and money and time went into making the online edition of FLP. It is a gift to the world - one that I personally put a great deal of effort into, and I feel you are abusing it. We posted it to benefit the many bright young people around the world who previously had no access to FLP for economic or other reasons. It isn't there to provide a source of personal copies for a bunch of programmers who can easily afford to buy the books and ebooks!! Let me tell you something: Rudi Pfeiffer and I, who have worked on FLP as unpaid volunteers for about a decade, make no money from the sale of the printed books. We earn something only on the electronic editions (though, of course, not the HTML edition you are raping, to which we give anyone access for free!), and we are planning to make MOBI editions of FLP - we are working on one right now. By publishing the flp.mobi script you are essentially taking bread out of my mouth and Rudi's, a retired guy, and a schoolteacher. Proud of yourselves? That's all I have to say personally. Github has received DMCA takedown notices and if this script doesn't come down pretty soon they (and very possibly you) might be hearing from some lawyers. 
As of Monday, this matter is in the hands of Perseus's Domestic Rights Department and Caltech's Office of The General Counsel. 
Michael A. Gottlieb
Editor, The Feynman Lectures on Physics New Millennium Edition
www.feynmanlectures.info
www.feynmanlectures.caltech.edu

(Note: Gottlieb's description of the website copyright notice is inaccurate- it says nothing about "downloaded, copied or transferred for any purpose")

This is kind of sad. Here Caltech did the right and noble thing and made the Feynman Lectures free as a website. That they can make money from the work via sales of print and other versions is great. But having done that, trying to control what people do with the free digital version (other than sell it) is a hopeless endeavor, and they should just stop.

I was wrong. The Feynman Lectures hasn't been unglued.

Update, March 3: Verma made a one-line change to the script to un-break it. But it's not a polite script, so don't all go and run it. Better to ask Caltech to use the script to make epubs and mobi's for sale; I would certainly pay for my DRM-free copy!

Update, March 4: Gottlieb e-mailed me to say that Perseus didn't respond to the counter-notice because GitHub's email notice went to a spam filter, and that more takedowns would be coming. He seemed to think that I am one of the flp.mobi developers and warned that I have put myself "in a precarious legal position". To be clear, I am not involved in the development or publication of flp.mobi. I hope its existence is not used as a pretext to take down or lock down the FLP website. Also, high-quality epub and mobi are on the way!

Update, March 7: Verma e-mailed me to say he is voluntarily taking down his repo:
I'm taking down my copy of the repository on Monday morning, in worry its continued availability will lead Caltech to discontinue free online access to FLP. You're each welcome to adopt maintainership if you prefer, though I would rather if you did not.
Techdirt has a post and commentary.

Update, March 10: Verma's repo is now history, but forks of it remain in 15 places, including, bizarrely, Gottlieb's own GitHub page.

Friday, February 28, 2014

Open Access Honesty

I've spent a large part of February becoming acquainted with Open Access ebook publishers. The one thing that troubles me is that too many of them are not putting honesty first. The reason is that existing distribution channels do not reward forthrightness in Open Access publishers; in fact, the channels actively discourage it.

Let's take Amazon, for example. They don't like free ebooks, because there's no money in it for them. If you're a publisher and you want your ebook to be free for people to load onto their Kindles, Amazon will charge you for the privilege. They rationalize that they're paying for a separate wireless network, "Whispernet", so it's only fair to assess "delivery charges" to free ebook publishers. If you use their 70% royalty option, the delivery charge is 15 cents per MB of data, and the minimum price you can set is 99 cents. The only way to get Amazon to deliver your ebook for free is to select their 35% royalty option, and then invoke this "matching Competitor Pricing" clause:
From time to time your book may be made available through other sales channels as part of a free promotion. It is important that Digital Books made available through the Program have promotions that are on par with free promotions of the same book in another sales channel. Therefore, if your Digital Book is available through another sales channel for free, we may also make it available for free. If we match a free promotion of your Digital Book somewhere else, your Royalty during that promotion will be zero. (Unlike under the 70% Royalty Option, if we match a price for your Digital Book that is above zero, it won't change the calculation of your Royalties indicated in C. above.)
Apple, Kobo, and Google are much happier to set prices to zero, because they make some money on hardware sales or advertising, so you can get Amazon to give your ebook away for free by getting people to report your zero price on Apple, Kobo, and Google.
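
To put numbers on the delivery charge, here is a small Python sketch. The 15 cents per MB rate and the 99 cent minimum price are the figures quoted above. The royalty formula, 70% of the list price after the delivery charge is subtracted, is my assumption about how the 70% option is calculated and should be checked against Amazon's current terms; the function names are made up.

```python
DELIVERY_CENTS_PER_MB = 15  # rate quoted in the post
MIN_PRICE_CENTS = 99        # minimum price quoted in the post


def delivery_charge_cents(file_size_mb):
    """Delivery charge for one sale, in cents."""
    return round(DELIVERY_CENTS_PER_MB * file_size_mb)


def royalty_70_cents(list_price_cents, file_size_mb):
    """Royalty per sale in cents.

    Assumption: royalty = 70% of (list price - delivery charge).
    """
    net = list_price_cents - delivery_charge_cents(file_size_mb)
    return round(0.70 * net)


if __name__ == "__main__":
    size_mb = 4  # hypothetical file size
    print(delivery_charge_cents(size_mb))
    print(royalty_70_cents(MIN_PRICE_CENTS, size_mb))
```

For a hypothetical 4 MB ebook, the delivery charge alone is 60 cents, well over half of a 99 cent price.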

So just to get your ebook to be free on Kindle, you're forced to be incompletely honest with your customers and distributors.

But Amazon creates a great temptation. Why not use the suckers paying on Kindle to subsidize the free availability for those smart users who come to your website? Isn't it convenience that these people are happily paying for?

And libraries are another temptation. They'll pay for the convenience of getting your ebook through their preferred platform, Overdrive or whatever, even as you offer the book for free to users at all the libraries that don't pay for your ebook. But would they still buy if they knew they could get the ebook for free? Maybe you shouldn't ask questions when you don't want to know the answer.

So here's my simple, unproven postulate: in the long run, full disclosure about pricing and an honest relationship with readers will be in the best, mutual interests of authors, publishers, readers, and libraries. And customers will prefer a distribution channel that enables that honesty.

Stop laughing.