Looking for a postdoc in software / data curation


The UC Berkeley Library is offering a two-year post-doc for a promising applied scholar to work on software and data curation issues, possibly with a focus on social network data. We are especially interested in advancing our understanding of, and support for, organizing, preparing, and preserving for re-use the data from open source software projects, including the code with all of its gnarly revision and forking history, the documentation, check-in remarks and other metadata, and the communications of the social network that often is layered on top of the software development platform. Think of GitHub as a canonical example.

The rich history of cooperative and sometimes collaborative development is somewhat available to researchers as long as the platform remains open and supported. However, not only does that not protect these rich data for future researchers, but the data are typically not organized, encoded, marked-up or otherwise curated for re-use by scholars.

We had a post-doc open last year, through the CLIR program, but there were too few applicants to fill the handful of similar CLIR-supported positions. PLEASE let promising post-doc candidates know about this, and circulate widely to any friends and colleagues who might have candidates.

Though employed in the Library, the post-doc will be co-advised by a Berkeley faculty member (or team); depending on interest it could be someone affiliated with the Berkeley Institute for Data Science, the Computer Science Department, or any number of other units. Given Berkeley’s leadership in data science, information science and computer science, this position is a wonderful opportunity.

Please apply through the CLIR Postdoctoral application program (http://www.clir.org/fellowships/postdoc/applicants/dc-science). Applicants seeking more information specifically about the position at UC Berkeley can get in touch with Erik Mitchell, Associate University Librarian (emitchell (at) berkeley (dot) edu).

Authors, liberate your books


The Authors Alliance is a professional (non-profit) association of authors started by four UC Berkeley faculty.  Its mission is to “further the public interest in facilitating widespread access to works of authorship by assisting and representing authors who want to disseminate knowledge and products of the imagination broadly.”  It exists in part to support the interests of authors who don’t primarily depend on royalties for their livelihood, and who wish the widest readership possible for their works — characteristics that describe most academic authors, for example.

One project of the Alliance is Rights Reversion, to support authors who want to recapture their rights to works that are out-of-print or that are no longer selling many copies, by helping them re-obtain their distribution rights from publishers, after which they might release their works under an open access license (e.g., Creative Commons) and publish to an open access web portal.

This year the reversion project released a guide, Understanding Rights Reversion, and has seen several notable successes, including the public release of books by Robert Darnton, John Kingdon and Jeff Hecht.

Why I give money to my employer


It is often said that there is no great university without a great library.  My wife and I believe that.  And here at Berkeley we are very fortunate to have one of the truly great academic research libraries.  We maintain and develop amazing collections, provide access to the world’s growing information resources, provide instruction to thousands of students, consultation and support to one of the finest research faculties in the world, and serve the public.  We have a great mission, helping people find, evaluate and use information to improve their lives and better the world.

We are also passionate about public higher education.  Public education at scale has opened the doors to a college education to millions of students who could not afford elite private universities, or who preferred the advantages offered by public campuses.

Very sadly, the meaning of “public” higher education has been changing: the public has decided to disinvest.  Despite getting only 13% of our operating budget from the state (typical for public research universities), Berkeley is deeply committed to its public mission.  But the only way to continue being the best public institution in the world is if those of us who believe dig deep and contribute.

Indeed, to provide the great library resources and services we offer requires that we spend a lot on a talented workforce (we have about 450 FTE to provide these services and staff our one million square feet of space and 53 miles of stacks).  And we have to spend a lot on acquiring, processing, and preserving resources and tools for using them.

Berkeley writes my paycheck.  But Janet and I believe in its mission, and so we write checks back.  The only way forward for high quality public education is for those who believe to donate.



Can we afford privacy from surveillance? Do we want to?


A couple of weeks before I started my position as University Librarian, the UC Berkeley School of Information invited me to give a talk on the future of individual privacy; here is a video of that talk.  Last week, nationally-syndicated radio show host Katherine Albrecht interviewed me on this topic for about 45 minutes; here is an MP3 of the show (with many commercials, I’m afraid).

In short, I think the economics of surveillance and protection from surveillance are leading inexorably to a not-very-distant future of radical transparency, at least for any information about us that is captured and stored on digital, networked computers (which is more and more all the time, and will be even more when the Internet of Things really takes off).  I don’t see an alternative: we get too much value from selective revelation of information about ourselves, value that will be increasing as we learn better ways to network and use that information.  And the costs of capturing networked information are going down faster than the costs of protecting ourselves.  I think this is a technologically unavoidable fact driven by the nature of selective revelation in a networked world.

Relevance for libraries?  You might be thinking, “libraries have strong policies to protect the privacy of their users’ information.”  Yes…sort of.  First, policies are themselves a technology, and they are costly to enforce.  How good is our security against data breaches?  Better than at the IRS, or at JP Morgan Bank?  How fast are our budgets for security growing?

Another issue: to provide our users with access to the rapidly expanding networked stores of information, we provide them with access to an ever increasing array of third-party tools and databases.  What assurances do we have about how those third parties protect our users’ privacy?  Do we have contractual provisions with all of them? No.  (Can you spell “Google”?)  And contractual provisions are another type of policy, one that needs increasingly expensive enforcement, whether it be cybersecurity against external attacks, or protection against unscrupulous employees who might sell access.

Do universities need libraries? Isn’t Google free?


On Friday I gave a lunchtime talk to the UC Berkeley Foundation (about 100 people — mostly alumni donors — who manage and lead fundraising on behalf of the campus). I offered an historical framing of just how significant the digital information revolution is going to be (so much more so than we’ve already seen), and why that means we need information professionals more than ever.

I posted the slides to SlideShare, but I tend to use just a few slides to illustrate my points while talking, so I posted them with the script included.

Here is the abstract:

The Gutenberg revolution was an enabler and shaper of the Protestant Reformation, the Renaissance, and the Scientific Revolution. It did so through a small, simple technological advance: merely a reduction in cost and increase in accuracy for information reproduction. But from that modest technological change, one-to-many communication became practical.

The digital revolution accomplished the same feat, only more so: the incremental cost of information reproduction is now about zero; reproduction accuracy is about perfect. And a new impact: information distribution is instant. These are even greater transformations than the Gutenberg press, which enabled and shaped the Protestant Reformation, the Renaissance and the Scientific Revolution. The impacts on civilization of the digital revolution over the coming decades and centuries will be even greater.

Though information is now abundant, finding, evaluating, making sense of and using good information is harder. In the Information Age, we need librarians and other information professionals more than ever.

Looking Forward


[I sent this message to all library staff today, my first on the job as University Librarian at UC Berkeley (slightly edited to remove some local info).]

Our campus is one of the top research universities in the world. To support our campus, our library should be no less.

What do we need to accomplish?

The Commission on the Future of the UC Berkeley Library concluded in 2013 that “the most important contribution of the research university library in the next twenty years will be to provide the increasingly sophisticated human expertise required to successfully navigate [the] rapidly shifting heterogeneous terrain” [emphasis added]. I agree.

Every well-educated, successful participant in the modern world needs to be information fluent. With the explosive and low-cost abundance of information resources, everyone needs to be their own librarian, finding and evaluating information all day long, every day. That doesn’t mean there is no role for us: just the opposite. First, we are needed to help our students and faculty become information fluent. But even though every college graduate can write, we still need professional writers in our society. And even when we elevate all of our students and faculty to more advanced levels of information fluency, our second responsibility will be as expert informationists, working with them to solve their more challenging and advanced information problems in this “rapidly shifting heterogeneous terrain.”

The rapidly growing availability of new data, and computational tools to analyze them, provides one of these emerging challenges for many. Most of our faculty and students are not yet trained or equipped to provide for all of their own data curation, management and preservation needs. As the core provider of advanced information service on campus, we must further develop our expertise and provide an ever stronger set of data services to campus. We won’t provide all data services by ourselves of course, but will create a complete portfolio through partnerships with other campus units.

The University’s mission is to “discover knowledge and to disseminate it to its students and to society at large.” To support this, we must make it easier for scholars to engage in open access dissemination. At the same time, we have to work to create a more financially sustainable publishing ecosystem so that we can afford to provide access to scholarship at the same time as we deliver our other services. Our faculty and students will benefit from more open and lower cost dissemination, and the whole world will benefit from greater access to Berkeley’s discoveries.

We have many great spaces for contemplation and study. But the ways in which students and faculty engage with information, and with each other as they use information to advance learning and discovery, are changing. We need to re-envision and re-design many of our spaces to support an age of interactive, connected and collaborative learning and discovery. For example, students need more sophisticated, powerful access to local and remote information collections, and to use new technologies to find, evaluate and use this information. We can provide them with access, training and experience to prepare them for their future. Equally or more important, students need access to each other, face to face and virtually, to engage in connected, collaborative learning, discovery and knowledge production. We should provide spaces, technologies and human expertise to make our libraries the vibrant, go-to campus hubs for connected learning. The just-initiated renovation of Moffitt 4 and 5 is a first step in this initiative.

None of this is to say that we will abandon our collections, nor that we will stop building them. But our mission is not building collections for their own sake, but helping people to find, evaluate and use information. In some cases the most important way to help is to continue to build and preserve our tremendous collections of print and physical and digital objects, and we will devote considerable effort and resources to doing so. In other situations we should focus on providing advanced services to help people access, evaluate and use digital resources owned and stored elsewhere.

More than anything else we must provide professional, advanced service. Our people — you — are our most valuable resource.

We are at the dawn of the second Gutenberg age, with information production and dissemination growing explosively. Societies have evolved from primarily agricultural, to industrial, to service-based, and now we are entering the Information Age. At this time society needs ever greater knowledge institutions. What an exciting time for a university research library.

For me, this isn’t a job, it’s a passion, and I’m on a mission. Let’s make great things happen together.