Posts Tagged ‘jcdl2009’

JCDL’s final (derailed) plenary, revisited

Tuesday, July 21st, 2009

With this news of Amazon deleting books that were on people’s Kindles (thanks to Lori Ayre’s Facebook link), Mike Lesk’s assertion that we should be talking about Amazon and not Google takes on more interest. Beyond the question of paper vs pixels, there is still the question of cultural information easily made visible, invisible, or changed by a for-profit agent.

Deleting books from the Kindle is obvious: what about revisions?

Kindle and libraries

Friday, June 26th, 2009

If you’re reading this, it’s likely that you’ve been talking with a friend and they’ve been enthusiastic about something they’ve watched, or heard, or read and they’ve offered to lend you the video tape, DVD, record, audio tape, CD, or book. The first-sale doctrine, or right of first sale, allows your friend to loan or sell a purchased copyrighted work.[1]

It’s quite likely you’ve watched or listened to their media on their equipment — watched a DVD, listened to an album. With recent movies on DVD, there is a clear statement that this sort of sharing is to be private, not a public performance. In the music world, if you’re a public space like a coffee house, you need to acquire the public performance rights (most “easily” through ASCAP or BMI).

So, from experience, it would seem that loaning an iPod with music on it or a Kindle with books on it should be the same as loaning a VCR and video tapes (something i remember an academic library being willing to do in my past). That’s what Howe Library in Hanover, NH thought. Library Journal reports mixed feedback from Amazon on the question: is it permitted or not? LJ cites “the potential ambiguity of the Terms of Service, which bar a user who wishes to ‘sell, rent, lease, distribute, broadcast, sublicense or otherwise assign any rights to the Digital Content or any portion of it to any third party.’”

This brings up one of the threads from the end of the last plenary panel session at JCDL 2009: Gretchen Hoffman’s questions of licensed content vs copyright, about how contract law layered on top of copyright creates ambiguous terms of service, and the problem of too many different rules, with individuals not knowing when they’re doing something unreasonable.

[1] Wikipedia adds the note, “The doctrine of first sale does not include renting and leasing phonorecords and certain types of computer software, although private nonprofit archives and libraries are allowed to lend these items if a notice that the work may be copyrighted is on the copy.”

JCDL Tweets: The Visualizations

Friday, June 19th, 2009

One of the recurring topics at JCDL 2009 was the question of how to archive participatory media. Twitter was mentioned, as well as Facebook and virtual worlds. It seemed easy enough to collect tweets for the limited case of JCDL & JCDL2009 mentions: once i did that it occurred to me there were some simple questions that could be answered.

The data gathering and extraction is described in the previous post, and it’s available to play with in a topic center at Many Eyes. Since it’s a topic center, i think anyone can add more social media data and analysis. There’s more networking data regarding twitter and the interconnections that could be extracted (although i don’t think one can see how the whole network changed with the conference: there’s no API function to get friends & followers at a specific point in time). While @dchud already posted visualizations of mallet keyword extractions (in tweeted discussion with @band), that might be interesting to add to Many Eyes, along with text of blog posts….

For @HCIR_GeneG and @gingdottwit, top Tweeple at JCDL, the question of how the behavior split (reporter, disseminator, commentator) was raised. I suppose a reporter would not mention another twitter handle, a commentator might, a disseminator would be someone who retweeted, and then there are replies. The data file to play with is Types of Tweets, by handle.
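The split described above can be sketched as a small shell function. This is only an illustration, not the gawk script itself: the function name classify_tweet and the four labels are made up here, and the rules (a leading @handle is a reply, "RT @handle" is a retweet, any other @handle is a mention) are an assumed reading of the categories.

```shell
# classify_tweet TEXT
# Prints one of: reply, retweet, mention, plain.
# Hypothetical helper; the labels and rules are assumptions, not the post's script.
classify_tweet() {
    case "$1" in
        @*)                echo reply ;;    # tweet starts with a handle
        RT\ @*|*\ RT\ @*)  echo retweet ;;  # manual "RT @handle" retweet
        *@[A-Za-z0-9_]*)   echo mention ;;  # a handle somewhere else in the text
        *)                 echo plain ;;    # no handle at all
    esac
}
```

Under this reading, a "reporter" would show mostly plain tweets, a "commentator" mentions and replies, and a "disseminator" retweets. Note that the mention rule is crude and would also match an email address.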

Here are a few example visualizations:


The familiar Wordle

JCDLTweetWordle

A network graph

JCDLTweetNetwork

A phrase net

JCDLTwitterPhraseNet

The full collection of Atom format tweets and data extraction is here, the BSD-licensed gawk script here.

JCDL Tweets: The Data

Friday, June 19th, 2009

This is a preliminary post about a simple data analysis of JCDL tweets.

At 12:48 pm PDT on Friday 19 June, i ran the command

curl http://search.twitter.com/search.atom\?q=jcdl+OR+jcdl2009\&rpp=100 >\
TweetsJCDL01.xml

iterating until i ran out of tweets, and thus producing five Atom format files of JCDL tweets.
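The iteration could be scripted along these lines. This is a sketch under stated assumptions: the post does not say how the pages were requested, so the page query parameter, the function names page_url and fetch_pages, and the numbered output file pattern are all guesses modeled on the command above.

```shell
# Build the search URL for one page of results.
# The "page" parameter is an assumption; the post only shows the first request.
page_url() {
    printf 'http://search.twitter.com/search.atom?q=jcdl+OR+jcdl2009&rpp=100&page=%d\n' "$1"
}

# fetch_pages MAX
# Save pages 1..MAX as TweetsJCDL01.xml, TweetsJCDL02.xml, ...
# and stop early at the first page that contains no <entry> element.
fetch_pages() {
    page=1
    while [ "$page" -le "$1" ]; do
        out=$(printf 'TweetsJCDL%02d.xml' "$page")
        curl -s "$(page_url "$page")" > "$out" || return 1
        grep -q '<entry' "$out" || { rm -f "$out"; break; }
        page=$((page + 1))
    done
}
```

Calling fetch_pages 10 would request at most ten pages. Stopping on the first empty page is one way to express "until i ran out of tweets".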

I wrote a simple gawk script (it’s not JCDL specific) which produced the following data files, suitable for visualization at Many Eyes:

AuthorTweetCount.html - see below
Tweets.txt - Tweet text
CleanTweets.txt - Tweets without handles & tags
HandleTweetCount.txt - Count of tweets, replies, retweets, & mentions
Handle_Cited.txt - handle & the handle of a RT'ed user
Handle_Mentioned.txt - handle & any handles mentioned in the tweet
Handle_RepliedTo.txt - handle & the replied to handle
Handle_Epoch.txt  - handle and the tweet timestamp in epoch
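
To give a flavor of the extraction, here is a minimal sketch of how a file like Handle_Mentioned.txt might be produced. It is not the gawk script: the function name mentions, the tab-separated "handle, tweet text" input, and the one-pair-per-line output are assumptions made for illustration, and it uses only portable awk features.

```shell
# mentions: read "handle<TAB>tweet text" lines on stdin and print
# "handle<TAB>mentioned_handle" once for every @handle found in the text.
# Input and output layouts are assumed, not taken from the real script.
mentions() {
    awk -F '\t' '{
        text = $2
        # Repeatedly find the next @handle, emit it, and continue after it.
        while (match(text, /@[A-Za-z0-9_]+/)) {
            print $1 "\t" substr(text, RSTART + 1, RLENGTH - 1)
            text = substr(text, RSTART + RLENGTH)
        }
    }'
}
```

A line such as "alice", a tab, and "hello @bob and @carol #jcdl" would yield two pairs, alice/bob and alice/carol, which is the edge-list shape a network graph needs.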

The script will be available, for what it’s worth, with a BSD license, which i hope means that it can be improved freely and made far more useful.

Once i confirm the visualizations work, i’ll post about the results, and i’ll also provide the tweets in atom format, the gawk script, the results, and links to the places to play at Many Eyes.

The most preliminary result is below the cut: a list of all the twitter users who had tweets about JCDL or JCDL2009, ranked.

(more…)

JCDL 2009: Wrap Up

Friday, June 19th, 2009

This was the first conference where i posted all my notes in a timely fashion. I still have notes from this conference season sitting around to be posted: the copyright conference will be posted, for certain. MarsEdit was a help: it was my first time using a tool that allowed offline writing (other than a general text editor) and it certainly helped. The pervasive power and wireless also made a significant difference: the lack of wireless at the Vancouver JCDL 2007 still rankles. Daniel Tunkelang (@dtunkelang) is collecting other live blog sources at this post on The Noisy Channel.

And, OH!, having the proceedings on a USB drive: delightful!

Twitter, too, was a delight, although this may have been because it was a small crowd and not overwhelming. I didn’t interact as much as i’d like because i was madly blogging. Today i read back through the tweets and see interesting threads: I wish there were ways to comment directly back. I used TweetDeck for the first time (finding, to my horror, some direct messages from some weeks ago). Some folks tried live twittering the conference: Gene Golovchinsky (@HCIR_GeneG), whose text i quoted once or twice when he quoted verbatim and i simply caught the concept, writes about his experience here. I note that he had a hard time tweeting Cathy Marshall’s talk, No Bull, No Spin: a comparison of tags with other forms of user metadata. I had concluded, “Live blogging Cathy’s narratives is very hard because she is such a good speaker.”

I did think about my preference for asynchronous communications as i tried to interact with the Second Life presentation of the poster sessions. [Screen shots, SL URL address, announcement PDF] Since next year’s JCDL is in Australia, maybe i’ll practice my SL social skills to be prepared to participate at a distance. The Second Life poster demo was awarded third place in the Best Poster/Demo Awards: all the awards are listed on the conference site and i’ve annotated my notes on the Best Paper session with the award winners.

As a final note, here’s the “social media venn diagram” Frank McCown used in his “What happens when Facebook is Gone?” talk. It’s available on a t-shirt.

JCDL 2009: Plenary Panel 2: Google as Library, Redux

Thursday, June 18th, 2009

Mike Lesk essentially derailed the topic from Google to the Kindle and digital books. This made me sad, even though i’ve been reading books on screen for pleasure since ’96 or so. The convergence questions (digital books being read, etc, etc) did become copyright related. (Hoorah)

Next conference is in … OMG Australia! The Gold Coast. JCDL and ICADL. A Joint joint conference! (Wondering if there’s a chance i’ll get to go.)

Brisbane has an international airport, with temps 10-20°… centigrade, a location for those into nature, hunting, and hedonism.

JCDL 2009: Session 12 (Archives & Preservation)

Thursday, June 18th, 2009

The first full paper was a somewhat technical paper about mapping METS structures to the requirements of OAI-ORE in order to support resource interoperability (one of the key words in the first plenary!) across the many different community standards. (A call out for Merrilee)

There were then a couple of papers on web archiving issues on the scale of the Internet Archive, two about modeling the problems in digital preservation, and finally a report on a new virtual organization to discuss archives of scientific data.

JCDL 2009: Session 9: short talks

Wednesday, June 17th, 2009

The short talks attracted a standing room only crowd, although the twitter chatter seemed to all be in the other talk (or maybe @HCIR_GeneG was taking notes with twitter).

The first talk announced a coming Firefox plugin which will allow folks to self archive their Facebook data. (Ah, hope for all the “mail” messages which sit there, independent of my mail store.)

The second described an interesting use of a coming genealogical database, where the API helps in the unique tagging of individuals in the Historical Journal project. (Location tagging supported; search to come.)

The next two addressed a much deeper cut in history: collecting the fragments of texts known only through the quotations in ancient and medieval texts and imaging a particularly warped and rebound vellum codex.

JCDL 2009: Plenary: Session 8: Best paper finalists

Wednesday, June 17th, 2009

A total snarky aside: the first two papers were on Mac systems, the last presentation was on a Windows machine. Tech difficulties, security error, much ribbing.

These three were very engaging. The first was a challenge to Zen and the Art of Motorcycle Maintenance in that the paper proposes a method for assigning dimensions to quality, and then surrogate indicators for high quality in those dimensions. The second was about OCR and effective heuristics for a three system alignment problem. It was a little less immediately engaging, but impressive. Finally, Cathy Marshall, whom i remember from excellent UX talks at previous JCDLs, pushed some tagging buttons by putting forward a study that led her to conclude, as generously as she could, that, “tags don’t have the folksonomic power that people say they do.”

Edited to add: the winners of awards were announced that night:

Vannevar Bush Best Paper Award: Automatically Characterizing Resource Quality for Educational Digital Libraries. Steven Bethard, Philipp Wetzler, Kirsten Butcher, James H. Martin, and Tamara Sumner.

Best Student Paper Award: Improving Optical Character Recognition through Efficient Multiple System Alignment. William B. Lund and Eric K. Ringger.

JCDL 2009: Session 6: Best Paper Nominees 2

Wednesday, June 17th, 2009

I couldn’t decide on which session to attend: i chose based on crowd sourcing (largest room) and expert recommendation (recognizing some of the others in the audience).

The first talk on Chinese character calligraphy was interesting just on the basis of learning more about Chinese characters, while the process was also interesting.

The next talk was on ranking in metasearch. This seemed quite practical, although the (speed) performance issue seems to still be outstanding.

I regret i couldn’t get engaged in the tagging talk. The research seemed oddly confused between user interface testing and testing the questions of tagging and controlled vocabularies. I find myself wondering how a change in the interface (which was a specific application designed for tagging some corpus in this specific experiment) would have changed some of the comments about the process. It’s not clear to me that the results can be decoupled from issues with interface.

The fourth talk also addressed metadata issues, rejecting both cataloging and tagging for the problems in each, and using analysis of spidered book reviews to extract the keywords for metadata. This method echoes some of Session 2’s papers, in particular “Using Web Information for Author Name Disambiguation” and “Finding Topic Trends in Digital Libraries.”

Twitter comments during the sessions discussed Mallet and introduced me to a paste bot. (See the graphically rendered mallet output.)

Even less relevant, i grew to utterly loathe, loathe, loathe “top sites” in the new Safari browser. I like the visual history search, but the updated delay in presenting the top sites is driving me nuts. If i can’t turn it off, i’m going to populate it with a bunch of blank pages from localhost. [ETA: it's a preference setting. Back to empty page.]

As a note for myself, search CADAL for 茶 (tea).
