Archive for the ‘old newspapers’ Category

Horizon scanning and the digital underbelly

March 29th, 2009

Gaynor Backhouse writes a great post about libraries, holding out for “a guided tour of the library’s digital underbelly”. My favorite part is her metaphor about horizon-scanning:

Horizon scanning is a bit like doing a jigsaw you’ve bought from a car boot sale: first of all, it comes in a plastic bag, so there’s no picture to guide you. Secondly, you can see from the myriad sizes of the different pieces that there’s more than one puzzle in there and, thirdly, you know, even as you are handing over your money, that you won’t have all the pieces to complete any one, particular puzzle. [JISC Libraries of the Future | Holding out for a hero: technology, the future and the renaissance of the university librarian.]

Gaynor manages JISC’s TechWatch, keeping up with tech trends for libraries.

I’m not quite sure what the library’s “digital underbelly” is. But this sampling of news art strikes me as one possible example.

Graphics section of the Chicago Tribune, September 9, 1938


The Art the Message: The Story Behind the Chicago Tribune Collection has the same feel as the behind-the-scenes tour Gaynor Backhouse described: “secret stuff” that only the curators know about. This collection was saved by Janet A. Ginsburg, who edits a news aggregator and curates a collection of news retrospectives, hosted at her personal site.

For access to the physical collection (now known as the Janet A. Ginsburg Chicago Tribune Collection of the Michigan State University News Archive) contact MSU Communication professor Lucinda Davenport. Images from Janet’s news art exhibit can also be seen at Brainpickings and (with Portuguese commentary) at Segunda Língua. Found via Janet’s comment on Steven Berlin Johnson’s SXSW talk, Old Growth Media And The Future Of News.

Posted in library and information science, old newspapers | Comments (0)

Leah B. Allen: A Woman in Astronomy

November 1st, 2008

I made Leah Allen a Wikipedia page after skimming a 1908 San Francisco Call.

“Halley’s Comet after 75 years rushes Earthward again” shares page 2 of the Sunday, August 23, 1908 San Francisco Call with “Fattening Properties of the Potato”. The article starts by discussing E.E. Barnard’s ambitions for photographing Halley’s Comet. The second half of the article announces the appointment of Leah B. Allen to the post of Carnegie assistant at the Lick Observatory. The Brown graduate, who later became professor of astronomy at Hood College, is described as “a pretty Providence girl” who “always sailed her own sailboat” and “has a delightful personality”.

100 years ago, such a description raised no eyebrows. What shocks me is that the article was written by another woman, Mary Proctor, who wrote popular books about astronomy. It’s hard to imagine what life as a woman in astronomy was like in the early 1900s.

Especially since women were not particularly new to the field: Maria Mitchell (1818-1889) is generally known as the first professional woman astronomer in the United States.

Posted in old newspapers | Comments (0)

NYTimes Topics: Quirky, Useful Classification, Finding Aid

October 23rd, 2008

Yesterday the NYTimes announced a new API, TimesTags, “based on the taxonomy and controlled vocabulary used by Times indexers since 1851”. The browseable version of this vocabulary, NYTimes Topics, is a great entry into NYTimes articles published since 1981.

NYTimes Topics


Ed Summers did some scraping while also asking the Open NYTimes team for a SKOS version. Meanwhile, I’m playing around with the classification (online and scraped). Its quirks seem to reflect how it’s been used, and how it has evolved over time. Classification systems can highlight the material classified; they also tend to give insight into the worldview of the people classifying materials or creating the system. The interplay makes integration of classification systems, such as through topic maps, an interesting research area. But that’s a topic for another day.

Here are some things I’ve noticed while playing around with the vocabulary.

Overall Structure

The NYTimes’ main navigation lists 15 sections. The NYTimes taxonomy has 3 top-level categories: news, opinion, and reference. 7 sections fit within the news taxonomy. Opinion has its own category. Travel is an explicit subject within the reference category. Technology, arts, and style are topical, drawing primarily on the reference category. (Cooking, however, is similar to travel in its treatment.) The 3 advertising sections (jobs, real estate, and auto) are already classified, and thus, out of scope.

The remaining 7 sections we dub “news”. Here are examples of taxonomy terms, showing the category structure:


  1. World: international/countriesandterritories
  2. U.S.: national/usstatesterritoriesandpossessions/
  3. N.Y. / Region: newyork, newyorkregion
    nyregion and newyorkandregion are both used, but they are not interchangeable (in the sense that there aren’t redirects)
  4. Business: business/companies
  5. Science: science/topics

  6. Health: health/diseasesconditionsandhealthtopics
    As the name (diseases, conditions, health topics) suggests, this encompasses a wide range of topics: particular drugs such as Ritalin, categories of drugs such as antibiotics, topics such as smoking, sleep, teenage pregnancy, and twins, and professional groups such as surgery and surgeons.
  7. Sports: sports, olympics

    Beyond sports, subcategory names vary considerably. Other sections, such as for the Olympics, are outside the main hierarchy:

Opinion: opinion
Again, beyond opinion, there is variation. However, editorialsandoped is the main subcategory.

Reference: reference

is handled as a subject:

Spelling Discrepancies

Drugs (Pharmaceuticals) has two spellings: drugs_pharmaceuticals and drugspharmaceuticals are aliases.

The E TRADE Financial Corporation and E*Trade Financial Corporation entries, however, appear to be an error: they have some data in common and other data not in common. Either that’s an error or there’s a bizarre story behind it.
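Variants like these can be surfaced mechanically. Here is a minimal sketch, assuming you already have a list of term names or slugs (say, from scraping): it groups terms that become identical once case, spaces, and punctuation are stripped. The function names and the sample list are my own illustration, not anything the Times publishes, and a match only flags a candidate; it can’t tell an intended alias from an error.

```python
import re
from collections import defaultdict


def normalize(term):
    """Lowercase a term and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def find_variants(terms):
    """Group terms that collapse to the same normalized form.

    Returns only the groups with more than one member, i.e. the
    candidate aliases or spelling discrepancies worth a manual look.
    """
    groups = defaultdict(list)
    for term in terms:
        groups[normalize(term)].append(term)
    return [group for group in groups.values() if len(group) > 1]


# Sample terms taken from the examples in this post (plus one control).
terms = [
    "drugs_pharmaceuticals",
    "drugspharmaceuticals",
    "E TRADE Financial Corporation",
    "E*Trade Financial Corporation",
    "fossils",
]

variants = find_variants(terms)
for group in variants:
    print(" / ".join(group))
```

Stripping punctuation is deliberately aggressive; it would also merge terms that differ only by a hyphen or an ampersand, which may or may not be what you want.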

Differences in usage

Where to put recipes

Apples is a subcategory of cooking (e.g. apples):

Perhaps because apples tend to be used as a cultural reference? Still, where do apple recipes belong?

Pumpkins, on the other hand, has a subcategory for recipes:

Dogs are in science, but fossils are not

While most subjects are classified only alphabetically, there are exceptions. Compare fossils to dogs.
Fossils is a plain-old subject, (subjects/f):

Dogs, however, is a science topic, (news/science/topics):
I wonder if that’s because dogs are a more common subject than fossils?

Saying what you mean

Disambiguation, eh? Here, shrimp is a topic within science, so don’t expect recipes (except in the ads):

Category structure

Prominent subtopics

Subtopics are sometimes listed at the top level. For instance United States Attorneys seems to contain United States Attorneys: Editorials & Opinion. Both are listed at the top of the topics tree.

I find it fascinating that Cookies and Cookies, Recipes are separate topics. Again, culturally justified.

Depth of categories

There may be several levels of subcategories, e.g.

Mixing of keyword and controlled terminology

I’m surprised to find “hot dogs” as the top two “articles about dogs”, after some nice featured content. NYTimes may also want to refine handling of multiword terms.

Hot dogs turn up in dogs


Another example is “Baby Quasar(Skin Care Devise)” showing up under quasars.
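To make the multiword problem concrete, here is a rough sketch of one way to handle it. I don’t know how the Times actually matches articles to topics; this just contrasts a naive whole-word keyword match with a version that masks known multiword terms first. The phrase list and the sample headlines are invented for illustration.

```python
import re

# Hypothetical list of multiword terms that should not count as
# matches for the single words they contain.
MULTIWORD_TERMS = ["hot dogs", "baby quasar"]


def keyword_match(topic, headline):
    """Naive match: the topic appears somewhere as a whole word or phrase."""
    pattern = r"\b" + re.escape(topic) + r"\b"
    return re.search(pattern, headline, re.IGNORECASE) is not None


def phrase_aware_match(topic, headline, multiword_terms=MULTIWORD_TERMS):
    """Blank out known multiword terms, then run the naive match."""
    masked = headline
    for term in multiword_terms:
        if term.lower() == topic.lower():
            continue  # never mask the topic we are actually looking for
        pattern = r"\b" + re.escape(term) + r"\b"
        masked = re.sub(pattern, " ", masked, flags=re.IGNORECASE)
    return keyword_match(topic, masked)


# Invented headlines, not real Times articles.
print(keyword_match("dogs", "Hot Dogs at the Ballpark"))
print(phrase_aware_match("dogs", "Hot Dogs at the Ballpark"))
print(phrase_aware_match("dogs", "Training Dogs for Rescue Work"))
```

The obvious cost is that someone has to maintain the phrase list, which is exactly the kind of work a controlled vocabulary already does.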

By versus About

Times writers (e.g. Tom Zeller Jr.) are listed in italics and classified as people. The ‘by’ versus ‘about’ distinction is made primarily in meta tags. “PSST” seems to identify Times writers. For instance, compare the meta tags from Tom Zeller Jr.’s page:

<meta name="PT" content="Topic" />
<meta name="CG" content="Times Topics" />
<meta name="GTN" content="Zeller, Tom Jr." />
<meta name="PST" content="People" />
<meta name="PSST" content="Writer" />

to those on (non-Times) writer Toni Morrison’s page:

<meta name="PT" content="Topic" />
<meta name="CG" content="Times Topics" />
<meta name="GTN" content="Morrison, Toni" />
<meta name="PST" content="People" />
<meta name="SCG" content="The Public Editor" />
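If you wanted to check this distinction across many topic pages, something like the following would do it. This is only a sketch: it assumes the page source uses ordinary straight quotes around attribute values, and the rule “PST is People and PSST is Writer means Times writer” is my guess from the two pages above, not documented behavior. The class and function names are my own.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here.
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.meta[attributes["name"]] = attributes["content"]


def read_meta(html):
    """Return the name/content meta tags found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


def is_times_writer(meta):
    """Heuristic from this post: a person whose PSST tag says Writer."""
    return meta.get("PST") == "People" and meta.get("PSST") == "Writer"


zeller_html = """
<meta name="PT" content="Topic" />
<meta name="CG" content="Times Topics" />
<meta name="GTN" content="Zeller, Tom Jr." />
<meta name="PST" content="People" />
<meta name="PSST" content="Writer" />
"""

morrison_html = """
<meta name="PT" content="Topic" />
<meta name="CG" content="Times Topics" />
<meta name="GTN" content="Morrison, Toni" />
<meta name="PST" content="People" />
<meta name="SCG" content="The Public Editor" />
"""

for html in (zeller_html, morrison_html):
    meta = read_meta(html)
    print(meta["GTN"], "->", is_times_writer(meta))
```

Two pages are a thin basis for a rule, so I’d treat the result as a hint rather than a classification.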

Final thoughts

The world of electronic publishing blurs the lines between producers and indexers. Archival content, served up by organization, person, or topic, is a great offering. The secondary publishing market (abstracting, indexing, etc.) is changing quickly. Source-based browsing, as at NYTimes Topics, is part of that change.

Posted in old newspapers, reviews | Comments (2)

The Girl of the Butterflies

August 23rd, 2008

Every week, I take a look at old newspapers pulled from Chronicling America by Ed Summers’ 100 Years Ago Today feed. Sometimes my gaze is caught by events of the day: telephones, auto accidents, odd notions of gender roles. Just about every week I stare at the front page of the magazine section of The San Francisco Sunday Call*. Often I don’t know how to interpret these covers. Do they really have something to do with the news of the week?

The Girl of the Butterflies, San Francisco Sunday Call, August 23, 1908


This week’s cover is particularly beautiful: The Girl of the Butterflies. So many questions arise from one simple image from August 23, 1908: Were there really so many butterflies in San Francisco 100 years ago? Would a woman really go netting in such a costume? What do butterflies have to do with anything? The wonderful thing about peering into the past is that it opens more questions than answers.

*The San Francisco Call Wikipedia entry is a good start but needs some work.

Posted in old newspapers | Comments (1)