Archive for the ‘GPS and GIS’ Category

GIS and genealogical resource: Atlas of Historical County Boundaries

Thursday, July 15th, 2010

[Image: Years of initial formation of Georgia counties]

My cartography project, as i finished up my GIS certificate, focussed on the changes of county boundaries in Georgia, where i had grown to be aware that just because someone was born in a certain county at one point, the records might now be in another.

I did look, at that time, for shapefiles i could use, geospatial data that would show the “dance of county lines.” I could find digitized old maps, but no consistent set of line data.

Today i discover,

“The Newberry Library is pleased to announce the completion and release of its Digital Atlas of Historical County Boundaries, a dataset that covers every day-to-day change in the size, shape, location, name, organization, and attachment of each U.S. county and state from the creation of the first county in 1634 through 2000.”

Useful details are:

The data are organized by state and are available online in four versions:

* Viewable, interactive maps (electronic analogues to printed maps) on which the historical lines have been plotted against a background of the modern county network

* Downloadable shapefiles for use in geographic information systems (GIS)

* Downloadable KMZ files for use with Google Earth

* Downloadable and printable PDF files (each full-page frame shows a map of a different version of each county, with the historical boundaries displayed against a background of the modern county network)

Supplementing the polygons and maps for each state are chronologies, commentary on historical problems, long and short metadata documents, and a bibliography.

The project began in 1988, with principal funding provided by the National Endowment for the Humanities, an independent federal agency. Additional support came from the Newberry Library, which also served as headquarters, and from other foundations and individuals. The Newberry Library is the copyright holder; all files of the Digital Atlas of Historical County Boundaries are free for use under an Attribution-NonCommercial-ShareAlike 2.5 Creative Commons License.

For genealogists, the website itself will be a rich resource.

There’s an overview of the history of each and every county: consider this example of Early County, Georgia. To create a link to the overview of the particular county within the document for the state, use the Index of Counties and Equivalents. For a statewide understanding of county boundaries, consider the Consolidated Chronology of State and County Boundaries

I would think that every USGenWeb county editor would want to link to this resource for their county to support their existing county mapping resources.

For cartographers, neocartographers, and genealogists, there is good documentation to help users understand the different resources offered by the site.

For a quick look at the data outside of the interactive map on the website, one can use Google Earth:

Since the KMZ file is time-coded, the Google Earth time slider will automatically appear. This time slider can then be used to view the boundaries at a specific date, or to view an animation of the state’s boundary changes over time. The time slider properties can be adjusted to modify the animation speed, or to view a smaller time span in more detail.

With Google Earth, it is possible to compare the historical county boundaries with geographical features such as rivers, lakes, and mountain ridges. The historical boundaries can also be compared to a large variety of layer information available in Google Earth, such as streets, populated places, and modern administrative boundaries.
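The same time coding can be read outside Google Earth. What follows is a minimal sketch, not the Newberry Library's own tooling: it assumes the KMZ is an ordinary zip archive holding one KML document, that each boundary version is a `Placemark` with a `TimeSpan` containing `begin` and `end`, and that the dates are full `YYYY-MM-DD` strings (so plain string comparison orders them correctly). The function name `placemarks_on` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET


def placemarks_on(kmz_file, date):
    """Return the names of placemarks whose TimeSpan covers `date` ('YYYY-MM-DD')."""
    with zipfile.ZipFile(kmz_file) as kmz:
        # A KMZ is a zip archive; take the first .kml entry found inside it.
        kml_name = next(n for n in kmz.namelist() if n.lower().endswith(".kml"))
        root = ET.fromstring(kmz.read(kml_name))

    # Reuse whatever XML namespace the root element declares (assumption:
    # every element in the file shares it).
    ns = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    p = "{" + ns + "}" if ns else ""

    names = []
    for pm in root.iter(p + "Placemark"):
        # A missing begin/end is treated as an open-ended span.
        begin = (pm.findtext(p + "TimeSpan/" + p + "begin") or "").strip()
        end = (pm.findtext(p + "TimeSpan/" + p + "end") or "9999-12-31").strip()
        if begin <= date <= end:
            names.append((pm.findtext(p + "name") or "").strip())
    return names
```

With a downloaded state file (the file name here is hypothetical), `placemarks_on("state_counties.kmz", "1820-01-01")` would list the boundary versions in effect on that day, which is the question a genealogist usually starts with.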

I know i have loaded data that began as KML into my GPS unit in the past: i don’t recall whether it was just lines and points or if it has included boundaries. Given a GPS unit that could load boundaries, i can imagine it being very useful to be able to take the historical county boundaries with one while traveling on research.

I hope the Attribution-NonCommercial-ShareAlike 2.5 Creative Commons License will not put off too many folks. I’m passionate about copyright issues, but I am not a lawyer. I’m a fan of the work Creative Commons does to facilitate the rapid dissemination of ideas in our culture, in a manner that matches the technological speed of delivery. Personally, i like the “ShareAlike” clause. To explain how the clause works, if someone took one of my photos of my cats and added captions, they too must offer their captioned image of my photo under the same license. I’ve often simply used Attribution-ShareAlike, because i am not likely to have commercial interests threatened by any use of my offerings. If someone wanted to use my map of Georgia in their book or their blog that has advertisements, they can as long as they offer the image with attribution.

The Attribution and ShareAlike clauses are instructions to the person copying or creating a derivative work; the “noncommercial” clause in this license creates a test. As a reminder: the presence of a Creative Commons License does not remove fair use rights. It’s interesting, however, to see the spectrum of uses between “non-commercial” and fair use.

To start with non-commercial use, consider USGenWeb. USGenWeb does not have ads, does not charge, and explicitly states that it is keeping information freely available for genealogists. The reproduction and redistribution of the historical county boundary line data by USGenWeb is in the clear. Ancestry.com would not pass the non-commercial test, and fair use would never allow reproduction and redistribution of full data sets.

Is non-commercial use always easy to determine?

Creative Commons noncommercial licenses include a definition of commercial use, which precludes use of rights granted for commercial purposes:

… in any manner that is primarily intended for or directed toward commercial advantage or private monetary compensation.

This may seem pretty clear cut, particularly when considering a website that has advertisements on it. But what if the advertisements merely subsidize the cost of the servers? The Creative Commons has published a study where creators and users were asked to classify certain uses as commercial or noncommercial. [PDF and data available from here.] It’s an interesting read. One point they bring up is how differently creators and consumers judge the question above if the website is for a nonprofit entity.

So what of the work of genealogists for hire? A cartographer producing a map for a book? A scholarly paper? It’s worth considering that many uses of this county boundary data by a genealogist or cartographer may easily fall under Fair Use, not the CC license. From Wikipedia:

Fair use is a doctrine in United States copyright law that allows limited use of copyrighted material without requiring permission from the rights holders, such as for commentary, criticism, news reporting, research, teaching or scholarship. It provides for the legal, non-licensed citation or incorporation of copyrighted material in another author’s work under a four-factor balancing test. [See the article for the four part test.]

Some “Fair Use” uses will enable monetary compensation: fair use can be “commercial” in the sense that a reviewer may excerpt from a work in a review for which the reviewer receives monetary compensation. Using a few county boundaries in a map on a website that is supported by advertising may be considered fair use. An analysis of fair use and GIS data can be found here. Despite the saying that it is easier to ask forgiveness than permission, if you were to have any questions, contact the Scholl Center Staff at the Newberry Library. I rather expect that they don’t bite.

Conference Season

Wednesday, March 31st, 2010

I will not be attending the Copyright@300 conference at Berkeley, celebrating the tricentennial of the Statute of Anne, with some regret, but i hope to follow a little of its content in blogs and on twitter. In looking for folks planning to attend Copyright@300 i discovered this interesting blog-post by Peter Hirtle: Factoids: What is the oldest work protected by copyright in the U.S.? What work will have the longest protection?

I will try to make it to another High Tech Law Institute talk nearby, though: Professor Paul Ohm discusses Data Anonymization, April 7, 2010 (6:00 PM – 8:00 PM), “What the Surprising Failure of Data Anonymization Means for Law and Policy.” They call it their “Re-identification Panel/Paul Ohm Event.”

Computer scientists have recently undermined our faith in the privacy-protecting power of anonymization, the name for techniques for protecting the privacy of individuals in large databases by deleting information like names and social security numbers. These scientists have demonstrated they can often ‘reidentify’ or ‘deanonymize’ individuals hidden in anonymized data with astonishing ease. By understanding this research, we will realize we have made a mistake, labored beneath a fundamental misunderstanding, which has assured us much less privacy than we have assumed. This mistake pervades nearly every information privacy law, regulation, and debate, yet regulators and legal scholars have paid it scant attention. In this talk, Professor Ohm will discuss what policymakers and lawyers must do to respond to the surprising failure of anonymization.

I must admit, the framing that computer scientists have undermined faith in anonymization bugs me. Perhaps it would be more accurate to report that computer scientists have demonstrated such faith is misplaced.

WhereCamp (#wherecamp) is this weekend, and i may pop by on Saturday and Sunday afternoon. The “oversubscribed” notice is a little confusing, but i trust it will work out. (I registered at Upcoming long ago.) See the WhereCamp Blog as well as @WhereCamp.

I made the earlybird signup for Internet Identity Workshop X (#iiw10) with just hours to spare. (The official twitter list is here.) I realize i have a pressing topic in discussing, “Logout in a SSO world: Shib 2.0’s ‘don’t go there’ vs OpenID expectations.”

I won’t be attending JCDL (#JCDL, #JCDL2010) in Australia, but will be attending InCommon’s CAMP (#campmeet) and advanced CAMP (#acampmeet) in June. More on that, in a few months.

HoudahGeo competitor PhotoLinker

Tuesday, January 26th, 2010

Today’s installment from macmap (Macintosh Mapping and GPS Group) had a notice about PhotoLinker, a product that sounds a great deal like HoudahGeo. I was delighted with HoudahGeo when i tried it several years ago and would have bought it this summer if i hadn’t lost all the timestamps on the geotrace from our visit to Mt Lassen.

I have too many projects and am trying, these days, to refrain from buying things until i need them (a half pound of laceweight alpaca yarn over a mile long and the purchase of the cartography software Ortelius aside), so i will wait to compare these two products at a later date.

ACM SIGSPATIAL GIS 2009, from a distance

Saturday, November 7th, 2009

Turning from a conference i (mostly) attended to a conference i had no excuse to attend, this week was the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2009). Twitter didn’t reveal much discussion: a few tweets turned up under “sigspatial” and a few others referred to “ACM GIS”. No need to turn to any visualization software. I tried correlating the tweets with the session schedule and will wait to skim the papers when they arrive. (Last year they were delivered on CD.)

I’ve found one paper already, thanks to an encouraging tweet from @snittel, and it is a fabulous cross of geospatial referencing, data mining, and the social web: Twitterstand. Socially, there’s the attempt to identify persons who tweet about breaking news. Then the data mining component comes in using both a static and dynamic training corpus to distinguish news tweets from everything else. Once a tweet is identified as news, it is then added to a topic cluster (another incredibly challenging process for computers). Once a topic cluster is established they need to locate that news. They do this by extracting and locating place names (toponym recognition via natural language processing parts of speech analysis with named-entity recognition and toponym resolution via Geonames gazetteer) from the 140 character tweets and the user’s location. Then all the locations for a topic cluster are weighted and analyzed to geolocate the news cluster, then mapped.
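The last two steps (find place names, then weigh them to place the cluster) can be illustrated with a toy. This is emphatically not the Twitterstand method: the tiny `GAZETTEER`, the whitespace tokenizer, and the weights are all assumptions made for the sketch, and a real system would use named-entity recognition plus a full gazetteer to tell apart places that share a name.

```python
from collections import Counter

# Hypothetical mini-gazetteer: lowercase place name -> (lat, lon).
GAZETTEER = {
    "oakland": (37.80, -122.27),
    "atlanta": (33.75, -84.39),
}


def geolocate_cluster(tweets, text_weight=2.0, profile_weight=1.0):
    """tweets: iterable of (text, user_location) pairs.

    Returns (place_name, (lat, lon)) for the place with the most weighted
    votes, or None when no known place name is seen.
    """
    votes = Counter()
    for text, user_location in tweets:
        # Place names mentioned in the tweet text count more...
        for word in text.lower().split():
            token = word.strip(".,!?:;#@\"'()")
            if token in GAZETTEER:
                votes[token] += text_weight
        # ...than the location the user typed into their profile.
        profile = (user_location or "").lower().strip()
        if profile in GAZETTEER:
            votes[profile] += profile_weight
    if not votes:
        return None
    place, _ = votes.most_common(1)[0]
    return place, GAZETTEER[place]
```

Weighting the text above the profile is a design guess: someone in one city can easily tweet about news in another.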

After the cut, tweets and the abstract for the keynote. (more…)

Data conversion issues: GPX to shapefiles (episode 5)

Sunday, September 27th, 2009

So, i spent a little time this weekend remembering how to use Illustrator. MacGPS Pro will make a chart of elevations: i want the look and feel to match the other maps and diagrams i imagine (fantasize) making.

In all that exploring, i found:

1) The pdf export’s elevation line is actually many line segments, too small for using the “text on a path” function.

2) The auto trace function, when applied to a graph, traces the cells of the graph and does not make one continuous trace of the plotted elevation line.

3) When one uses the “text on a path” function, the styling previously applied to the path (a stroke, for example) disappears.

This does count as progress!

I then went to MacGPS Pro and cleaned up a track to document Lassen’s Main Park Road in order to get the elevation profile.

The 29 mile Main Park Road was constructed between 1925 and 1931, just 10 years after Lassen Peak erupted. Near Lassen Peak the road reaches 8512 feet, making it the highest road in the Cascade Mountains. It is not unusual for 40 feet of snow to accumulate on the road near Lake Helen. — NPS

That done, i extracted it and a collection of waymarks in kml and gpx format, in order to look at them in uDIG.

No chance. One can convert the KML (used by Google Earth) to GML (Geography Markup Language), presumably easily, albeit i’d prefer to find someone else’s stylesheet. However, kml export does not include the timestamps. That’s fine for this trip, since i lost the timestamps, but i’d prefer to find a conversion for the GPS exchange format, GPX.

It would be lovely if GPSBabel could help, but it handles the same formats i can export from MacGPS Pro and none that uDig can consume.
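Failing a ready-made converter, GPX is plain XML and the track points can be pulled out with a few lines of script. This is a sketch under assumptions: it expects the GPX 1.1 namespace (a GPX 1.0 export would need the other namespace URI), the function names are invented, and the WKT output is only a neutral intermediate; whether a given GIS package will ingest it has not been tested here.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the GPX 1.1 namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_track_points(gpx_text):
    """Return a list of (lat, lon, elevation_or_None, time_or_None) tuples."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.iterfind(".//gpx:trkpt", GPX_NS):
        ele = pt.findtext("gpx:ele", namespaces=GPX_NS)
        time = pt.findtext("gpx:time", namespaces=GPX_NS)  # kept as a string
        points.append((
            float(pt.get("lat")),
            float(pt.get("lon")),
            float(ele) if ele is not None else None,
            time,
        ))
    return points


def to_wkt_linestring(points):
    """Build a WKT LINESTRING; WKT wants x y order, that is, lon then lat."""
    coords = ", ".join(f"{lon} {lat}" for lat, lon, _, _ in points)
    return f"LINESTRING ({coords})"
```

The timestamps survive in the tuples even though the WKT geometry itself cannot carry them, so they could be written out alongside as attributes.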

One possible solution was based on finding a question in a Quantum GIS (QGIS) forum: i have an old version of QGIS (“Io”) still on the laptop. The GPX files did not display in QGIS, but Google Earth loads the files just fine. Christine and i suspected that some XML element for standard metadata may be missing and Google Earth just forgives and assumes a standard (like WGS84). Skimming the schema, it doesn’t seem like that’s the problem: MacGPS Pro’s export seems to meet the spec. I would submit a QGIS bug report, but i’m using an antique version of the software.

Other solutions i’ve found tonight seem to be Windows or Linux based. If i truly hacked the open stack, i could possibly hack a solution.

A final side note: Google Earth is one of the applications that stores the user name as part of the path to the configuration files.

Dead end explorations after the cut: (more…)

Santa Clara County Releases Its Geodata

Friday, September 18th, 2009

I’ve written about this case before, where Santa Clara County was charging far more than the cost of reproduction for the parcel geodata and was being sued for violating California’s Public Records Act (PRA).

The data has finally been provided to the California First Amendment Coalition (CFAC) at $3.10 per disk.

There’s a press release/article after the cut.

HOORAH!!! (more…)

Playing with the GPS: Exploring Mt Lassen (episode 4)

Friday, September 18th, 2009

We’re back from Mt Lassen. Photos will come when i can face the fact that i’ve lost pretrip photos. The hard drive i moved them to failed *during* the scheduled back up, which rubs salt in the wound. I also have some data-loss pain (more photos) due to the oh-so-rugged, so new i hadn’t backed it up, Hitachi simpleTOUGH drive failure last month. I am trying to practice detachment, but it’s coming off a bit more like sour grapes. “I never would have had time to really go through those photos anyhow.”

Data loss continues to haunt me (so it’s a good thing i’m not bringing this mojo to work). I knew there was a reason i rarely save off trip logs on the device interface of my Garmin GPSmap 60. I rediscovered the reason as i started looking at the GPS data from the trip: the timestamps all go away. I’ve found notes about this behavior here. I’m very disappointed because i was planning on using HoudahGeo to correlate my photos with our track log.
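The reason the timestamps matter: matching a photo to a track comes down to looking up the camera time in the track log and interpolating between the two nearest fixes. The sketch below shows that general idea only; it makes no claim about how HoudahGeo does it, the function name is invented, and it assumes the camera clock and the GPS clock agree.

```python
from bisect import bisect_left
from datetime import datetime


def position_at(track, when):
    """track: list of (datetime, lat, lon) sorted by time.

    Returns the (lat, lon) linearly interpolated for `when`, or None if
    `when` falls outside the recorded track.
    """
    times = [t for t, _, _ in track]
    i = bisect_left(times, when)
    if i < len(times) and times[i] == when:
        return track[i][1], track[i][2]  # exact fix, no interpolation needed
    if i == 0 or i == len(times):
        return None  # photo taken before the first or after the last fix
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    f = (when - t0) / (t1 - t0)  # fraction of the way from fix 0 to fix 1
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)
```

Without timestamps in the saved log there is nothing to look up, which is why the saved tracks are useless for this purpose.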

I’ve done a number of things since being back, and thought i’d checkpoint, since i’ve a new laptop to play with now:

  • i had to strip the altitude values from the GPS data so that i could use the more precise MacElevation California DEMs for altitude values, but now i can do an elevation profile of the trip
  • I scanned in the USGS map of Bumpass Hell and made a first pass at georectifying it: it needs more work.
  • I’ve picked out fonts and created a color palette to use on the map.
  • I’ve been using the trial version of Ortelius from Map Diva. So far it looks very useful, and there is a special offer of $79 through the end of Sept 2009.

    Import .jpg, .bmp, .gif, .png, .pdf, and .tif image graphics, and ESRI map shapefile (limited in Standard edition); export .tif, .jpg, .png, and layered .pdf

    Loaded with a broad selection of royalty-free world, continent, and country map templates to customize and make your own…. Choose from templates, use your own existing (GIS) map data, or make custom-scaled map graphics – you have the flexibility to decide.
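On the elevation profile mentioned in the first item above: once each track point has an altitude, the profile is just cumulative distance along the track plotted against elevation. A minimal sketch, assuming points arrive as `(lat, lon, elevation_m)` tuples and using the haversine great-circle formula on a spherical Earth (an approximation that is fine at park scale); the function names are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def elevation_profile(points):
    """points: list of (lat, lon, elevation_m).

    Returns [(distance_along_track_m, elevation_m), ...] ready for plotting.
    """
    profile = []
    total = 0.0
    for i, (lat, lon, ele) in enumerate(points):
        if i:
            prev_lat, prev_lon, _ = points[i - 1]
            total += haversine_m(prev_lat, prev_lon, lat, lon)
        profile.append((total, ele))
    return profile
```

The resulting pairs can be handed to any plotting or drawing tool, which sidesteps having to restyle a chart exported as many tiny line segments.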

(more…)

Mt Lassen Hot Rock mystery (episode 3)

Thursday, September 17th, 2009

While waiting for back-ups to complete then Illustrator to install, i researched the “hot rock” mystery and uploaded a scanned in section of the USGS document: “Maps showing thermal features and topography of Devils Kitchen and Bumpass Hell, Lassen Volcanic National Park, California” to Map Warper.

What exactly do i mean by the “hot rock” mystery?

Lassen Volcanic National Park has two locations that explicitly focus on the effects of the eruption in 1915. One spot is the Devastated Area interpretive trail, another is the “Hot Rock” pull off. In both places, visitors may read about the photographs BF Loomis took which documented the effects of the eruption. References are made to the black dacite rocks in contrast to the pink and grey dacite: the black dacite is the recent rock formed by the lava that was cooling in 1915. The pink and grey dacite make much of the peak of Lassen and were washed down in the mudflow along with the new black dacite.

Loomis photographed and noted the “hot rock” and there is the “hot rock” at the pull over and the “hot rock” at the interpretive trail. Christine and i became a little confused as we tried to figure out how the “hot rock” could be in two places at once. In fact, and with no shock, there are two different boulders in the “hot rock” photographs.

The smaller hot rock is documented multiple times by Loomis, as in the pair of images reproduced by the USGS: one from May 22, 1915 and the same scene taken by Loomis later that summer. Another USGS page offers the single “hot rock” photo with a 1984 comparison image. (I found a note here which asserts the hot rock “later disintegrated.”)

This “hot rock” is in the Devastated Area.

[Image: Hot Rock pull-over in Lassen Volcanic National Park]

Both the pull-over and the Devastated Area have a Loomis image of a group standing before a stone, annotated with “hot rock” and the dimensions of the stone. This postcard is a reproduction of the group portrait, too small to see the expressions on the faces which Christine studied in the large reproduction in the park. Note that Lassen has no steam from the top.

References after the cut. (more…)

Back to GIS: Exploring Mt Lassen via UDIG, episode 2

Friday, September 4th, 2009

I had a nice interlude changing styles on a bunch of data layers and updating a JIRA bug. The data layers were from the National Park Service’s data store: http://science.nature.nps.gov/nrdata/datastore.cfm?ID=35620. Then i went to add some other layers.

Argh, projection woe: “Generic Cartesian 2D.” Indeed, this lovely data set has a warning in the metadata:

These data are small-scale (generally 1:100,000 or smaller). They are intended to be used as a set. The data are not intended to be used with other data sets, particularly larger-scale data, as they may not align spatially.

Christine notes she could georeference some of the layers but, no.

I can use this for identifying quad names and getting a general sense of place, but i’m doing this project for fun. I can refrain from a mashup with other data.

Unfortunately, in a moment when i thought i was going to reproject the layers, i deleted my map with all my styles.

Updating with layer information after the cut. (more…)

Back to GIS: Exploring Mt Lassen via uDIG, episode 1

Monday, August 31st, 2009

I’ve old mapping projects still sitting around but the possibilities that could be met on a road trip to Lassen Volcanic National Park struck me as “urgent.”

I’m going to try uDIG instead of QGIS, because i’m lazy. The QGIS Mac OSX version requires separate installation of dependency frameworks. It seems to have been streamlined, but i’ll start with the easier system.

On my Mac, uDIG Version: 1.2-M6 took a little time starting up. An alert box notified me that files had been written to my home directory and prompted me to restart. There was also an error about a file name with a bad character. The “About uDIG” panel has a link to “Installation details” and that has a link to the log: i was able to review the error more clearly there. I note that this is not a “clean” install, but on top of a 2007 install. Files seemed to go in

./Documents/uDigWorkspace/
./Library/Preferences/net.refractions.udig.plist

I followed through most of the first walkthrough, deleting the “map decorations” from the catalog, by accident. (I wanted to delete them from the map, and should have done so in the layers window.)

Meanwhile, i downloaded USGS data on Lassen and unzipped it to find … no shapefiles, just .ADF. I appear to have run across .ADF before, when struggling with .E00 data. Meanwhile, the Sierra Nevada Ecosystem Project is in the .E00 format.

Suddenly, the fact it’s 9 pm and i’ve not had dinner takes on significance.

(more…)