SemTechBiz 2012


I attended SemTechBiz 2012 in San Francisco last week. This annual conference on semantic technology, now in its eighth year, does a nice job of balancing the interests of the research and commercial communities. This year the conference tilted towards commercial vendor interests (after all, the vendors do sponsor the event), although the product pitches were confined to a clearly identified solutions track. Here are my semantic annotations about this semantic technology conference.

Given our focus on open source platforms, I enjoyed the session on wikis and semantics. In this session, Joel Natividad of Ontodia gave an overview of NYFacets - a crowd-knowing solution built with Semantic MediaWiki. Ontodia's site won NYC BigApps - a contest started by Bloomberg as part of his grand plan to make NYC the capital of the digital world. NYFacets has a semantic data dictionary with 3.5M facts. Ontodia's vision is to socialize conversations about data, and eventually build NYCpedia. I wondered why public libraries don't take this idea and run with it: Bostonpedia by Boston Public Library, Concordpedia by Concord Public Library, and so on.

Stephen Larson gave an overview of NeuroLex - a periodic table of elements for neuroscience built with SMW under the NIF program. They built a master table of neurons and exposed it as a SPARQL endpoint, with rows consisting of 270 neuron classes and columns consisting of 30 properties. NeuroLex demonstrates the value of a shared ontology for neuroscience by representing knowledge in a machine-understandable form.

In the session Wikipedia’s Next Big Thing, Denny Vrandecic of Wikimedia Deutschland gave an overview of the Wikidata project, which addresses the manual maintenance deficiencies of Wikipedia by bringing a number of Semantic MediaWiki features into its fold. For instance, every infobox in Wikipedia will become a semantic form stored in a central repository, eliminating the need to maintain the same content duplicated across many Wikipedia pages. Semantic search will also come to Wikipedia, to the applause of the folks who maintain Wikipedia's lists of lists and lists of lists of lists: each of these huge, manually maintained lists can be replaced with a single semantic query. One of the novelties of Wikidata is that it will be a secondary database of referenced sources for every fact. For instance, if one source says the population is 4.5M while another says 4,449,000, each source will be listed in the database, thus enabling belief-based inference.
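To make the idea of keeping every sourced value concrete, here is a minimal sketch in Python. It is not Wikidata's actual data model or API: the `FactStore` class, the item name `ExampleCity`, the source names and the trust scores are all hypothetical, chosen only to mirror the population example above.

```python
from collections import defaultdict


class FactStore:
    """Toy store that keeps every (value, source) pair instead of one 'true' value."""

    def __init__(self):
        self._claims = defaultdict(list)

    def add_claim(self, item, prop, value, source):
        # Conflicting values are kept side by side, each tagged with its source.
        self._claims[(item, prop)].append((value, source))

    def claims(self, item, prop):
        return list(self._claims.get((item, prop), []))

    def most_trusted(self, item, prop, trust):
        """Pick the value whose source has the highest (hypothetical) trust score."""
        candidates = self.claims(item, prop)
        if not candidates:
            return None
        value, _source = max(candidates, key=lambda c: trust.get(c[1], 0.0))
        return value


store = FactStore()
store.add_claim("ExampleCity", "population", 4_500_000, "source A")
store.add_claim("ExampleCity", "population", 4_449_000, "source B")

all_claims = store.claims("ExampleCity", "population")
preferred = store.most_trusted(
    "ExampleCity", "population", {"source A": 0.6, "source B": 0.9}
)
```

The point of the design is that the store never decides which number is right. The consumer applies its own belief about the sources, here a simple trust score, and a different consumer could pick differently from the same data.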

It was nice to see several evangelists of linked data from the government sector at the conference. Dennis Wisnosky and Jonathan Underly of the U.S. Department of Defense gave a nice overview of the Enterprise Information Web (EIW). It was refreshing to hear that DoD is looking at linked data as a cost reduction driver. Given the Cloud First mandate of the Defense Authorization Act 2012, the importance of semantic technology in the government will grow. In another session, Steve Harris of Garlik, now part of Experian, gave an overview of Garlik DataPatrol - a semantic store of fraudulent activities for finance. I could not help wondering if someone from the Department of Homeland Security was in attendance to hear the details of this application. Steve found no need for complex ontologies, reasoning, and NLP in this large-scale application, which records about 400M instances of personal information (e.g. a Social Security Number mentioned in an IRC channel) every day.

Matthew Perry and Xavier Lopez of Oracle gave an overview of the OGC GeoSPARQL standard, which aims to support representing and querying geospatial data on the Semantic Web. GeoSPARQL defines a vocabulary (union, intersection, buffer, polygon, line, point, and so on) for representing geospatial data in RDF, and it defines an extension to the SPARQL query language for processing geospatial data using distance, buffer, convex hull, intersection, union, envelope, and boundary functions.

Because linked data is essentially about the plumbing of semantic infrastructure, it is hard to give engaging presentations on the topic. Two presentations bucked this trend. The presentation by Mateja Verlic from the Slovenian startup Zemanta rocked. Zemanta developed a DBpedia extension - LODGrefine for Google Refine under the LOD2 program. Google Refine supports large transformations of open source data sources, and LODGrefine exposes Refine results as a SPARQL endpoint. Mateja managed to give two impressive live demos in ten minutes. The other rock star presentation was by Bart van Leeuwen - a professional firefighter, on Real-time Emergency Response Using Semantic Web Technology. Everyone in attendance got the gist of how FIREbrary - a linked data library for fire response - can help firefighters in the real world, thanks to a presentation sprinkled with live videos of fire emergency responses. It was instructive to see how semantic technology can make a difference in managing extreme events such as a chemical fire, since by definition there are no plans for these types of events.

Bringing good user interface design practices to linked data enabled applications was another theme of the conference. Christian Doegl of Uma gave a demo of Semantic Skin, which is a whole-wall interactive visualization driven by semantics. Siemens used it to build an identity map of their company. It uses the Intel Audience Impression Metrics Suite to detect the age, gender, etc. of the person walking in front of the wall, so that content can be personalized using semantics. Pretty cool stuff.

Military Logistics Summit

We attended IDGA’s Military Logistics Summit held on June 8-10, 2009 in Vienna, VA. The focus of this year's summit was supporting major deployment, re-deployment, and distribution operations. Milcord's presentation for the Military Logistics University, entitled Risk-Based Route Planning for Sense and Respond Logistics, covered the technology behind our Adaptive Risk-based Convoy Route Planning solution. Our presentation had a diverse audience ranging from logistics contractors in Pakistan to logisticians at large system integrators, from high-level US Army officers to academic researchers. A logistics contractor posed the question: "I love your risk based route planning system. I wish we had a system like this. Most logistics material are carried by private subcontractors like us (under contract to a Prime like Maersk) in Pakistan and Afghanistan. Even if the Army has this system, it won't do us any good." It was an interesting question that shone a light on the lack of information sharing between DoD and second- and third-tier military contractors in the supply chain, and it generated a nice discussion among attendees.

Another interesting question on our presentation concerned the predictability of a route. Minimal-distance routes are deterministic and pose a security risk because the adversary can easily work them out. In contrast, a minimal-risk route is not deterministic (it changes with events in the field), which gives better protection against predictability by the adversary. The risk surface (computed per road segment) changes with every incident, intel report, weather change, traffic condition, etc., which in turn affects the minimal-risk route.
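The contrast between the two kinds of route can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Milcord's actual algorithm: it uses plain Dijkstra search, a made-up three-segment road network, and hypothetical `km` and `risk` values per road segment.

```python
import heapq


def best_route(graph, start, goal, cost_key):
    """Dijkstra's algorithm, minimizing the sum of the edge attribute `cost_key`."""
    frontier = [(0.0, start, [start])]
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, attrs in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(
                    frontier, (cost + attrs[cost_key], neighbor, path + [neighbor])
                )
    return None  # goal not reachable


# Hypothetical road network: each segment has a length and a current risk score.
roads = {
    "base": {"bridge": {"km": 4, "risk": 0.1}, "ridge": {"km": 7, "risk": 0.2}},
    "bridge": {"depot": {"km": 3, "risk": 0.1}},
    "ridge": {"depot": {"km": 6, "risk": 0.2}},
}

shortest = best_route(roads, "base", "depot", "km")
safest_before = best_route(roads, "base", "depot", "risk")

# A new incident report raises the risk on the base-to-bridge segment.
roads["base"]["bridge"]["risk"] = 0.9
safest_after = best_route(roads, "base", "depot", "risk")
shortest_after = best_route(roads, "base", "depot", "km")
```

In this toy network the shortest route goes over the bridge both before and after the incident, because road lengths do not change. The minimal-risk route starts on the bridge as well, but switches to the ridge once the incident updates the risk surface, which is the behavior that makes it harder for an adversary to predict.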

Another question: "If a bridge is blown down the road, how long does it take the Urban Resolve data set to update itself?" This is an issue that even commercial off-the-shelf (COTS) GPS tools struggle with when random events occur, such as road closings due to construction. Our current solution gives a manual workaround for such conditions by letting the user define an intermediate waypoint and drag the route away from the bridge. Crowd-sourcing can also help address this issue by giving users the power to dynamically update road availability by adding road blocks on their GPS units. Crowd-sourcing also raises data integrity issues, however: user-specified changes could not simply be written into the database, since every soldier would have a different viewpoint.

There were several other interesting presentations and exhibitions. Dr. Irene Petrick's talk on Digital Natives and 4th Generation Warfare generated an active interaction with the audience. She presented survey results that compare the value systems of Traditionals, Baby Boomers, Gen X and Gen Y, and articulated where Digital Natives can add value to warfighting and where they pose challenges for organizational management. On the gadget front, Safe Ports demoed an infrared-based eye scanner that recognizes you even through your sunglasses.