Fragmentation

This is my last piece directly about the 8th International Semantic Web Conference, though I’ll continue to be inspired by the things I picked up there. My other two pieces were about ontologies and data visualization.

I was sorry to leave ISWC today – and not just because of my fondness for suburban Virginia. As a humanities-oriented undergrad in a crowd of expert scientists and researchers, I learned an astonishing amount in the past few days – certainly enough to spark my interest in continuing to learn and discuss more, if only I could find a counterpart to the community I just met somewhere online.

The problem is that the great discussions I witnessed or joined at ISWC are – online – fragmented into such specialized subgroups that they have no place for a beginner like me. What’s more, the subjects that are addressed are so narrow that they turn into conversations restricted to a few participants and many uninvolved witnesses (when they take place on listservs, many unread emails). Experts talking to experts – great for solving specific technical problems, horrible for sharing the general knowledge and thoughts that spark real insight.

What I thought was greatest about ISWC was that nearly everyone was an expert at something different – which meant that any discussion with a largish audience or any happenstance encounter between two specialists (say, a natural language researcher and an RDF programmer) had to avoid the technical jargon of either specialty and instead frame everything with the best precision afforded by regular English (believe me, this is much more difficult than dipping into a pre-made vocabulary). Not only is that a good exercise for anyone who wants to understand her own subject area more clearly, it’s also the best way to discover parallels between disparate fields. Whereas a psychologist might have nothing to say about “distributed computing,” when talking about “parallel processing,” he may actually turn out to be quite the expert. It’s almost a test of ontology-matching – suddenly finding out that the concept I represent as X in my specialty’s categorization structure is practically the same as what you happened to call Y in yours. What’s the significance of that? As David Karger and Jim Hendler stressed at Wednesday’s mentoring lunch, that discovery is usually one of the best conditions for creative insight. Science is well accustomed to seeing a person with a problem figure out how to solve it by happening upon a person with the solution in another field. All they needed was to run into each other.

But where is that uncontrolled, randomly-matching social space online? Online, communications function more as structured discussions than freewheeling conversations. In shedding their haphazardness, they lose a lot of creativity.

I should probably point out that this isn’t particularly a problem of the Semantic Web; fragmentation and increasingly self-selected groups are a byproduct of the web in general. It may become intensified by the Semantic Web – which decreases the randomness and facilitates intentional self-selection on the web – but it’s a problem for everyone. Only maybe a little more so for scientists, who have always tended to break into autonomous, self-referential subgroups of experts rather easily.

Specialized, self-selected communities reinforce their own ideas and biases and make insightful leaps more difficult, to the detriment of all fields. The web facilitates self-selection and thus fragmentation. What is to be done?


Ontology Alignment (is not the SameAs but is CloselyRelatedTo) Reconciling Worldviews

For the next three days, I’ll be reporting from the 8th International Semantic Web Conference (ISWC), taking place near Washington DC. A lot of what’s going on here is very technical, so rather than repeat everything I’m hearing, I’m going to talk about the broader themes that I see emerging. After this conference, I may try to tie them together into one comprehensive post.

This is my first theme. It’s about ontology alignment but is nevertheless very interesting. Yes, actually, it really is.

An ontology is basically a taxonomy of concepts and categories and the relationships between them – it’s sort of like a network but includes heritability (if I specify properties about some group, like “dogs can bark,” then it carries down to things within that group, so we know that Shih Tzus can bark). Ontologies are pretty key to the Semantic Web because expressing relationships between concepts is essentially defining those concepts – I could turn philosopher and argue that the meaning of something can only be found in the way it relates to other things. Or I could not, and just argue that defining things in terms of their relationships is a really useful way to do it, especially if the point is to make machines understand those things and be able to reason about them. That’s why a large percentage of the people here are obsessed with building ontologies about certain things (like jet engines).
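To make that heritability idea concrete, here’s a toy sketch in Python. (Real ontologies are written in languages like RDF Schema and OWL, not Python, and the names here are made up – but class inheritance captures the same “carries down” behavior.)

```python
# Toy sketch of ontological heritability: a property asserted about a
# category carries down to its subcategories and their members.

class Animal:
    pass

class Dog(Animal):
    # Asserting "dogs can bark" once, at the category level...
    def bark(self):
        return "woof"

class ShihTzu(Dog):
    # ...means Shih Tzus can bark without anyone restating it.
    pass

fluffy = ShihTzu()
print(fluffy.bark())               # "woof" – inherited from Dog
print(isinstance(fluffy, Animal))  # True – membership carries up the hierarchy
```

The payoff is the same one the Semantic Web is after: a machine that has never been told anything about Shih Tzus specifically can still reason that this one barks.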

But ontologies are personal. What if I think of “Shih Tzu” as a sub-category of “pets” but you think it belongs under “dinner proteins?” Or how about if a liberal defines a homosexual relationship as a type of family and a conservative thinks it belongs under sexual perversion? There’s no way the world would ever be able to agree on one definitive ontology. Nor should it. The way we categorize things, the way we cut up and connect up everything in the world is key to who we are, how we think, and what we do. I – an atheist and cognitive psychology nerd – would go so far as to say that the human soul exists in our subjective, idiosyncratic ways of linking up information. So to impose a single ontology on the whole world – no matter how well thought out and exhaustive it is – would be tantamount to mind control or soul stealing.

To their credit, most semantic technologists I’ve talked to think this way also. That’s why they’re encouraging ontologies to be fruitful and multiply and represent as many worldviews as there are ontology-builders (though ideally there would be more than 15 – I’m joking, I’m sure there are over 22 people who can build ontologies). But having a bunch of rival ontologies out there that define and categorize things in unique ways doesn’t sound like much of an organized system of data, right? That’s true, and that’s why a lot of other people are involved in aligning ontologies – matching up the instances of some concept that shows up in different ontologies.
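In its crudest form, alignment is just guessing which concepts in two vocabularies are talking about the same thing. Here’s a deliberately naive sketch – two hypothetical ontologies reduced to label dictionaries, matched by string similarity. (Real aligners also use structure, instances, and background knowledge; this is only the flavor of the problem.)

```python
# Naive ontology alignment: propose matches between concepts in two
# (hypothetical) ontologies purely by label similarity.
from difflib import SequenceMatcher

ontology_a = {"ShihTzu": "Shih Tzu", "Canine": "Dog"}
ontology_b = {"shih-tzu": "shih tzu", "DinnerProtein": "dinner protein"}

def similarity(x, y):
    # Ratio of matching characters, case-insensitive (0.0 to 1.0).
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

def align(a, b, threshold=0.8):
    # Compare every label pair; keep those above the threshold.
    matches = []
    for key_a, label_a in a.items():
        for key_b, label_b in b.items():
            score = similarity(label_a, label_b)
            if score >= threshold:
                matches.append((key_a, key_b, round(score, 2)))
    return matches

print(align(ontology_a, ontology_b))  # [('ShihTzu', 'shih-tzu', 1.0)]
```

Note what this happily glosses over: it pairs my “ShihTzu” with yours even though mine lives under “pets” and yours under “dinner proteins” – which is exactly the trouble the next section gets into.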

But…they’re still not doing it that well. That’s something Pat Hayes brought up during his keynote this morning. His topic was “blogic” – the new form of formal logic that’s required for the web. One of his problems with using traditional logic for the web is that people are mapping instances between different ontologies using the relationship “SameAs” – even though the fact that they come from different ontologies means they’re clearly not the same as each other. People are usually aware of that, but there’s not much they can do about it, because traditional logic gives them no “SortOfSameAs” or “SameAsInThisOneParticularWay” relationship to use instead.
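Why is over-claiming “SameAs” so bad? Because identity is contagious: once two resources are declared the same, every property of one becomes a property of the other. A toy illustration (the concept names and property dictionaries are invented for this sketch, not from any real ontology):

```python
# Toy illustration of why a full-strength "SameAs" is too strong:
# declaring two resources identical forces their properties to merge.

facts = {
    "onto1:ShihTzu": {"label": "Shih Tzu", "category": "pet"},
    "onto2:ShihTzu": {"label": "shih tzu", "category": "dinner protein"},
}

def merge_same_as(facts, a, b):
    # Treat a and b as one thing: union their properties,
    # with b's values clobbering a's on any clash.
    merged = {**facts[a], **facts[b]}
    facts[a] = facts[b] = merged

merge_same_as(facts, "onto1:ShihTzu", "onto2:ShihTzu")
# onto1's pet has now been declared a dinner protein.
print(facts["onto1:ShihTzu"]["category"])  # "dinner protein"
```

Weaker linking properties do exist outside traditional logic – SKOS, for instance, offers mapping relations like skos:closeMatch that stop short of full identity – but the underlying complaint stands: most published mappings still reach for the strongest relation available.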

Ontology alignment is still a Big Problem, and it’s acknowledged as such by much of the Semantic Web community. If anyone knows of good solutions in the works, I’d love to hear about them – leave a comment and I’ll add them to this post.