In recent weeks, my RSS reader has delivered a bunch of great articles and announcements related to the Semantic Web, structure and meaning, and “Web 3.0”. There seems to be convergence on the definition of the Semantic Web (web content infused with meta-data) and on promising approaches for inferring meaning from unstructured data (domain-specific ontologies, rules, etc.). The term “Web 3.0” is not so lucky – reactions range from disdain to dismissal. A few folks were brave enough to proffer definitions and were met with skepticism and rejection on the grounds that the definitions were self-serving. Tough crowd.
At MashLogic we intend to use a variety of techniques to extract concepts and meaning from web content. Such content is semi-structured (pockets of unstructured text blocks within a structure of headings, paragraphs, links, and the like). Our approach is opportunistic in the sense that we will take formal meta-data if we find it, use feeds and APIs when they are available, or apply simple ontologies and rules and see where that takes us.
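To make "opportunistic" a bit more concrete, here is a minimal sketch of that fallback order in Python. It is an illustration only, not MashLogic's implementation: the page is assumed to be a plain dict, and the keys (`meta_keywords`, `feed_categories`, `text`), the function names, and the three-entry `TOY_ONTOLOGY` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical page representation: a plain dict. None of these keys or
# names come from MashLogic; they exist only for this illustration.
Page = Dict[str, object]
Extractor = Callable[[Page], Optional[List[str]]]


def from_metadata(page: Page) -> Optional[List[str]]:
    """Take formal meta-data if the page carries any."""
    keywords = page.get("meta_keywords")
    return list(keywords) if keywords else None


def from_feed(page: Page) -> Optional[List[str]]:
    """Use categories from a feed or API entry when one is available."""
    categories = page.get("feed_categories")
    return list(categories) if categories else None


# Toy stand-in for "simple ontologies and rules": surface term -> concept.
TOY_ONTOLOGY = {
    "rdf": "Semantic Web",
    "microformats": "Semantic Web",
    "seo": "Search",
}


def from_rules(page: Page) -> Optional[List[str]]:
    """Fall back to matching words in the text against the toy ontology."""
    concepts: List[str] = []
    for word in str(page.get("text", "")).lower().split():
        concept = TOY_ONTOLOGY.get(word.strip(".,;:!?"))
        if concept and concept not in concepts:
            concepts.append(concept)
    return concepts or None


# Ordered from most to least formal source of meaning.
EXTRACTORS: List[Tuple[str, Extractor]] = [
    ("metadata", from_metadata),
    ("feed", from_feed),
    ("rules", from_rules),
]


def extract_concepts(page: Page) -> Tuple[str, List[str]]:
    """Return the first non-empty result and the name of its source."""
    for name, extractor in EXTRACTORS:
        concepts = extractor(page)
        if concepts:
            return name, concepts
    return "none", []
```

The ordering is the whole point of the sketch: the more formal the source, the earlier it is tried, and the rules only run when nothing better is on offer.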
Getting to the topic of this post… regardless of whether applications match keywords, or apply sophisticated semantic analysis techniques to tease out meaning from content, the overarching goal is generally one of Matching Intent between Consumers and Providers. Content providers use page structure, meta tags, microformats, and markup to give visibility and access to their inventory of information, services, and products. These days most content providers are catering to two constituencies – search engines (and their sinister proxy, SEO) and consumers. On the other side, consumers explicitly express their Intent with keywords and preferences. Providers often combine this with implicit data from clickstreams, demographics, etc.
Information Retrieval implies a paradigm where the Consumer judiciously charts his course through an ocean of information and is ultimately responsible for ensuring that the retrieved information meets his needs. I’d like to think that we are approaching a time of Intent Reconciliation where Providers and Consumers co-operate in helping each other consummate their Intent. They do this by reducing the Ambiguity Gap between Intent and Information with Semantics. From that perspective, Semantics codifies Intent.
