<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Bringing the New York Times&#8217; Cornucopia to All</title>
	<atom:link href="http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/feed/" rel="self" type="application/rss+xml" />
	<link>http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/</link>
	<description></description>
	<pubDate>Mon, 08 Sep 2008 18:42:27 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
		<item>
		<title>By: HorsesAss.Org&#187; Blog Archive &#187; Tuesday Morning Blews</title>
		<link>http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151990</link>
		<dc:creator>HorsesAss.Org&#187; Blog Archive &#187; Tuesday Morning Blews</dc:creator>
		<pubDate>Tue, 23 Oct 2007 14:29:21 +0000</pubDate>
		<guid isPermaLink="false">http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151990</guid>
		<description>[...] those things that lead to places we simply cannot foresee (including a dead end). But Dan Gillmor has it right when he lauds The Times for opening up its data stream to outside resources &#8212; in the cause of [...]</description>
		<content:encoded><![CDATA[<p>[...] those things that lead to places we simply cannot foresee (including a dead end). But Dan Gillmor has it right when he lauds The Times for opening up its data stream to outside resources &#8212; in the cause of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: The New York Times river flows, but whereto? &#171; Alexander van Elsas&#8217;s Weblog on new media &#38; technologies and their effect on social behavior</title>
		<link>http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151989</link>
		<dc:creator>The New York Times river flows, but whereto? &#171; Alexander van Elsas&#8217;s Weblog on new media &#38; technologies and their effect on social behavior</dc:creator>
		<pubDate>Tue, 23 Oct 2007 13:46:03 +0000</pubDate>
		<guid isPermaLink="false">http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151989</guid>
		<description>[...] it on his blog post announcement. I do see a lot of comments from people in the field (techies, here, here and here for example) that really like the work and the possibilities of the [...]</description>
		<content:encoded><![CDATA[<p>[...] it on his blog post announcement. I do see a lot of comments from people in the field (techies, here, here and here for example) that really like the work and the possibilities of the [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scripting News for 10/22/07 &#171; Scripting News Annex</title>
		<link>http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151985</link>
		<dc:creator>Scripting News for 10/22/07 &#171; Scripting News Annex</dc:creator>
		<pubDate>Tue, 23 Oct 2007 03:07:04 +0000</pubDate>
		<guid isPermaLink="false">http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151985</guid>
		<description>[...] Dan Gillmor: &#8220;Dave Winer has been exploring a superb news resource, exploring the depth and breadth of the New York Times&#8217; data-stream.&#8221; [...]</description>
		<content:encoded><![CDATA[<p>[...] Dan Gillmor: &#8220;Dave Winer has been exploring a superb news resource, exploring the depth and breadth of the New York Times&#8217; data-stream.&#8221; [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Garfunkel</title>
		<link>http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151984</link>
		<dc:creator>Jon Garfunkel</dc:creator>
		<pubDate>Mon, 22 Oct 2007 21:26:56 +0000</pubDate>
		<guid isPermaLink="false">http://citmedia.org/blog/2007/10/22/bringing-the-new-york-times-cornucopia-to-all/#comment-151984</guid>
		<description>Irony of ironies.

When RSS was being conceived in the summer of 2000, there were two basic camps. Rael Dornfest saw in it a way to make a true RDF query engine. Imagine: anybody could then build their own query engine to universally query blogs + news + anything RDF (by *anybody*, I don't mean *everybody* -- it's just under the notion that if anybody can build their own JavaScript libraries, you'll have a fierce Darwinian competition).

But Dave wanted to make it simple. ("I find the activity towards 'modularization' to be dry and uninteresting.") So, things like keywords (and a bunch of useful metadata out of NewsML) were left out, as an exercise for the developer. Hence many different Atom extensions these days, none of which are wholly standard.

And, here's the irony: the Times is not supplying the metadata in their RSS 2.0 feeds -- they are putting it in the HTML, leaving Dave to scrape it from there! Ha ha!

And, of course, Google and Topix already use this clustering technique -- across thousands of sources. They scrape, because news organizations do not make metadata-rich NewsML feeds available...

Not that pulling META tags out of an HTML document is hard. It's just curious that, if you wind the clock back 7 years, and tell a bunch of software engineers that the coolest news clustering applications today would still rely on HTML scraping, I don't know whether they'd laugh or cry.</description>
		<content:encoded><![CDATA[<p>Irony of ironies.</p>
<p>When RSS was being conceived in the summer of 2000, there were two basic camps. Rael Dornfest saw in it a way to make a true RDF query engine. Imagine: anybody could then build their own query engine to universally query blogs + news + anything RDF (by *anybody*, I don&#8217;t mean *everybody* &#8212; it&#8217;s just under the notion that if anybody can build their own JavaScript libraries, you&#8217;ll have a fierce Darwinian competition).</p>
<p>But Dave wanted to make it simple. (&#8220;I find the activity towards &#8216;modularization&#8217; to be dry and uninteresting.&#8221;) So, things like keywords (and a bunch of useful metadata out of NewsML) were left out, as an exercise for the developer. Hence many different Atom extensions these days, none of which are wholly standard.</p>
<p>And, here&#8217;s the irony: the Times is not supplying the metadata in their RSS 2.0 feeds &#8211; they are putting it in the HTML, leaving Dave to scrape it from there! Ha ha!</p>
<p>And, of course, Google and Topix already use this clustering technique &#8211; across thousands of sources. They scrape, because news organizations do not make metadata-rich NewsML feeds available&#8230;</p>
<p>Not that pulling META tags out of an HTML document is hard. It&#8217;s just curious that, if you wind the clock back 7 years, and tell a bunch of software engineers that the coolest news clustering applications today would still rely on HTML scraping, I don&#8217;t know whether they&#8217;d laugh or cry.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
