<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Acts as Subversioned: The Future.</title>
	<atom:link href="http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-the-future/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-the-future/</link>
	<description>Me, rebranded.</description>
	<lastBuildDate>Sun, 04 Jul 2010 04:12:17 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Hendy Irawan</title>
		<link>http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-the-future/comment-page-1/#comment-29</link>
		<dc:creator>Hendy Irawan</dc:creator>
		<pubDate>Thu, 23 Aug 2007 17:22:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-present-future-and-a-little-help/#comment-29</guid>
		<description>Bazaar also seems good :)</description>
		<content:encoded><![CDATA[<p>Bazaar also seems good <img src='http://www.oneofthewolves.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben</title>
		<link>http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-the-future/comment-page-1/#comment-30</link>
		<dc:creator>Ben</dc:creator>
		<pubDate>Sun, 06 May 2007 20:12:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-present-future-and-a-little-help/#comment-30</guid>
		<description>I was just thinking, and I did a bit of research.

For the purpose of what I said above, and this comment, I&#039;m calling the SQL equivalent of &#039;insert/update&#039; an update, and the SQL equivalent of &#039;select&#039; a query.

I&#039;m not sure if this will work or not, but if you call &#039;update&#039; one dimension, and &#039;query&#039; another dimension, you can probably do a multi-dimensional FFT.  It just so happens that there&#039;s a nice module written in C to do just this sort of thing from Ruby.

http://ruby.gfd-dennou.org/products/ruby-fftw3/doc/ruby-fftw3.html

I think in order to treat them like orthogonal components, you need to prove them to be statistically independent.  However, if this is possible, it removes the directional correlation problem and reduces caching to a multi-dimensional FFT and a knapsack problem (filling your online buffer).

We can talk more in person - I don&#039;t want to keep filling up your blog :-p</description>
		<content:encoded><![CDATA[<p>I was just thinking, and I did a bit of research.</p>
<p>For the purpose of what I said above, and this comment, I&#8217;m calling the SQL equivalent of &#8216;insert/update&#8217; an update, and the SQL equivalent of &#8216;select&#8217; a query.</p>
<p>I&#8217;m not sure if this will work or not, but if you call &#8216;update&#8217; one dimension, and &#8216;query&#8217; another dimension, you can probably do a multi-dimensional FFT.  It just so happens that there&#8217;s a nice module written in C to do just this sort of thing from Ruby.</p>
<p><a href="http://ruby.gfd-dennou.org/products/ruby-fftw3/doc/ruby-fftw3.html" rel="nofollow">http://ruby.gfd-dennou.org/products/ruby-fftw3/doc/ruby-fftw3.html</a></p>
<p>I think in order to treat them like orthogonal components, you need to prove them to be statistically independent.  However, if this is possible, it removes the directional correlation problem and reduces caching to a multi-dimensional FFT and a knapsack problem (filling your online buffer).</p>
<p>We can talk more in person &#8211; I don&#8217;t want to keep filling up your blog :-p</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben</title>
		<link>http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-the-future/comment-page-1/#comment-31</link>
		<dc:creator>Ben</dc:creator>
		<pubDate>Sun, 06 May 2007 19:58:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.oneofthewolves.com/2007/04/24/acts-as-subversioned-present-future-and-a-little-help/#comment-31</guid>
		<description>You said --
&quot;but I&#8217;m pretty sure I can replicate all the &#8220;find_by_X&#8221; functions as well as the id-based model structure, in (hopefully) O(1) time&quot;

I don&#039;t know if you remember or not, but I think when you were at the AFRL that summer, I built a little tool to export the structure of an Oracle database to a series of HTML files.  The tool used metadata queries to grab a current version of the schemas, tables, fields, and constraints.  The first time I did it, I think I managed to do it in O(n^2), but I was querying O(n) times.  The program completed its task in about 6 minutes.

After talking with Nick Watts (our database guy) a bit, I found a way to get all the data I needed from a single query, but rebuilding the structure from the query operated in O(n^3) time.  This completed in under 5 seconds.

I might be off-base here, since I don&#039;t know Ruby or ActiveRecord, but keep in mind that the big thing with this type of code isn&#039;t really keeping your online processing time low; it&#039;s minimizing the number of queries performed on your offline structure (in this case, Subversion).  However, if you can keep your query count low and still get constant time, then you&#039;re golden.

Hopefully my (somewhat) unsolicited advice is useful.  If not, feel free to tell me to buzz off :-)

As for the solicited advice, I think the best way to tell which is faster would be to profile them both. In some ways, removing a middle-man is a good thing, but I think your obvious concern is that MySQL might be a beneficial caching mechanism.  Unfortunately, the only way to determine what&#039;s faster is empirical analysis.

I (once again) don&#039;t know ActiveRecord at all, but I think the *best* (yet hardest to code) solution would be to write a layer that identifies frequently queried data and caches it based on frequency of access and an upper memory bound.  If you wanted to get really fancy, you could do an FFT on your data&#039;s update timestamps, an FFT on the query timestamps, and then cache the data that is both at the high end of the query spectrum and at the low end of the update spectrum.  I think there are some &#039;directional correlation&#039; algorithms that will handle this well.  Of course, you&#039;d have to implement a smart pedigree mechanism - but if you&#039;re correlating FFTs on Subversion records, I think you can handle that. :-)

Wow, I babble a lot...  I hope this was useful, and congrats again on graduating.</description>
		<content:encoded><![CDATA[<p>You said &#8211;<br />
&#8220;but I&#8217;m pretty sure I can replicate all the &#8220;find_by_X&#8221; functions as well as the id-based model structure, in (hopefully) O(1) time&#8221;</p>
<p>I don&#8217;t know if you remember or not, but I think when you were at the AFRL that summer, I built a little tool to export the structure of an Oracle database to a series of HTML files.  The tool used metadata queries to grab a current version of the schemas, tables, fields, and constraints.  The first time I did it, I think I managed to do it in O(n^2), but I was querying O(n) times.  The program completed its task in about 6 minutes.</p>
<p>After talking with Nick Watts (our database guy) a bit, I found a way to get all the data I needed from a single query, but rebuilding the structure from the query operated in O(n^3) time.  This completed in under 5 seconds.</p>
<p>I might be off-base here, since I don&#8217;t know Ruby or ActiveRecord, but keep in mind that the big thing with this type of code isn&#8217;t really keeping your online processing time low; it&#8217;s minimizing the number of queries performed on your offline structure (in this case, Subversion).  However, if you can keep your query count low and still get constant time, then you&#8217;re golden.</p>
<p>Hopefully my (somewhat) unsolicited advice is useful.  If not, feel free to tell me to buzz off <img src='http://www.oneofthewolves.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>As for the solicited advice, I think the best way to tell which is faster would be to profile them both. In some ways, removing a middle-man is a good thing, but I think your obvious concern is that MySQL might be a beneficial caching mechanism.  Unfortunately, the only way to determine what&#8217;s faster is empirical analysis.</p>
<p>I (once again) don&#8217;t know ActiveRecord at all, but I think the *best* (yet hardest to code) solution would be to write a layer that identifies frequently queried data and caches it based on frequency of access and an upper memory bound.  If you wanted to get really fancy, you could do an FFT on your data&#8217;s update timestamps, an FFT on the query timestamps, and then cache the data that is both at the high end of the query spectrum and at the low end of the update spectrum.  I think there are some &#8216;directional correlation&#8217; algorithms that will handle this well.  Of course, you&#8217;d have to implement a smart pedigree mechanism &#8211; but if you&#8217;re correlating FFTs on Subversion records, I think you can handle that. <img src='http://www.oneofthewolves.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Wow, I babble a lot&#8230;  I hope this was useful, and congrats again on graduating.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
