<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Offline Wikipedia</title>
	<atom:link href="http://jsomers.net/blog/offline-wikipedia/feed" rel="self" type="application/rss+xml" />
	<link>http://jsomers.net/blog/offline-wikipedia</link>
	<description>James Somers</description>
	<lastBuildDate>Fri, 12 Mar 2010 03:03:38 -0600</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Crispin</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3403</link>
		<dc:creator>Crispin</dc:creator>
		<pubDate>Wed, 17 Feb 2010 12:29:11 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3403</guid>
		<description>&lt;p&gt;Thanks for the instructions, I used a very similar method a few years back but never really got it to work.&lt;/p&gt;

&lt;p&gt;For those wanting something simpler (maybe a slightly different use case) I strongly recommend aarddict (http://aarddict.org/).&lt;/p&gt;

&lt;p&gt;It&#039;s a free, open-source (GPL) and extremely fast program which also provides ready-built databases for all Wikipedia text content.
You don&#039;t get pictures (far too large), but you do get all the text and LaTeX (math formulae) nicely rendered with working links, tables, formatting, etc.
It is available for Ubuntu, Windows, OS X and Nokia tablets (and as source, obviously).&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Thanks for the instructions, I used a very similar method a few years back but never really got it to work.</p>

<p>For those wanting something simpler (maybe a slightly different use case) I strongly recommend aarddict (<a href="http://aarddict.org/" rel="nofollow">http://aarddict.org/</a>).</p>

<p>It&#8217;s a free, open-source (GPL) and extremely fast program which also provides ready-built databases for all Wikipedia text content.
You don&#8217;t get pictures (far too large), but you do get all the text and LaTeX (math formulae) nicely rendered with working links, tables, formatting, etc.
It is available for Ubuntu, Windows, OS X and Nokia tablets (and as source, obviously).</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Reverso time &#171; jsomers.net</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3357</link>
		<dc:creator>Reverso time &#171; jsomers.net</dc:creator>
		<pubDate>Wed, 10 Feb 2010 16:22:43 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3357</guid>
		<description>&lt;p&gt;[...] posts discussing build options and compile errors and PATH variables. Examples fresh in my mind: offline Wikipedia, Metacat, and the Ruby wrapper for the Stanford Natural Language [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] posts discussing build options and compile errors and PATH variables. Examples fresh in my mind: offline Wikipedia, Metacat, and the Ruby wrapper for the Stanford Natural Language [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: emc986</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3353</link>
		<dc:creator>emc986</dc:creator>
		<pubDate>Tue, 02 Feb 2010 19:30:31 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3353</guid>
		<description>&lt;p&gt;-- a simple stream edit will certainly not work, as it turns out.  The image links are php-generated, I assume.  Still, it shouldn&#039;t be too difficult to adjust some code.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>&#8211; a simple stream edit will certainly not work, as it turns out.  The image links are php-generated, I assume.  Still, it shouldn&#8217;t be too difficult to adjust some code.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: emc986</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3352</link>
		<dc:creator>emc986</dc:creator>
		<pubDate>Tue, 02 Feb 2010 09:52:50 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3352</guid>
		<description>&lt;p&gt;Glad to see an ongoing discussion on this very excellent offline reader. Perhaps a simple stream edit on the bz2 splits would allow, if online, the retrieval of images.  The current enwiki-latest-pages-articles.xml.bz2 is 5.4G, and with the generated index (another 3G!), the damn thing just won&#039;t fit on my current 8G uSD for use on a freerunner.  I don&#039;t know how much larger the dump would be including images, but I&#039;d guess, maybe using this great method, it wouldn&#039;t even fit on a 32G uSD card.&lt;/p&gt;

&lt;p&gt;In any case, here&#039;s a problem:&lt;/p&gt;

&lt;p&gt;This reader has been working beautifully, except for one thing.
Pages that have the country infobox do not render the data. On the Afghanistan page, for instance, the infobox information exists within the file, but the rendered page shows an unsubstituted template parameter, for example:&lt;/p&gt;

&lt;p&gt;Capital city:    {{{countrycapital}}}&lt;/p&gt;

&lt;p&gt;I&#039;m not really sure where to start looking, as I&#039;m not very familiar with the MediaWiki parser.&lt;/p&gt;

&lt;p&gt;Does anyone else have a clue?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Glad to see an ongoing discussion on this very excellent offline reader. Perhaps a simple stream edit on the bz2 splits would allow, if online, the retrieval of images.  The current enwiki-latest-pages-articles.xml.bz2 is 5.4G, and with the generated index (another 3G!), the damn thing just won&#8217;t fit on my current 8G uSD for use on a freerunner.  I don&#8217;t know how much larger the dump would be including images, but I&#8217;d guess, maybe using this great method, it wouldn&#8217;t even fit on a 32G uSD card.</p>

<p>In any case, here&#8217;s a problem:</p>

<p>This reader has been working beautifully, except for one thing.
Pages that have the country infobox do not render the data. On the Afghanistan page, for instance, the infobox information exists within the file, but the rendered page shows an unsubstituted template parameter, for example:</p>

<p>Capital city:    {{{countrycapital}}}</p>

<p>I&#8217;m not really sure where to start looking, as I&#8217;m not very familiar with the MediaWiki parser.</p>

<p>Does anyone else have a clue?</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Abhay</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3327</link>
		<dc:creator>Abhay</dc:creator>
		<pubDate>Mon, 02 Nov 2009 09:46:51 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3327</guid>
		<description>&lt;p&gt;Sorry, the above was a silly error... the actual problem is this:&lt;/p&gt;

&lt;p&gt;./show.pl &quot;../wiki-splits/rec09696enwiki-20090929-pages-articles.xml.bz2&quot; &quot;Wikipedia&quot;
sh: php: not found&lt;/p&gt;

&lt;p&gt;mediawiki_sa parser failed! report to woc.fslab.de&lt;/p&gt;

&lt;p&gt;I have installed PHP 5.3.0 from the net.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Sorry, the above was a silly error&#8230; the actual problem is this:</p>

<p>./show.pl "../wiki-splits/rec09696enwiki-20090929-pages-articles.xml.bz2" "Wikipedia"
sh: php: not found</p>

<p>mediawiki_sa parser failed! report to woc.fslab.de</p>

<p>I have installed PHP 5.3.0 from the net.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Mason</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3316</link>
		<dc:creator>Mason</dc:creator>
		<pubDate>Sun, 25 Oct 2009 18:36:18 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3316</guid>
		<description>&lt;p&gt;I can&#039;t seem to build the reader. I&#039;ve downloaded everything, built Xapian and installed Django through MacPorts (and I already had the dev tools). &quot;sudo make&quot; tells me to use &quot;make wikipedia,&quot; which gives me this error:
    usage: cp [-R [-H &#124; -L &#124; -P]] [-fi &#124; -n] [-pvX] source_file target_file
           cp [-R [-H &#124; -L &#124; -P]] [-fi &#124; -n] [-pvX] source_file ... target_directory
    make: *** [mediawiki] Error 64&lt;/p&gt;

&lt;p&gt;I&#039;ve looked through everything again and don&#039;t think I forgot anything. I tried searching Google for &quot;make Error 64&quot; but didn&#039;t find anything, and I&#039;ve tried Django with Python 2.5 and 2.6. Does anyone have any idea what my problem might be? Is there any other information that I should provide?&lt;/p&gt;

&lt;p&gt;Thanks,
Mason&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>I can&#8217;t seem to build the reader. I&#8217;ve downloaded everything, built Xapian and installed Django through MacPorts (and I already had the dev tools). &#8220;sudo make&#8221; tells me to use &#8220;make wikipedia,&#8221; which gives me this error:
    usage: cp [-R [-H | -L | -P]] [-fi | -n] [-pvX] source_file target_file
           cp [-R [-H | -L | -P]] [-fi | -n] [-pvX] source_file ... target_directory
    make: *** [mediawiki] Error 64</p>

<p>I&#8217;ve looked through everything again and don&#8217;t think I forgot anything. I tried searching Google for &#8220;make Error 64&#8221; but didn&#8217;t find anything, and I&#8217;ve tried Django with Python 2.5 and 2.6. Does anyone have any idea what my problem might be? Is there any other information that I should provide?</p>

<p>Thanks,
Mason</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Kerry</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3315</link>
		<dc:creator>Kerry</dc:creator>
		<pubDate>Sun, 25 Oct 2009 02:32:03 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3315</guid>
		<description>&lt;p&gt;Worked great for me with the German version of Wikipedia. The only issue is with searching using umlauts - Python doesn&#039;t seem to like Unicode here. Quite a major one really for German. I intend to have a look some time at how to get both de and en versions running at the same time.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Worked great for me with the German version of Wikipedia. The only issue is with searching using umlauts &#8211; Python doesn&#8217;t seem to like Unicode here. Quite a major one really for German. I intend to have a look some time at how to get both de and en versions running at the same time.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: James Somers</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3313</link>
		<dc:creator>James Somers</dc:creator>
		<pubDate>Wed, 14 Oct 2009 20:06:30 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3313</guid>
		<description>&lt;p&gt;Glad to hear you fixed the first problem. I must have been using a different version of PHP.&lt;/p&gt;

&lt;p&gt;There may well be a 64-bit version of the software necessary to compile texvc. Do reply again if you&#039;re able to fix that.&lt;/p&gt;

&lt;p&gt;The dump itself does not include images, largely because they would increase the file size by orders of magnitude. If you can spare the space then you&#039;ll have to (a) download all the images separately (which I think you can do) and (b) fix all the wiki&#039;s image links so that they point to your local copies.&lt;/p&gt;

&lt;p&gt;Note that you don&#039;t need the images to get math to display properly, since those PNGs are rendered by texvc.&lt;/p&gt;

&lt;p&gt;Personally I get a lot of mileage out of the offline wikipedia without images, but I suppose it could be helpful to have diagrams and the like. Let us know if you&#039;re able to integrate images successfully.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Glad to hear you fixed the first problem. I must have been using a different version of PHP.</p>

<p>There may well be a 64-bit version of the software necessary to compile texvc. Do reply again if you&#8217;re able to fix that.</p>

<p>The dump itself does not include images, largely because they would increase the file size by orders of magnitude. If you can spare the space then you&#8217;ll have to (a) download all the images separately (which I think you can do) and (b) fix all the wiki&#8217;s image links so that they point to your local copies.</p>

<p>Note that you don&#8217;t need the images to get math to display properly, since those PNGs are rendered by texvc.</p>

<p>Personally I get a lot of mileage out of the offline wikipedia without images, but I suppose it could be helpful to have diagrams and the like. Let us know if you&#8217;re able to integrate images successfully.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: kvaruni</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3312</link>
		<dc:creator>kvaruni</dc:creator>
		<pubDate>Wed, 14 Oct 2009 15:16:43 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3312</guid>
		<description>&lt;p&gt;This problem seems to have been fixed and was caused by &quot;Namespace&quot; being a reserved keyword in PHP 5.3.0. The solution is rather messy in that it involves checking all files in the include folder for references to Namespace:: and renaming that (as well as the class) to MyNamespace. I would advise using Smultron for this job as it can search through all the files that are open by using Advanced Find.&lt;/p&gt;

&lt;p&gt;And then there were more problems. Splitting went fine, but indexing did not. The cause is a mysterious &quot;Exception: Empty termnames aren&#039;t allowed&quot;. Ah, this tells me what I need to know. Not. Nevertheless, about half of the Wikipedia dump has been indexed and I can get started. Except ... I can&#039;t get texvc to compile, which would be kind of necessary, since most of my research on Wikipedia involves math-related subjects. And this time the error is &quot;32-bit absolute addressing is not supported for x86-64&quot;. So basically, I cannot get this thing to compile no matter what I try since it does not support 64-bit, yet my operating system and processor are 64-bit. What a mess.&lt;/p&gt;

&lt;p&gt;Finally, a matter of debate. Why are there no images? Is this because the dump failed to parse completely or are these just omitted? In the latter case this would involve an additional download from Wikimedia to make this entire idea of an offline version usable. To be honest I find the entire process to be far too hard for an encyclopedia that promotes openness.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>This problem seems to have been fixed and was caused by &#8220;Namespace&#8221; being a reserved keyword in PHP 5.3.0. The solution is rather messy in that it involves checking all files in the include folder for references to Namespace:: and renaming that (as well as the class) to MyNamespace. I would advise using Smultron for this job as it can search through all the files that are open by using Advanced Find.</p>

<p>And then there were more problems. Splitting went fine, but indexing did not. The cause is a mysterious &#8220;Exception: Empty termnames aren&#8217;t allowed&#8221;. Ah, this tells me what I need to know. Not. Nevertheless, about half of the Wikipedia dump has been indexed and I can get started. Except &#8230; I can&#8217;t get texvc to compile, which would be kind of necessary, since most of my research on Wikipedia involves math-related subjects. And this time the error is &#8220;32-bit absolute addressing is not supported for x86-64&#8221;. So basically, I cannot get this thing to compile no matter what I try since it does not support 64-bit, yet my operating system and processor are 64-bit. What a mess.</p>

<p>Finally, a matter of debate. Why are there no images? Is this because the dump failed to parse completely or are these just omitted? In the latter case this would involve an additional download from Wikimedia to make this entire idea of an offline version usable. To be honest I find the entire process to be far too hard for an encyclopedia that promotes openness.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: kvaruni</title>
		<link>http://jsomers.net/blog/offline-wikipedia/comment-page-1#comment-3311</link>
		<dc:creator>kvaruni</dc:creator>
		<pubDate>Wed, 14 Oct 2009 08:09:57 +0000</pubDate>
		<guid isPermaLink="false">http://jsomers.net/blog/offline-wikipedia#comment-3311</guid>
		<description>&lt;p&gt;I have been trying to get this to work, but I seem to have run into some kind of problem. For starters, I have done everything up to editing the show.pl file where I replaced php5 by php. The make went just fine, the files are nicely split and I can access the server. Great. But each and every page I visit, apart from the search results, tells me that the woc.fslab.de parser failed.&lt;/p&gt;

&lt;p&gt;I took a look at the site from Thanassis and I tried to render the bonsai.html page. Tough luck since this did not work. The error I get is &quot;Parse error: syntax error, unexpected T_NAMESPACE, expecting T_STRING in /Users/kimbauters/Documents/offline.wikipedia/mediawiki_sa/includes/Namespace.php on line 46&quot;, which coincides with the beginning of the class in the Namespace.php file. This has left me dead in my tracks since I have no idea how to resolve this. Since I am using Snow Leopard, my PHP version is the latest and greatest PHP 5.3.0. Clearly, this should be able to cope just fine with classes in PHP.&lt;/p&gt;

&lt;p&gt;Any ideas on how to resolve this issue or am I overlooking something?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>I have been trying to get this to work, but I seem to have run into some kind of problem. For starters, I have done everything up to editing the show.pl file where I replaced php5 by php. The make went just fine, the files are nicely split and I can access the server. Great. But each and every page I visit, apart from the search results, tells me that the woc.fslab.de parser failed.</p>

<p>I took a look at the site from Thanassis and I tried to render the bonsai.html page. Tough luck since this did not work. The error I get is &#8220;Parse error: syntax error, unexpected T_NAMESPACE, expecting T_STRING in /Users/kimbauters/Documents/offline.wikipedia/mediawiki_sa/includes/Namespace.php on line 46&#8221;, which coincides with the beginning of the class in the Namespace.php file. This has left me dead in my tracks since I have no idea how to resolve this. Since I am using Snow Leopard, my PHP version is the latest and greatest PHP 5.3.0. Clearly, this should be able to cope just fine with classes in PHP.</p>

<p>Any ideas on how to resolve this issue or am I overlooking something?</p>]]></content:encoded>
	</item>
</channel>
</rss>
