The Code Train

Where Neil Crosby talks about coding on the train…

WikiSlurp

Posted on September 28th, 2008 by Neil Crosby.

WikiSlurp is a project designed to allow site developers to easily leverage the awesome power of Wikipedia’s articles.

WikiSlurp runs as a web service that queries Wikipedia and returns, in HTML format, portions of articles about a given subject. It’s designed to be slotted into any webserver able to run PHP5, even those on shared hosting accounts. All requests to external services are heavily cached by default, and the site owner can change the cache time via the config file.
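To make the caching idea concrete, here is a minimal sketch of a time-limited cache wrapped around an external request. It is written in Python rather than WikiSlurp’s PHP, and it is not WikiSlurp’s actual code: the names `TTLCache`, `get_or_fetch` and `ttl_seconds` are hypothetical, and the cache here lives in memory purely for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds.

    Hypothetical illustration only; not taken from WikiSlurp.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds  # the value a config file would supply
        self._clock = clock             # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], str]) -> str:
        """Return the cached value for key, calling fetch only when needed."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]               # still fresh: no external request
        value = fetch(key)              # missing or stale: go to the source
        self._store[key] = (now, value)
        return value
```

A caller would pass in whatever function performs the real request, so repeated lookups for the same subject within the cache time never leave the server.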

WikiSlurp is very much alpha software at present. I’m using it on Is Neil Annoyed By and The Ten Word Review though, so it is usable. If you have any comments, feature requests or patches please give them to me. I’m all ears.

Now, download v0.1 of the source code or get the very latest source code on GitHub.

9 Responses to “WikiSlurp”

  1. Allow CSS for selecting. There’s a symfony plugin which acts as a web browser for making requests and has this function.

  2. Thanks for the feature request Mike. I’ll stick it in the “Things to do” list.

  3. Quick bug: The anchor tags in the HTML block are relative to Wikipedia, thus they don’t work right.

    I’ve also found that the loading can easily take longer than 1-second. Increasing the cURL timeout is one option, but that hangs up the page. It would be great if there was an Ajax request option. Alternatively, I’ve successfully tested loading it into an iframe. A less than elegant solution, but it works.

    Not sure if it’s doable, but many searches return more than one result. In this case it would be awesome if it would return the result list, letting the user choose which result was “right”, possibly storing the result in a table.

    Great work on this. Looking forward to future iterations.

  4. I had another idea. Let’s say a search returned no results, but you know an entry exists (maybe it’s a redirect to another entry). It would be great if you could bypass the search and instead just query the exact URL. Let me know if this doesn’t make sense.

  5. This is exactly what I have been searching for but cannot get it to work. Your GitHub source does not contain the Curl or phpCache files. I tried using the ones from the old source but my searches keep failing even using your examples.

  6. George Cairns Says:

    This is great – exactly what I needed, thank you.

  7. This is a fantastic development, but does this still work? Just cloned from GitHub and set up in a folder, but the example form always returns nil results.

  8. Having the same problem as Simon above, is there an issue with this now?

  9. I hate saying this here because I know it won’t help me. My config file has the secret and my BOSS key, but I’m getting the result ‘no secret given’.
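The relative-link bug reported in comment 3 can be illustrated with a short sketch. This is one possible fix, not what WikiSlurp does: it is in Python rather than PHP, the function name `absolutize_links` and the base URL are assumptions, and a regular expression is only a rough stand-in for a real HTML parser.

```python
import re
from urllib.parse import urljoin

# Assumed base; a real setup would use whichever Wikipedia host it queried.
WIKIPEDIA_BASE = "https://en.wikipedia.org/"

# Crude match for double-quoted href attributes.
_HREF = re.compile(r'href="([^"]*)"')


def absolutize_links(html: str, base: str = WIKIPEDIA_BASE) -> str:
    """Rewrite site-relative hrefs so they point at Wikipedia."""

    def fix(match: "re.Match[str]") -> str:
        target = match.group(1)
        if target.startswith("#"):
            return match.group(0)  # leave in-page anchors untouched
        # urljoin keeps already-absolute URLs as they are
        return 'href="%s"' % urljoin(base, target)

    return _HREF.sub(fix, html)
```

With this in place, a link such as `/wiki/Train` in the returned HTML would resolve against Wikipedia instead of the host site.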
