not so quiet …
This site has been quiet, but there’s a lot going on behind the scenes. Items of note:
- Parallel R has landed! Many thanks to all who made it possible. Happy reading.
- I’m planning some software updates, and forqlift is in the top slot.
- There’s another fun project brewing … more details soon.
new book on the way: Parallel R
As promised, I have an announcement:
It’s a book!
Well, more like a book-to-be. I’ve signed on with the fine folks at O’Reilly to publish Parallel R. It’s all about giving R, everyone’s preferred open-source data analysis tool, a parallel boost. If you’re doing large-scale work with R, you’ll likely want to read this book, especially if you’d like to blend R and Hadoop.
This will not be a solo venture: my partner in crime will be none other than Stephen Weston. Even if you don’t know him by name (and really, you should), there’s a good chance you know his work: he wrote the R packages nws, foreach, doSNOW, and doMC.
Look forward to more announcements over time.
news next week
I have some pretty cool news to announce. Drop by early next week for the full story.
forqlift 0.8.0 (alpha!): direct HDFS interaction
forqlift 0.8.0 is hot off the presses!
Well, it may still be a little half-baked. This release contains an experimental new feature, which is why I’ve labeled it alpha-quality. With enough feedback, I may upgrade this release to “stable.”
In short: the new feature is direct access to HDFS. That means you can write SequenceFiles to HDFS, and read them back, without the intermediate upload/download step.
Details are available on the forqlift download page. If you’re inclined to help test this functionality, please download forqlift 0.8.0 and try it out!