Category: Technical Articles
The Math of Secret Santa
And finally... Richard muses on the question of how many turns it takes to randomly select a valid Secret Santa assignment.
CQL benchmarking
CQL is significantly easier to use and develop against than Cassandra's default Thrift interface, but does it impose a correspondingly significant performance cost? In a word: no.
How to rebuild 2TB disks in 30mins
A comparison of disk rebuild times, showing that 2-RDA is the way of the future for data that requires redundancy against disk failure.
CQL Quick Reference Card
CQL Quick Reference cards are an ideal companion for any developer or software engineer using, or thinking about using, the Cassandra Query Language. Written by Cassandra Committer Eric Evans, there ...
Data Modelling with Cassandra
Denormalisation is essential at scale, and Cassandra's read/write tradeoff is well-adapted for it. In this article we work through an example use case showing how this works in ...
Cassandra Drivers Released!
Announcing JDBC and DB-API2 drivers for Cassandra. The Apache Cassandra project is moving its driver projects to Apache Extras; read all about it here.
Scaling up Cassandra and Mahout with Hadoop
My last article, "Recommending (from) Cassandra", introduced the possibility of learning from all of that data that you've been squirreling away in your Apache Cassandra instance. Using another Apache ...
Benchmarking LevelDB
Google recently open-sourced their LevelDB datastore and published some neat benchmarks. Those benchmarks typically insert at most 1e6 entries, and I wanted to understand how it performs both for larger ...
Why theory fails for SSDs
A rich theory has been developed to explain and compare the complexity of algorithms for data sets that do not fit in a computer's main memory [1]. Yet this ...
Recommending (from) Cassandra
If you're reading this, then you are probably using Apache Cassandra (http://cassandra.apache.org), or are wondering why you should care. Maybe you, or the hip startup in ...