<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Schabby&#039;s Blog &#187; Cassandra</title>
	<atom:link href="http://schabby.de/tag/cassandra/feed/" rel="self" type="application/rss+xml" />
	<link>http://schabby.de</link>
	<description>computer science and binary watches, there probably isn&#039;t much more that matters in the known part of the universe</description>
	<lastBuildDate>Thu, 19 Aug 2010 09:54:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Cassandra Installation and Configuration</title>
		<link>http://schabby.de/cassandra-installation-configuration/</link>
		<comments>http://schabby.de/cassandra-installation-configuration/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 16:44:42 +0000</pubDate>
		<dc:creator>schabby</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Cassandra]]></category>

		<guid isPermaLink="false">http://schabby.de/?p=786</guid>
		<description><![CDATA[This is the second post in my little &#8220;Cassandra &#8211; Getting Started&#8221; series, covering the installation and basic configuration of Cassandra. Cassandra is extremely easy to set up, especially compared to HBase. All you have to do is download, extract, edit a single XML file and run. But let us take it step by step. [...]]]></description>
			<content:encoded><![CDATA[<p>This is the second post in my little &#8220;Cassandra &#8211; Getting Started&#8221; series, covering the installation and basic configuration of Cassandra. Cassandra is extremely easy to set up, especially compared to <a href="http://hadoop.apache.org/hbase/">HBase</a>. All you have to do is download, extract, edit a single XML file and run. But let us take it step by step.<br />
<span id="more-786"></span><br />
You can <a href="http://incubator.apache.org/cassandra/">download Cassandra</a> directly from its (her?) website. At the time of writing, version 0.4.1 was the most recent stable release. Note that you need <a href="http://java.sun.com/javase/downloads/index.jsp">Java 6</a> to run Cassandra; I assume here that it is already properly installed. </p>
<p>After extracting Cassandra to some folder (on my Windows box I placed it directly in <tt>D:\cassandra</tt>), the only file you need to edit is <tt>conf/storage-conf.xml</tt>. While Cassandra is engineered to run on a large number of machines in a network, we start it here as a single node with the default parameter set, so that most of the settings are ok for now. </p>
<p>If you are <b>not</b> on a Unix-like system, you need to update the folders where Cassandra is supposed to store its data. If you are using Windows (like me), find the following lines in <tt>conf/storage-conf.xml</tt> and change the paths to something sensible.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;CommitLogDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/var/lib/cassandra/commitlog<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/CommitLogDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;DataFileDirectories<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;DataFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/var/lib/cassandra/data<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/DataFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/DataFileDirectories<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;CalloutLocation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/var/lib/cassandra/callouts<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/CalloutLocation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;BootstrapFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/var/lib/cassandra/bootstrap<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/BootstrapFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;StagingFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/var/lib/cassandra/staging<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/StagingFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>For example, these are my settings:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;CommitLogDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>D:/cassandra/data/commitlog<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/CommitLogDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;DataFileDirectories<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
       <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;DataFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>D:/cassandra/data/data<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/DataFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/DataFileDirectories<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;CalloutLocation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>D:/cassandra/data/callouts<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/CalloutLocation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;BootstrapFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>D:/cassandra/data/bootstrap<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/BootstrapFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;StagingFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>D:/cassandra/data/staging<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/StagingFileDirectory<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>Let&#8217;s take Cassandra for a spin and check if she starts up correctly. Users of Mac OS, Linux, etc. simply change to the <tt>bin</tt> directory of Cassandra and run <tt>./cassandra</tt>. As an aside for the impatient, I start Cassandra with <tt>sudo</tt> to avoid trouble with Cassandra&#8217;s <tt>system.log</tt>.</p>
<p>Windows users who use the plain command line (meaning not <a href="http://www.cygwin.com/">Cygwin</a>), however, cannot start it just like that. The <tt>cassandra.bat</tt> didn&#8217;t work for me on my Vista box when executed with <tt>bin</tt> as the current working directory (probably because the CASSANDRA_HOME environment variable gets set incorrectly in the batch file). BUT it works perfectly if you call <tt>bin\cassandra.bat</tt> from Cassandra&#8217;s main directory above <tt>bin</tt>. So if you are on Windows, change to the directory where you extracted Cassandra and execute <tt>bin\cassandra.bat</tt>.</p>
<p>Cassandra&#8217;s output on startup will look similar to this (here on Mac OS):</p>
<pre style="font-size:10px">
Schabbys-MacBook-Pro:bin johannes$ sudo ./cassandra
Schabbys-MacBook-Pro:bin johannes$ Listening for transport dt_socket at address: 8888
DEBUG - Loading settings from ./../conf/storage-conf.xml
DEBUG - Syncing log with a period of 1000
DEBUG - opening keyspace Keyspace1
DEBUG - adding Super1 as 0
DEBUG - adding Standard2 as 1
DEBUG - adding Standard1 as 2
DEBUG - adding StandardByUUID1 as 3
DEBUG - adding LocationInfo as 4
DEBUG - adding HintsColumnFamily as 5
DEBUG - opening keyspace system
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db: 0 ms.
INFO - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log
DEBUG - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log starting at 117
DEBUG - Reading mutation at 117
DEBUG - replaying mutation for system.L: {ColumnFamily(LocationInfo [Generation,])}
INFO - Flushing Memtable(LocationInfo)@228828460
DEBUG - Submitting LocationInfo for compaction
INFO - Completed flushing Memtable(LocationInfo)@228828460
INFO - Compacting [/Users/johannes/cassandra/data/system/LocationInfo-1-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-2-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-3-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-4-Data.db]
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db   : 256
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db   : 512
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db   : 768
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-4-Data.db   : 1024
DEBUG - Expected bloom filter size : 1024
INFO - Compacted to /Users/johannes/cassandra/data/system/LocationInfo-5-Data.db.  0/255 bytes for 0/1 keys read/written.  Time: 150ms.
DEBUG - collecting Generation:false:4@3
DEBUG - collecting Token:false:16@0
INFO - Saved Token found: 160533723849634883377008460059010504450
DEBUG - Starting to listen on 127.0.0.1:7001
DEBUG - Binding thrift service to localhost:9160
</pre>
<p>I think that&#8217;s it. Leave a comment if you run into trouble, or check the nice <a href="http://wiki.apache.org/cassandra/GettingStarted#if_something_goes_wrong">If Something Goes Wrong</a> page in the Cassandra Wiki.</p>
]]></content:encoded>
			<wfw:commentRss>http://schabby.de/cassandra-installation-configuration/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Cassandra &#8211; Getting Started</title>
		<link>http://schabby.de/cassandra-getting-started/</link>
		<comments>http://schabby.de/cassandra-getting-started/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 13:58:11 +0000</pubDate>
		<dc:creator>schabby</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Cassandra]]></category>

		<guid isPermaLink="false">http://schabby.de/?p=733</guid>
		<description><![CDATA[This post addresses Java developers who want to get their feet wet with Cassandra. This is the first post in a series of three in which I describe Cassandra&#8217;s data model as seen from the angle of a typical Java developer. By contributing a Java-ish view on the data model, I try to extend the [...]]]></description>
			<content:encoded><![CDATA[<p>This post addresses Java developers who want to get their feet wet with Cassandra. This is the first post in a series of three in which I describe Cassandra&#8217;s data model as seen from the angle of a typical Java developer. By contributing a Java-ish view on the data model, I try to extend the set of existing data model descriptions.<br />
<span id="more-733"></span><br />
The second post in this series will briefly describe how to install and configure Cassandra. The third post will provide several hands-on examples for Cassandra with Java.</p>
<p>I have been toying around with <a href="http://incubator.apache.org/cassandra/">Cassandra</a> for quite some time now. Of all the NoSQL databases I have seen (and there are <a href="http://blog.oskarsson.nu/2009/06/nosql-debrief.html">quite a few</a> already, as <a href="http://www.franzkowiak.org/">Michael</a> pointed out to me earlier), Cassandra seems to be the most promising one to me, for reasons that are definitely worth discussing but are beyond the scope of this post.</p>
<h2>Data Model</h2>
<p>Cassandra&#8217;s data model has been described <a href="http://wiki.apache.org/cassandra/DataModel">more</a> <a href="http://arin.me/code/wtf-is-a-supercolumn-cassandra-data-model">than</a> <a href="http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/">once</a>. In contrast to the descriptions above, I will try to follow a more Java-ish view, which I find easiest and most powerful to work with. I therefore start by describing Cassandra&#8217;s data model as nested hash maps. </p>
<p>The way in which data gets stored in key/value based databases like Cassandra strongly resembles the use of ordinary hash maps. To recall, hash maps store data for a (unique) key. The same key is later used to retrieve the data from the hash map. For example, in order to map string keys to byte arrays you would write in Java</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">Map<span style="color: #339933;">&lt;</span>String, <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">&gt;</span> map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>This principle stays the same with Cassandra. However, in Cassandra you do not have a single hash map but up to three layers of nested hash maps! What does that mean? Imagine you don&#8217;t store your values in a single byte array for each key, but again in a hash map, like</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">Map<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">&gt;&gt;</span> map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">&gt;&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>This way you would partition the data you want to store into key/value pairs that are first put into the data hash map. The data hash map then gets inserted into the higher-order hash map under a given key string. Similarly, to retrieve a value, you would provide the key string, get the data hash map, and extract the value you are interested in from it. </p>
<p>Let us further assume that we don&#8217;t want to store the key/value pairs as two individual values, but coupled in a class called &#8220;Column&#8221; so that our data model would look like this:</p>
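<p>The store and retrieve steps just described can be sketched as follows. This is only an illustration: the class and method names (<tt>NestedMapSketch</tt>, <tt>put</tt>, <tt>get</tt>) are made up for this post, and plain strings stand in for the inner keys and values purely to keep it readable.</p>

```java
import java.util.HashMap;
import java.util.Map;

class NestedMapSketch {
    // Higher-order map: key string -> data hash map of key/value pairs.
    static final Map<String, Map<String, String>> store = new HashMap<>();

    // Put a key/value pair into the data hash map that belongs to rowKey,
    // creating that data hash map on first use.
    static void put(String rowKey, String name, String value) {
        store.computeIfAbsent(rowKey, k -> new HashMap<>()).put(name, value);
    }

    // Fetch the data hash map for rowKey, then extract the value from it.
    // Returns null if either the row or the inner key is missing.
    static String get(String rowKey, String name) {
        Map<String, String> data = store.get(rowKey);
        return data == null ? null : data.get(name);
    }

    public static void main(String[] args) {
        put("schabby", "nationality", "German");
        System.out.println(get("schabby", "nationality"));
        System.out.println(get("nobody", "nationality"));
    }
}
```

<p>Note that a lookup always takes two steps: first the key string selects the data hash map, then the inner key selects the value.</p>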

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">Map<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;&gt;</span> map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Where <tt>Column</tt> is defined as:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> Column <span style="color: #009900;">&#123;</span>
    <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> name<span style="color: #339933;">;</span>
    <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> value<span style="color: #339933;">;</span>
    <span style="color: #000066; font-weight: bold;">long</span> timestamp<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> Column<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> name, <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
         <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">name</span> <span style="color: #339933;">=</span> name<span style="color: #339933;">;</span>
         <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">value</span> <span style="color: #339933;">=</span> value<span style="color: #339933;">;</span>
         <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">timestamp</span> <span style="color: #339933;">=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>This is already pretty close to what is called a <em>Column Family</em> in Cassandra. Resist the temptation to read anything into the name &#8220;Column&#8221;. Also ignore <tt>timestamp</tt>, which is used by Cassandra to avoid data inconsistency and which shall not bother us here.  </p>
<p>Before we go on, let us look at a concrete example of how you would work with this kind of data structure. Let us assume we want to store the profile data of a single user for some imaginary social networking website.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">/* data model to store user profiles */</span>
Map<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;&gt;</span> user <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* create a user 'schabby' */</span>
user.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* fill in some profile data for user 'schabby' */</span>
Column age <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;age&quot;</span>.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#123;</span> <span style="color: #cc66cc;">27</span> <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>age.<span style="color: #006633;">name</span>, age<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Column realName <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;real name&quot;</span>.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;Johannes Schaback&quot;</span>.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>realName.<span style="color: #006633;">name</span>, realName<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Column nationality <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;nationality&quot;</span>.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;German&quot;</span>.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>nationality.<span style="color: #006633;">name</span>, nationality<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Again, do not get confused by the use of byte arrays where normal strings would make more sense. This is to resemble the Cassandra data model as closely as possible. You will later realize that it&#8217;s actually quite nifty to keep the inner hash map byte based, at the price of manually converting everything to byte arrays.</p>
<p>If we want to retrieve values from our data structure, we would proceed as follows:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000066; font-weight: bold;">byte</span> age <span style="color: #339933;">=</span> user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;age&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">value</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
<span style="color: #003399;">String</span> realName  <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#40;</span>user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;real name&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">value</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #003399;">String</span> nationality<span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#40;</span>user.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;nationality&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">value</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
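<p>One caveat if you actually run such snippets as plain Java: arrays do not override <tt>equals</tt> and <tt>hashCode</tt>, so a <tt>HashMap</tt> keyed by <tt>byte[]</tt> only finds an entry when you pass the very same array object, not a fresh array with equal content. A minimal sketch of one possible workaround wraps the key bytes in a <tt>java.nio.ByteBuffer</tt>, which compares by content. The class and helper names (<tt>ByteKeySketch</tt>, <tt>key</tt>, <tt>lookup</tt>) are made up for this illustration and have nothing to do with Cassandra&#8217;s own API.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class ByteKeySketch {
    // Wrap a string's UTF-8 bytes so they can serve as a content-compared map key.
    static ByteBuffer key(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Look up a value by name and decode it as UTF-8 text (null if absent).
    static String lookup(Map<ByteBuffer, byte[]> row, String name) {
        byte[] value = row.get(key(name));
        return value == null ? null : new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Raw arrays as keys: an equal-content but distinct array is a different key.
        Map<byte[], byte[]> raw = new HashMap<>();
        raw.put("age".getBytes(StandardCharsets.UTF_8), new byte[] { 27 });
        System.out.println(raw.get("age".getBytes(StandardCharsets.UTF_8))); // prints "null"

        // ByteBuffer keys: equals() and hashCode() are based on the remaining bytes.
        Map<ByteBuffer, byte[]> row = new HashMap<>();
        row.put(key("nationality"), "German".getBytes(StandardCharsets.UTF_8));
        System.out.println(lookup(row, "nationality")); // prints "German"
    }
}
```

<p>Keep in mind that a <tt>ByteBuffer</tt> used as a key should not be read from or repositioned afterwards, since its hash code depends on its remaining content.</p>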

<p>And this is it. There is not much more conceptual stuff to understand in order to use Cassandra. So we are now ready to map this structure to Cassandra terminology.</p>
<h3>Column Family</h3>
<p>Cassandra structures its data model in <em>keyspaces</em>, <em>Column Families</em> (CF), <em>Columns</em> and <em>SuperColumns</em>.</p>
<p>A keyspace is a namespace to group Column Families and can be compared to a <em>schema</em> or single <em>database</em> in the SQL world. A keyspace contains one or more Column Families.</p>
<p>A Column Family can be seen as a multidimensional hash map like the one in our example above. In the SQL analogy, you may see a Column Family as a single table that belongs to a schema; however, this comparison will not take you far. It is really more a dynamically growing and shrinking hash map than a table with fixed columns. Still, in Cassandra&#8217;s terminology you speak of <em>rows</em> when you refer to the hash map that you get for a key string. </p>
<p>Rows are accessed by string keys and each row &#8211; which can be seen as a &#8220;data hash map&#8221; &#8211; has several <em>columns</em>. Each column within a row is a bundled pair of a byte array key (a.k.a <tt>name</tt>) and its byte array data field (a.k.a. <tt>value</tt>) very similar to our example.</p>
<p>Depending on your configuration, you can let Cassandra apply a sorting scheme to impose an order on the columns in a row. This makes it possible to query ranges of columns. For example, imagine a telephone book from which you want to retrieve all names starting with &#8220;Smi&#8221;. In Java terms, this could be compared to using <tt>SortedMap</tt> instead of <tt>Map</tt>. But we stuck with <tt>Map</tt> for simplicity here.</p>
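<p>To make the <tt>SortedMap</tt> comparison a bit more tangible, here is a small sketch of the telephone book query in plain Java. It only illustrates the idea of a range query over sorted keys; the class and method names (<tt>ColumnRangeSketch</tt>, <tt>withPrefix</tt>) and the phone numbers are invented, strings replace byte arrays for readability, and the ordering Cassandra actually applies depends on your configuration.</p>

```java
import java.util.SortedMap;
import java.util.TreeMap;

class ColumnRangeSketch {
    // Return a view of all entries whose key starts with the given prefix.
    // Keys starting with the prefix sort at or after the prefix itself and
    // before the prefix followed by the largest char value.
    static SortedMap<String, String> withPrefix(SortedMap<String, String> row, String prefix) {
        return row.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        SortedMap<String, String> phoneBook = new TreeMap<>();
        phoneBook.put("Schmidt", "555-0100");
        phoneBook.put("Smith", "555-0101");
        phoneBook.put("Smirnov", "555-0102");
        phoneBook.put("Smythe", "555-0103");
        phoneBook.put("Taylor", "555-0104");

        // Only the names starting with "Smi", in sorted order.
        System.out.println(withPrefix(phoneBook, "Smi").keySet()); // prints "[Smirnov, Smith]"
    }
}
```

<p>With an unsorted <tt>HashMap</tt> the same query would have to inspect every key, which is why the ordering matters for range queries.</p>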
<h3>SuperColumns</h3>
<p>The cool thing about Cassandra is its support for an additional hash map layer. This additional layer is added to the Column layer and enables you to store and access your data as a hash map in a hash map in a hash map, or in other words, as a three dimensional hash map. This additional hash map is called a SuperColumn (SC).</p>
<p>In our Java-like example, a Column Family with SuperColumns looks like</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">Map<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, SuperColumn<span style="color: #339933;">&gt;&gt;</span> superColumn 
     <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, SuperColumn<span style="color: #339933;">&gt;&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>where <tt>SuperColumn</tt> is again a hash map over columns like</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> SuperColumn <span style="color: #000000; font-weight: bold;">extends</span> HashMap<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, Column<span style="color: #339933;">&gt;</span>
<span style="color: #009900;">&#123;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Again, I want to point out that the actual SuperColumn definition in Cassandra is different and that this explanatory definition is not entirely accurate, but it serves well as an illustration.</p>
<p>Like normal Columns, the Columns within a SuperColumn are stored in an order that depends on your configuration, which lets you cut slices out of your SuperColumns.</p>
<p>To continue our social networking site example, let us look at how SuperColumns are used to store the friends and relations of the user &#8216;schabby&#8217;.</p>
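<p>The idea of cutting a slice out of an ordered SuperColumn can be sketched in plain Java (9 or later) roughly as follows. This is an illustration only, not Cassandra code: the class <tt>SuperColumnSlice</tt> and its helpers are hypothetical, the Column is reduced to a name/value pair, and an unsigned byte-wise comparison is assumed as the sort order.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class SuperColumnSlice {

    /** Creates an empty "SuperColumn" whose column names are ordered byte by byte. */
    static NavigableMap<byte[], byte[]> newSuperColumn() {
        // Assumption: names are compared as unsigned bytes, lexicographically.
        Comparator<byte[]> byteOrder = Arrays::compareUnsigned;
        return new TreeMap<>(byteOrder);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns at most 'count' values whose names lie between start and finish (both inclusive). */
    static List<String> slice(NavigableMap<byte[], byte[]> superColumn,
                              byte[] start, byte[] finish, int count) {
        List<String> result = new ArrayList<>();
        for (byte[] value : superColumn.subMap(start, true, finish, true).values()) {
            if (result.size() == count) {
                break;
            }
            result.add(new String(value, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<byte[], byte[]> friends = newSuperColumn();
        friends.put(bytes("friend_3"), bytes("Susan"));
        friends.put(bytes("friend_1"), bytes("Merry"));
        friends.put(bytes("friend_2"), bytes("Robert"));

        // prints [Merry, Robert]
        System.out.println(slice(friends, bytes("friend_1"), bytes("friend_2"), 10));
    }
}
```

<p>Because the map is sorted, the slice is read in one pass over a contiguous range, regardless of the order in which the entries were inserted.</p>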

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">/* create ColumnFamily with SuperColumns */</span>
Map<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, SuperColumn<span style="color: #339933;">&gt;&gt;</span> columnFamily <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>String, Map<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, SuperColumn<span style="color: #339933;">&gt;&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* prepare a SuperColumn for 'schabby' */</span>
columnFamily.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>byte<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>, SuperColumn<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* create SC to store friend info */</span>
SuperColumn friends <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> SuperColumn<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* fill in some friends */</span>
Column friend1 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;friend_1&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;Merry&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
friends.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>friend1.<span style="color: #006633;">name</span>, friend1<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Column friend2 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;friend_2&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;Robert&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
friends.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>friend2.<span style="color: #006633;">name</span>, friend2<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Column friend3 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;friend_3&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;Susan&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
friends.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>friend3.<span style="color: #006633;">name</span>, friend3<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* finally store SC in Column Family */</span>
columnFamily.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;friends&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, friends<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>We are free to create another SuperColumn in the same Column Family to store other list-like data for &#8216;schabby&#8217;, for example his inbox.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">/* ... continued example */</span>
&nbsp;
SuperColumn inbox <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> SuperColumn<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">/* add two mails to inbox */</span>
Column mail1 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Hi Schabby&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;I hope you are well! Cheers, Nick&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
inbox.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>mail1.<span style="color: #006633;">name</span>, mail1<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
Column mail2 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Column<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Welcome&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, <span style="color: #0000ff;">&quot;some message body&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
inbox.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>mail2.<span style="color: #006633;">name</span>, mail2<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
columnFamily.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;inbox&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, inbox<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Retrieving the mails from the inbox is straightforward:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">/* continued example */</span>
&nbsp;
SuperColumn inbox <span style="color: #339933;">=</span> columnFamily.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;schabby&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;inbox&quot;</span>.<span style="color: #006633;">toBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> subject<span style="color: #339933;">:</span> inbox.<span style="color: #006633;">keySet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
   Column mail <span style="color: #339933;">=</span> inbox.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>subject<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #666666; font-style: italic;">// the Column's value holds the mail body; do something with subject/body</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
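<p>The snippets above are Java-like pseudocode. If you want to run the inbox example as real Java, two details need attention: <tt>String</tt> offers <tt>getBytes()</tt> rather than <tt>toBytes()</tt>, and a <tt>byte[]</tt> key in a <tt>HashMap</tt> is hashed by identity, so a lookup with a freshly created array would find nothing. The following sketch works around this by wrapping names and values in <tt>ByteBuffer</tt>, which compares by content. The class <tt>InboxExample</tt> and its helpers are my own naming for illustration, and the Column is simplified to a name/value pair.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class InboxExample {

    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    static String text(ByteBuffer b) {
        // Decode a duplicate so the original buffer's position stays untouched.
        return StandardCharsets.UTF_8.decode(b.duplicate()).toString();
    }

    /** Builds: row key -> SuperColumn name -> (column name -> column value). */
    static Map<String, Map<ByteBuffer, SortedMap<ByteBuffer, ByteBuffer>>> buildColumnFamily() {
        Map<String, Map<ByteBuffer, SortedMap<ByteBuffer, ByteBuffer>>> columnFamily = new HashMap<>();
        columnFamily.put("schabby", new HashMap<>());

        // The "SuperColumn": a sorted map from subject to message body.
        SortedMap<ByteBuffer, ByteBuffer> inbox = new TreeMap<>();
        inbox.put(bytes("Welcome"), bytes("some message body"));
        inbox.put(bytes("Hi Schabby"), bytes("I hope you are well! Cheers, Nick"));

        columnFamily.get("schabby").put(bytes("inbox"), inbox);
        return columnFamily;
    }

    /** Reads all mails of a user as "subject: body" lines, ordered by subject. */
    static List<String> readInbox(
            Map<String, Map<ByteBuffer, SortedMap<ByteBuffer, ByteBuffer>>> columnFamily,
            String rowKey) {
        List<String> mails = new ArrayList<>();
        // The lookup key is a new buffer; it matches because ByteBuffer compares by content.
        SortedMap<ByteBuffer, ByteBuffer> inbox = columnFamily.get(rowKey).get(bytes("inbox"));
        for (Map.Entry<ByteBuffer, ByteBuffer> mail : inbox.entrySet()) {
            mails.add(text(mail.getKey()) + ": " + text(mail.getValue()));
        }
        return mails;
    }

    public static void main(String[] args) {
        // prints:
        // Hi Schabby: I hope you are well! Cheers, Nick
        // Welcome: some message body
        for (String mail : readInbox(buildColumnFamily(), "schabby")) {
            System.out.println(mail);
        }
    }
}
```

<p>Using a <tt>TreeMap</tt> for the inner layer also mirrors the ordered storage described earlier: the mails come back sorted by subject, not in insertion order.</p>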

<p>And that is it. I hope this improved your understanding of Cassandra&#8217;s data model. It&#8217;s not that difficult all in all, especially once you start using it.</p>
<p>Please leave some comments for corrections and feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://schabby.de/cassandra-getting-started/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
	</channel>
</rss>
