This is the second post in my little “Cassandra – Getting Started” series, covering the installation and basic configuration of Cassandra. Cassandra is extremely easy to set up, especially compared to HBase. All you have to do is download, extract, edit a single XML file and run. But let us take it step by step.
You can download Cassandra directly from its (her?) website. At the time of writing, version 0.4.1 was the most recent stable release. Note that you need Java 6 installed to run Cassandra; I assume here that it is properly installed.
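If you are not sure which Java you have, a small script can check it for you. The following is only a sketch: the function names are mine, and it assumes that java is on your PATH and prints the usual version banner (for example java version "1.6.0_17" for Java 6).

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Older releases report "1.6.0_17" for Java 6; newer ones report "11.0.2".
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except OSError:
        # No java executable could be started at all.
        return None
    # The JVM prints its version banner to stderr, so look at both streams.
    return java_major_version(result.stderr + result.stdout)
```

Calling installed_java_major() and comparing the result against 6 tells you whether the requirement is met; a result of None means no usable Java was found.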
After extracting Cassandra to some folder (on my Windows box I placed it directly in D:\cassandra), the only file you need to edit is conf/storage-conf.xml. While Cassandra is engineered to run on a large number of machines in a network, we start it here as a single node with the default parameter set, so most of the settings are OK for now.
If you are not on a Unix-like system, you need to update the folders where Cassandra is supposed to store its data. If you are using Windows (like me), find the following lines in conf/storage-conf.xml and change the paths to something sensible:
<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
<BootstrapFileDirectory>/var/lib/cassandra/bootstrap</BootstrapFileDirectory>
<StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>
For example, these are my settings:
<CommitLogDirectory>D:/cassandra/data/commitlog</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>D:/cassandra/data/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>D:/cassandra/data/callouts</CalloutLocation>
<BootstrapFileDirectory>D:/cassandra/data/bootstrap</BootstrapFileDirectory>
<StagingFileDirectory>D:/cassandra/data/staging</StagingFileDirectory>
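If you would rather not edit the five paths by hand, the change can be scripted. The sketch below uses Python's standard xml.etree.ElementTree; the element names come from the excerpt above, while the function names and the subfolder layout (which mirrors my settings) are just my own choices. One caveat: ElementTree drops XML comments when it parses a file, so keep a backup of the original storage-conf.xml.

```python
import xml.etree.ElementTree as ET

# Element names from the storage-conf.xml excerpt, mapped to the subfolder
# each one should get below the chosen base directory (illustrative layout).
PATH_TAGS = {
    "CommitLogDirectory": "commitlog",
    "DataFileDirectory": "data",
    "CalloutLocation": "callouts",
    "BootstrapFileDirectory": "bootstrap",
    "StagingFileDirectory": "staging",
}


def relocate_storage_dirs(xml_text, base_dir):
    """Return xml_text with every known path element pointed below base_dir."""
    root = ET.fromstring(xml_text)
    base = base_dir.rstrip("/\\")
    for element in root.iter():
        subdir = PATH_TAGS.get(element.tag)
        if subdir is not None:
            element.text = base + "/" + subdir
    return ET.tostring(root, encoding="unicode")


def relocate_config_file(config_path, base_dir):
    """Rewrite the config file in place (hypothetical helper; back up first)."""
    with open(config_path, encoding="utf-8") as handle:
        updated = relocate_storage_dirs(handle.read(), base_dir)
    with open(config_path, "w", encoding="utf-8") as handle:
        handle.write(updated)
```

For my setup the call would be relocate_config_file("conf/storage-conf.xml", "D:/cassandra/data"), run from Cassandra's main directory.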
Let’s take Cassandra for a spin and check if she starts up correctly. Users of Mac OS, Linux, etc. simply change to Cassandra’s bin directory and run ./cassandra. As an aside for the impatient, I start Cassandra with sudo to avoid trouble with Cassandra’s system.log.
Windows users who use the plain command line (meaning not Cygwin), however, cannot start it just like that. cassandra.bat didn’t work for me on my Vista box when executed with bin as the current working directory (probably because the CASSANDRA_HOME environment variable gets set incorrectly in the batch file). But it works perfectly if you call bin\cassandra.bat from Cassandra’s main directory above bin. So if you are on Windows, change to the directory where you extracted Cassandra and execute bin\cassandra.bat.
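To avoid mixing up the working directory, the start-up rules described above can be wrapped in a tiny launcher. This is only a sketch under my own assumptions: the function names are made up, and on Windows it assumes that handing the batch file to cmd /c behaves the same as typing bin\cassandra.bat at the prompt.

```python
import os
import subprocess


def startup_command(cassandra_home, on_windows=None):
    """Return (argv, cwd) for starting Cassandra as described in this post."""
    if on_windows is None:
        on_windows = os.name == "nt"
    home = os.path.abspath(cassandra_home)
    if on_windows:
        # Call bin\cassandra.bat from the main directory, not from bin itself.
        return ["cmd", "/c", "bin\\cassandra.bat"], home
    # On Unix-like systems, change to bin and run ./cassandra.
    return ["./cassandra"], os.path.join(home, "bin")


def start_cassandra(cassandra_home):
    """Start Cassandra with the right working directory (hypothetical helper)."""
    argv, cwd = startup_command(cassandra_home)
    return subprocess.Popen(argv, cwd=cwd)
```

With my layout, start_cassandra("D:/cassandra") runs the batch file with D:\cassandra as the working directory.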
Cassandra’s output on startup will look similar to this (here on Mac OS):
Schabbys-MacBook-Pro:bin johannes$ sudo ./cassandra
Schabbys-MacBook-Pro:bin johannes$ Listening for transport dt_socket at address: 8888
DEBUG - Loading settings from ./../conf/storage-conf.xml
DEBUG - Syncing log with a period of 1000
DEBUG - opening keyspace Keyspace1
DEBUG - adding Super1 as 0
DEBUG - adding Standard2 as 1
DEBUG - adding Standard1 as 2
DEBUG - adding StandardByUUID1 as 3
DEBUG - adding LocationInfo as 4
DEBUG - adding HintsColumnFamily as 5
DEBUG - opening keyspace system
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db: 0 ms.
INFO - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log
DEBUG - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log starting at 117
DEBUG - Reading mutation at 117
DEBUG - replaying mutation for system.L: {ColumnFamily(LocationInfo [Generation,])}
INFO - Flushing Memtable(LocationInfo)@228828460
DEBUG - Submitting LocationInfo for compaction
INFO - Completed flushing Memtable(LocationInfo)@228828460
INFO - Compacting [/Users/johannes/cassandra/data/system/LocationInfo-1-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-2-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-3-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-4-Data.db]
DEBUG - index size for bloom filter calc for file : /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db : 256
DEBUG - index size for bloom filter calc for file : /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db : 512
DEBUG - index size for bloom filter calc for file : /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db : 768
DEBUG - index size for bloom filter calc for file : /Users/johannes/cassandra/data/system/LocationInfo-4-Data.db : 1024
DEBUG - Expected bloom filter size : 1024
INFO - Compacted to /Users/johannes/cassandra/data/system/LocationInfo-5-Data.db. 0/255 bytes for 0/1 keys read/written. Time: 150ms.
DEBUG - collecting Generation:false:4@3
DEBUG - collecting Token:false:16@0
INFO - Saved Token found: 160533723849634883377008460059010504450
DEBUG - Starting to listen on 127.0.0.1:7001
DEBUG - Binding thrift service to localhost:9160
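The last log line says the Thrift service is bound to localhost:9160, so a quick way to confirm that the node is really up is to try a plain TCP connection to that port. The sketch below does only that; it does not speak Thrift, and the function names and retry defaults are my own.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=10, delay=1.0):
    """Poll host:port until it accepts a connection or attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

After starting Cassandra, wait_for_port("localhost", 9160) should return True once the Thrift service is listening.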
I think that’s it. Leave a comment if you run into trouble, or check the nice If Something Goes Wrong page in the Cassandra Wiki.
Tags: Cassandra

C:\cassandra>bin\cassandra.bat
C:\cassandra>t@REM
‘t@REM’ is not recognized as an internal or external command,
operable program or batch file.
Drive already SUBSTed
Starting Cassandra Server
Listening for transport dt_socket at address: 8888
DEBUG – Loading settings from C:\cassandra\conf\storage-conf.xml
DEBUG – Syncing log with a period of 1000
DEBUG – opening keyspace Keyspace1
DEBUG – adding Super1 as 0
DEBUG – adding Standard2 as 1
DEBUG – adding Standard1 as 2
DEBUG – adding StandardByUUID1 as 3
DEBUG – adding LocationInfo as 4
DEBUG – adding HintsColumnFamily as 5
DEBUG – Starting CFS Standard2
DEBUG – Starting CFS Super1
DEBUG – Starting CFS Standard1
DEBUG – Starting CFS StandardByUUID1
DEBUG – opening keyspace system
DEBUG – Starting CFS LocationInfo
DEBUG – INDEX LOAD TIME for C:\cassandra\data\data\system\LocationInfo-1-Data.db: 23 ms.
DEBUG – INDEX LOAD TIME for C:\cassandra\data\data\system\LocationInfo-2-Data.db: 2 ms.
DEBUG – Starting CFS HintsColumnFamily
DEBUG – collecting Generation:false:4@1
DEBUG – collecting Token:false:16@0
INFO – Saved Token found: 84884484173418133679406654443742525516
ERROR – Fatal exception in thread Thread[main,5,main]
java.lang.AssertionError: 0:0:0:0:0:0:0:1
at org.apache.cassandra.net.EndPoint.<init>(EndPoint.java:64)
at org.apache.cassandra.net.EndPoint.<init>(EndPoint.java:49)
at org.apache.cassandra.service.StorageService.start(StorageService.java:275)
at org.apache.cassandra.service.CassandraServer.start(CassandraServer.java:72)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:95)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:167)
INFO – LocationInfo has reached its threshold; switching in a fresh Memtable
INFO – Enqueuing flush of Memtable(LocationInfo)@18129670
INFO – Flushing Memtable(LocationInfo)@18129670
DEBUG – discard completed log segments for CommitLogContext(file=’C:/cassandra/data/commitlog\CommitLog-1262985925433.log’, position=253), column family 4. CFIDs are Keyspace1: TableMetadata(Standard2: 1, Super1: 0, Standard1: 2, StandardByUUID1: 3, }), system: TableMetadata(LocationInfo: 4, HintsColumnFamily: 5, }), }
DEBUG – Marking replay position 253 on commit log C:/cassandra/data/commitlog\CommitLog-1262985925433.log
INFO – Completed flushing C:\cassandra\data\data\system\LocationInfo-3-Data.db
hey,
I am trying to install Cassandra on a Windows machine. I followed your instructions and I am getting this:
“C:\cassandra>bin\cassandra.bat
Drive already SUBSTed
Starting Cassandra Server
The filename, directory name, or volume label syntax is incorrect.”
Can you tell me why I am getting this “filename, directory name, or volume label syntax is incorrect” error?
Thanks,
Aatish
Hey, I got it!
The reason I was getting the error “The filename, directory name, or volume label syntax is incorrect.” was that my JAVA_HOME had an unnecessary ‘;’ (semicolon) in it. When I removed it, the error went away.
Hi, great that you got it sorted out!
Sorry for not answering sooner!
Johannes
Hi, I am getting this error when starting Cassandra on Windows…
Please help.
D:\cassandra>bin\cassandra
Drive already SUBSTed
The system cannot find the drive specified.
Starting Cassandra Server
Listening for transport dt_socket at address: 8888
Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/cassandra/service/CassandraDaemon
nice post! very useful!