Schabby's Blog
OpenGL, Java, Cassandra and other stuff that totally makes the world go round

This is the second post in my little "Cassandra - Getting Started" series, covering the installation and basic configuration of Cassandra. Cassandra is extremely easy to set up, especially compared to HBase. All you have to do is download, extract, edit a single XML file, and run. But let us take it step by step.

You can download Cassandra directly from its (her?) website. At the time of writing, version 0.4.1 was the most recent stable release. Note that you need Java 6 to run Cassandra; I assume here that it is properly installed.
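If you are not sure which Java version is on your machine, a small check like the following may help. This is only a sketch: the class name JavaVersionCheck and the helper majorVersion are made up for illustration, and the check only tells you whether the Java 6 minimum mentioned above is met, not whether a much newer Java works with this Cassandra release.

```java
public class JavaVersionCheck {
    // Returns the major Java version from a specification version string,
    // e.g. "1.6" gives 6 and "11" gives 11.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Releases up to Java 8 report "1.x"; later releases report only the major number.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        int major = majorVersion(spec);
        System.out.println("Java specification version: " + spec);
        System.out.println(major >= 6
                ? "Minimum met: Java 6 or newer"
                : "Too old: Cassandra needs Java 6");
    }
}
```

Compile it with javac JavaVersionCheck.java and run it with java JavaVersionCheck, using the same Java installation that you plan to start Cassandra with.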

After extracting Cassandra to some folder (on my Windows box I placed it directly in D:\cassandra), the only file you need to edit is conf/storage-conf.xml. While Cassandra is engineered to run on a large number of machines in a network, we start it here as a single node with the default parameter set, so most of the settings are fine for now.

If you are not on a Unix-like system, you need to update the folders where Cassandra is supposed to store its data. If you are using Windows (like me), find the following lines in conf/storage-conf.xml and change the paths to something sensible:

<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
      <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/var/lib/cassandra/callouts</CalloutLocation>
<BootstrapFileDirectory>/var/lib/cassandra/bootstrap</BootstrapFileDirectory>
<StagingFileDirectory>/var/lib/cassandra/staging</StagingFileDirectory>

For example, here are my settings:

<CommitLogDirectory>D:/cassandra/data/commitlog</CommitLogDirectory>
<DataFileDirectories>
       <DataFileDirectory>D:/cassandra/data/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>D:/cassandra/data/callouts</CalloutLocation>
<BootstrapFileDirectory>D:/cassandra/data/bootstrap</BootstrapFileDirectory>
<StagingFileDirectory>D:/cassandra/data/staging</StagingFileDirectory>
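Whether Cassandra creates missing folders by itself may depend on the version, so it does no harm to create them up front. The following is a minimal sketch, assuming the folder layout from my settings above; the class name CreateCassandraDirs and the method createAll are hypothetical helpers and not part of Cassandra.

```java
import java.io.File;

public class CreateCassandraDirs {
    // Sub-folders matching the example settings above (all below the data folder).
    static final String[] SUB_DIRS = {"commitlog", "data", "callouts", "bootstrap", "staging"};

    // Creates every sub-folder below baseDir and returns how many exist afterwards.
    static int createAll(File baseDir) {
        int existing = 0;
        for (String name : SUB_DIRS) {
            File dir = new File(baseDir, name);
            // mkdirs() also creates missing parent folders.
            if (dir.isDirectory() || dir.mkdirs()) {
                existing++;
            }
        }
        return existing;
    }

    public static void main(String[] args) {
        // Default base folder taken from the example settings; pass another path as first argument.
        File base = new File(args.length > 0 ? args[0] : "D:/cassandra/data");
        System.out.println(createAll(base) + " of " + SUB_DIRS.length
                + " folders are in place under " + base);
    }
}
```

If the reported count is lower than 5, the process most likely lacks write permission for the base folder.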

Let's take Cassandra for a spin and check that she starts up correctly. On Mac OS, Linux, and other Unix-like systems, simply change to Cassandra's bin directory and run ./cassandra. As an aside for the impatient: I start Cassandra with sudo to avoid trouble with Cassandra's system.log.

Windows users on the plain command line (that is, not Cygwin) cannot start it just like that. cassandra.bat did not work on my Vista box when executed with bin as the current working directory (probably because the CASSANDRA_HOME environment variable gets set incorrectly in the batch file). It does work perfectly, however, if you call bin\cassandra.bat from Cassandra's main directory above bin. So if you are on Windows, change to the directory where you extracted Cassandra and execute bin\cassandra.bat.

Cassandra's output on startup will look similar to this (here on Mac OS):

Schabbys-MacBook-Pro:bin johannes$ sudo ./cassandra
Schabbys-MacBook-Pro:bin johannes$ Listening for transport dt_socket at address: 8888
DEBUG - Loading settings from ./../conf/storage-conf.xml
DEBUG - Syncing log with a period of 1000
DEBUG - opening keyspace Keyspace1
DEBUG - adding Super1 as 0
DEBUG - adding Standard2 as 1
DEBUG - adding Standard1 as 2
DEBUG - adding StandardByUUID1 as 3
DEBUG - adding LocationInfo as 4
DEBUG - adding HintsColumnFamily as 5
DEBUG - opening keyspace system
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db: 0 ms.
DEBUG - INDEX LOAD TIME for /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db: 0 ms.
INFO - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log
DEBUG - Replaying /Users/johannes/cassandra/commitlog/CommitLog-1257980407451.log starting at 117
DEBUG - Reading mutation at 117
DEBUG - replaying mutation for system.L: {ColumnFamily(LocationInfo [Generation,])}
INFO - Flushing Memtable(LocationInfo)@228828460
DEBUG - Submitting LocationInfo for compaction
INFO - Completed flushing Memtable(LocationInfo)@228828460
INFO - Compacting [/Users/johannes/cassandra/data/system/LocationInfo-1-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-2-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-3-Data.db,/Users/johannes/cassandra/data/system/LocationInfo-4-Data.db]
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-1-Data.db   : 256
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-2-Data.db   : 512
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-3-Data.db   : 768
DEBUG - index size for bloom filter calc for file  : /Users/johannes/cassandra/data/system/LocationInfo-4-Data.db   : 1024
DEBUG - Expected bloom filter size : 1024
INFO - Compacted to /Users/johannes/cassandra/data/system/LocationInfo-5-Data.db.  0/255 bytes for 0/1 keys read/written.  Time: 150ms.
DEBUG - collecting Generation:false:4@3
DEBUG - collecting Token:false:16@0
INFO - Saved Token found: 160533723849634883377008460059010504450
DEBUG - Starting to listen on 127.0.0.1:7001
DEBUG - Binding thrift service to localhost:9160
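The last line of the log says that the Thrift service is bound to localhost:9160. One way to double-check that the node really accepts connections is a plain TCP connect against that port. This is only a sketch with a made-up class name PortCheck; it tests whether the port is open and does not speak Thrift.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if something accepts TCP connections on host:port within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unknown host: treat all as "not listening".
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing sensible to do if closing fails.
            }
        }
    }

    public static void main(String[] args) {
        // 9160 is the Thrift port taken from the startup log above.
        System.out.println("Thrift port 9160 open: " + isListening("localhost", 9160, 2000));
    }
}
```

Run it while Cassandra is up. If it reports false, look at the startup log for exceptions before the "Binding thrift service" line.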

I think that's it. Leave a comment if you run into trouble, or check the nice "If Something Goes Wrong" page in the Cassandra Wiki.


21 Responses

  1. turbay says:

    C:\cassandra>bin\cassandra.bat

    C:\cassandra>t@REM
    't@REM' is not recognized as an internal or external command,
    operable program or batch file.
    Drive already SUBSTed
    Starting Cassandra Server
    Listening for transport dt_socket at address: 8888
    DEBUG - Loading settings from C:\cassandra\conf\storage-conf.xml
    DEBUG - Syncing log with a period of 1000
    DEBUG - opening keyspace Keyspace1
    DEBUG - adding Super1 as 0
    DEBUG - adding Standard2 as 1
    DEBUG - adding Standard1 as 2
    DEBUG - adding StandardByUUID1 as 3
    DEBUG - adding LocationInfo as 4
    DEBUG - adding HintsColumnFamily as 5
    DEBUG - Starting CFS Standard2
    DEBUG - Starting CFS Super1
    DEBUG - Starting CFS Standard1
    DEBUG - Starting CFS StandardByUUID1
    DEBUG - opening keyspace system
    DEBUG - Starting CFS LocationInfo
    DEBUG - INDEX LOAD TIME for C:\cassandra\data\data\system\LocationInfo-1-Data.db: 23 ms.
    DEBUG - INDEX LOAD TIME for C:\cassandra\data\data\system\LocationInfo-2-Data.db: 2 ms.
    DEBUG - Starting CFS HintsColumnFamily
    DEBUG - collecting Generation:false:4@1
    DEBUG - collecting Token:false:16@0
    INFO - Saved Token found: 84884484173418133679406654443742525516
    ERROR - Fatal exception in thread Thread[main,5,main]
    java.lang.AssertionError: 0:0:0:0:0:0:0:1
    at org.apache.cassandra.net.EndPoint.<init>(EndPoint.java:64)
    at org.apache.cassandra.net.EndPoint.<init>(EndPoint.java:49)
    at org.apache.cassandra.service.StorageService.start(StorageService.java:275)
    at org.apache.cassandra.service.CassandraServer.start(CassandraServer.java:72)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:95)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:167)
    INFO - LocationInfo has reached its threshold; switching in a fresh Memtable
    INFO - Enqueuing flush of Memtable(LocationInfo)@18129670
    INFO - Flushing Memtable(LocationInfo)@18129670
    DEBUG - discard completed log segments for CommitLogContext(file='C:/cassandra/data/commitlog\CommitLog-1262985925433.log', position=253), column family 4. CFIDs are Keyspace1: TableMetadata(Standard2: 1, Super1: 0, Standard1: 2, StandardByUUID1: 3, }), system: TableMetadata(LocationInfo: 4, HintsColumnFamily: 5, }), }
    DEBUG - Marking replay position 253 on commit log C:/cassandra/data/commitlog\CommitLog-1262985925433.log
    INFO - Completed flushing C:\cassandra\data\data\system\LocationInfo-3-Data.db

  2. Aatish says:

    hey,
    I am trying to install Cassandra on a Windows machine. I followed your instructions and I am getting this:

    "C:\cassandra>bin\cassandra.bat
    Drive already SUBSTed
    Starting Cassandra Server
    The filename, directory name, or volume label syntax is incorrect."

    Can you tell me why I am getting this "filename, directory name, or volume label syntax is incorrect" error?

    Thanks,
    Aatish

  3. Aatish says:

    Hey, I got it!
    The reason I was getting the error "The filename, directory name, or volume label syntax is incorrect." is that my JAVA_HOME had an unnecessary ';' (semicolon) in it. When I removed it, the error went away.

  4. schabby says:

    Hi, great that you got it sorted out!

    Sorry for not answering sooner!

    Johannes

  5. Prakash says:

    Hi, I am getting this error when starting Cassandra on Windows...

    Please help.

    D:\cassandra>bin\cassandra
    Drive already SUBSTed
    The system cannot find the drive specified.
    Starting Cassandra Server
    Listening for transport dt_socket at address: 8888
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/cassandra/service/CassandraDaemon

  6. goodxp says:

    I would like to share the detailed steps to set up Cassandra on Windows. Check my blog at
    "http://blog.csdn.net/goodxp".
    You can also see there how to set up Ruby 1.9.1 with it.
    It is a dirty fix, but it works.

  7. paul says:

    nice post! very useful!

  8. Andy says:

    Hi. Where is the promised example? ;-)
    Here is nice cassandra client http://github.com/rantav/hector

  9. Rick says:

    Hi, I'm waiting for the Java example too :D
    Can you email me if you publish something about it?

    Is it Thrift? Something like: I describe my "schema" and then run Thrift, which provides the related Java APIs?

    bye

  10. studleylee says:

    Worked like a charm on Vista 32bit. Thank you!
    I just added 2 lines in cassandra.bat to set my env variables:
    set JAVA_HOME=C:\Program Files\Java\jre6
    set CASSANDRA_HOME=c:\sand\cassandra

  11. mallikarjun says:

    Hi,

    I have a requirement to use Cassandra in my application. There is one table with a lot of data, and most of my application uses that table. Because of the amount of data, the performance of the application decreases while that table is in Oracle.

    So I have decided to use Cassandra for that one table and keep all other tables in Oracle. A lot of business logic depends on that table.

    Now my question is: can I use Cassandra for a table that carries a lot of business logic?

    I am unable to implement many of the WHERE clauses against Cassandra.

    Is there any supporting tool for using Cassandra efficiently?

    Please let me know, it is urgent.

    Thanks in advance

    By Mallik

  12. schabby says:

    Hmm, hard to say. Cassandra may not be the silver bullet. Simple reads and writes are generally faster than on any comparable SQL database, but complicated queries need to be rewritten in your application logic, especially if you use higher-level operations such as joins, grouping, sorting, etc. And there is no tool that translates SQL queries into the corresponding application code. So you have to consider for each SQL query whether it is easy to migrate to Cassandra.

  13. TS75 says:

    The best article again! Do you know when you will come up with the third post, providing several hands-on examples for Cassandra with Java? I am really looking forward to it, as I may have to use Cassandra in my project soon.

  14. schabby says:

    Hi! Thanks for your comment! I am afraid I am a bit snowed under with work, so we will have to wait for the third part to get done. I am really sorry, because I was really looking forward to continuing to work on it :((

    Johannes

  15. VIK says:

    I am getting this message:

    The system cannot find the specified path. Any ideas?

  16. Amit says:

    Hi,

    The XML file you mentioned is not in the conf directory of Cassandra, so this post is outdated. Can you please update it with the correct info? Otherwise it is not very useful as of now.

  17. Rani says:

    I am getting this error. How can I solve it?
    INFO 12:40:16,765 JNA link failure, one or more native method will be unavailable.

  18. Rani says:

    I got this error on Windows XP. How can I solve it?
    INFO 12:40:16,765 JNA link failure, one or more native method will be unavailable.

  19. Ugur says:

    Downloading

    Cassandrows (http://kimola.com/products/cassandrows)

    or

    Datastax Community Edition (http://www.datastax.com/products/community)

    is always another good option for Windows developers. Both of them install, configure, and run Apache Cassandra as a Windows service.

    So you will only need to edit cassandra.yaml for specific settings.
