
Posts

Apache Cassandra Clients Classification

The image below shows the different Cassandra clients, the underlying communication protocol each one uses, and a documentation link for each.

Basic Thrift: https://wiki.apache.org/cassandra/ThriftExamples
Astyanax: https://github.com/Netflix/astyanax
Hector: https://hector-client.github.io/hector/build/html/documentation.html
Cassandra JDBC Driver: https://mvnrepository.com/artifact/org.apache-extras.cassandra-jdbc/cassandra-jdbc/1.2.1
DataStax Java driver example: https://docs.datastax.com/en/developer/java-driver/3.3/manual/
Compatibility chart: http://docs.datastax.com/en/developer/driver-matrix/doc/javaDrivers.html#java-drivers

Cassandra : Sub-Query Implementation and Cached prepared statement

Introduction: While familiarizing myself with Cassandra, I felt that the lack of sub-queries was polluting my application: it has to iterate over the result of a first query and then make another round trip to fetch the actual data. The main reason is the normalized approach I took while designing column families. RDBMS-style table design does not suit non-structured NoSQL data stores. We can keep the reference table contents in a single table as separate columns, but some use cases do not allow us to keep everything in a single column family (especially when we consider the performance degradation caused by compaction as traffic to a single column family increases, since compaction is performed per column family). Also, if we keep the index structure in a separate table instead of using the built-in secondary index provided by Cassandra (a separate wide-row implementation of the index data allows us to perform equality and range queries against the data), two queries have to be
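The two-round-trip pattern described above, combined with a prepared-statement cache, can be sketched roughly as follows. This is a pure-Python simulation under stated assumptions, not the post's implementation and not a real driver API: `FakeSession`, `StatementCache`, the table names `users_by_city` and `users`, and both query strings are hypothetical and exist only for illustration.

```python
class FakeSession:
    """Hypothetical stand-in for a driver session (not a real driver class).

    Each query string maps to a plain Python function that plays the role
    of the server-side statement. prepare() calls are counted so the
    effect of caching is visible.
    """

    def __init__(self, handlers):
        self._handlers = handlers
        self.prepare_calls = 0

    def prepare(self, query):
        # A real driver would make a network round trip here.
        self.prepare_calls += 1
        return self._handlers[query]

    def execute(self, prepared, params):
        return prepared(*params)


class StatementCache:
    """Prepare each distinct query string once, then reuse the result."""

    def __init__(self, session):
        self._session = session
        self._cache = {}

    def get(self, query):
        if query not in self._cache:
            self._cache[query] = self._session.prepare(query)
        return self._cache[query]


# Hypothetical queries: an index table lookup, then a data table lookup.
INDEX_QUERY = "SELECT user_id FROM users_by_city WHERE city = ?"
DATA_QUERY = "SELECT * FROM users WHERE user_id = ?"


def users_in_city(session, cache, city):
    """Emulate a sub-query: first fetch keys from the index table,
    then fetch the actual row for every key returned."""
    user_ids = session.execute(cache.get(INDEX_QUERY), (city,))
    return [session.execute(cache.get(DATA_QUERY), (uid,)) for uid in user_ids]
```

However many times `users_in_city` is called, each of the two statements is prepared only once; only the execute calls are repeated.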

Read In Cassandra : Explained

Coordinator node

The request initially goes to a coordinator node. The coordinator is a node in the Cassandra cluster that acts as a proxy between the client and the cluster for that request. If enough replica nodes are alive to satisfy the consistency level the user set for the request, the request is propagated to the replica nodes. Otherwise, the coordinator node throws an unavailable exception.

Replica nodes

The remaining read steps are the same on all replica nodes. For a single-row request, the replica uses a QueryFilter class to pick the requested data from the Memtable and SSTables. If the row cache is enabled, the replica first looks in memory for the row. The row cache holds the entire row, which is trimmed to what the request needs. On a row cache hit, the replica node responds immediately to the coordinator node; otherwise the SSTables are searched.

SSTable Choosing algorithms

If a ro
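The coordinator check and the row-cache step described in the coordinator and replica sections above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Cassandra's code: the `Replica` class, `UnavailableError`, and the dict-based row cache, memtable, and SSTables are all hypothetical, and merging by letting newer sources overwrite older ones stands in for the real timestamp-based reconciliation.

```python
from dataclasses import dataclass, field


class UnavailableError(Exception):
    """Raised by the coordinator when too few replicas are alive."""


@dataclass
class Replica:
    # Hypothetical in-memory model of a replica node.
    alive: bool = True
    row_cache: dict = field(default_factory=dict)
    memtable: dict = field(default_factory=dict)
    sstables: list = field(default_factory=list)  # oldest first

    def read(self, key):
        # Row cache hit: answer immediately without touching SSTables.
        if key in self.row_cache:
            return self.row_cache[key]
        # Miss: merge SSTables and the memtable, newest data winning.
        row = {}
        for source in self.sstables + [self.memtable]:
            row.update(source.get(key, {}))
        return row


def required_replicas(consistency_level, replication_factor):
    """Number of live replicas needed for a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def coordinate_read(key, replicas, consistency_level):
    """Act as coordinator: check availability, then query the replicas."""
    alive = [r for r in replicas if r.alive]
    needed = required_replicas(consistency_level, len(replicas))
    if len(alive) < needed:
        raise UnavailableError(
            f"{consistency_level} needs {needed} replicas, {len(alive)} alive"
        )
    return [r.read(key) for r in alive[:needed]]
```

With three replicas and one of them down, a QUORUM read in this model still succeeds because two live replicas are enough, while an ALL read raises `UnavailableError` before any replica is contacted.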