Hector (API)


Hector is a high-level client API for Apache Cassandra. Named after Hector, a warrior of Troy in Greek mythology, it is a substitute for the Cassandra Java Client and encapsulates the underlying Thrift interface. It is also available through Maven repositories.

History

Because Cassandra shipped with the low-level Thrift protocol, there was room to develop a better interface for application developers. Hector was developed by Ran Tavory as a high-level interface that covers the shortcomings of Thrift. It is released under the MIT License, which permits anyone to use, modify, split and change the design.

Features

The high-level features of Hector are:
  • A high-level, object-oriented interface to Cassandra: it is mainly inspired by the Cassandra-java-client. The API is defined in the Keyspace interface.
  • Connection pooling: in high-scale applications, DAOs typically perform a large number of reads and writes. Opening a new connection for each request is too expensive, and a client that operates fast enough may easily run out of available sockets. Hector provides connection pooling and a framework that manages the details.
  • Failover support: Cassandra is a distributed data store in which hosts may go down, so Hector has its own failover policy, with the following types:
    Type                              Comment
    FAIL_FAST                         If an error occurs, it fails
    ON_FAIL_TRY_ONE_NEXT_AVAILABLE    Tries one more host before giving up
    ON_FAIL_TRY_ALL_AVAILABLE         Tries all available hosts before giving up

  • JMX support: Hector exposes many important runtime metrics through JMX, such as the number of available connections, the number of idle connections, and error statistics.
  • Load balancing: simple load balancing is available in newer versions.
  • Command pattern: Hector supports the command design pattern, which lets clients concentrate on their business logic while Hector takes care of the required plumbing.
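
The failover types and the command pattern from the list above can be illustrated together. The following is a minimal sketch, not Hector's own code: FailoverPolicySketch, HostCommand and FailoverExecutor are hypothetical names, and the retry counts (0, 1, and "as many as there are hosts") are an assumption read off the comments in the table.

```java
import java.util.List;

/** Hypothetical stand-in for the failover types in the table; not Hector's real class. */
enum FailoverPolicySketch {
    FAIL_FAST(0),                                 // no retry
    ON_FAIL_TRY_ONE_NEXT_AVAILABLE(1),            // one more host
    ON_FAIL_TRY_ALL_AVAILABLE(Integer.MAX_VALUE); // every remaining host

    final int maxRetries; // extra hosts to try after the first failure (assumed semantics)

    FailoverPolicySketch(int maxRetries) {
        this.maxRetries = maxRetries;
    }
}

/** A command: the caller's business logic, run against whichever host the executor picks. */
interface HostCommand<T> {
    T execute(String host) throws Exception;
}

/** Runs a command against the hosts in order, retrying as far as the policy allows. */
final class FailoverExecutor {
    private final List<String> hosts;
    private final FailoverPolicySketch policy;

    FailoverExecutor(List<String> hosts, FailoverPolicySketch policy) {
        this.hosts = hosts;
        this.policy = policy;
    }

    <T> T run(HostCommand<T> command) throws Exception {
        // Total attempts = 1 initial try + allowed retries, capped by the host count.
        int attempts = (int) Math.min((long) hosts.size(), 1L + policy.maxRetries);
        Exception last = new IllegalStateException("no hosts configured");
        for (int i = 0; i < attempts; i++) {
            try {
                return command.execute(hosts.get(i));
            } catch (Exception e) {
                last = e; // remember the failure and move on to the next host
            }
        }
        throw last; // every permitted attempt failed
    }
}
```

A caller would write only the command, for example new FailoverExecutor(hosts, FailoverPolicySketch.ON_FAIL_TRY_ALL_AVAILABLE).run(host -> readFrom(host)), where readFrom is a placeholder for the application's own logic. Retrying and host selection stay in the executor.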

Availability metrics

Hector exposes availability counters and statistics through JMX.
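
To show what exposing a counter through JMX involves, the sketch below registers two gauges as a standard MBean using only the JDK's javax.management API. It is an illustration under assumptions: the class names, the attribute names NumActive and NumIdle, and the ObjectName are invented here and are not Hector's actual MBean names.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxCountersSketch {

    /** Standard MBean interface: JMX derives the attributes NumActive and NumIdle from the getters. */
    public interface PoolCountersMBean {
        int getNumActive();
        int getNumIdle();
    }

    /** The implementing class must be named like the interface minus the "MBean" suffix. */
    public static final class PoolCounters implements PoolCountersMBean {
        final AtomicInteger active = new AtomicInteger();
        final AtomicInteger idle = new AtomicInteger();

        @Override
        public int getNumActive() {
            return active.get();
        }

        @Override
        public int getNumIdle() {
            return idle.get();
        }
    }

    /** Registers the counters under a made-up ObjectName and returns that name. */
    static ObjectName register(PoolCounters counters) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.hector.sketch:type=PoolCounters");
        if (server.isRegistered(name)) {
            server.unregisterMBean(name); // allow re-registration in the same JVM
        }
        server.registerMBean(counters, name);
        return name;
    }
}
```

Once registered, the values can be read by any JMX client (for example JConsole) or programmatically with MBeanServer.getAttribute(name, "NumActive").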

Load balancing

Hector provides two load balancing policies that implement the LoadBalancingPolicy interface. The default, RoundRobinBalancingPolicy, is a simple round-robin distribution algorithm. The LeastActiveBalancingPolicy routes requests to the pools with the fewest active connections, ensuring a good spread of utilisation across the cluster.
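
The difference between the two policies can be sketched in a few lines. The code below is an illustration only: HostPool, BalancingPolicySketch and its pick method are hypothetical and do not reproduce the signature of Hector's LoadBalancingPolicy interface.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool handle: a host name and a count of active connections. */
final class HostPool {
    final String host;
    final AtomicInteger active = new AtomicInteger();

    HostPool(String host) {
        this.host = host;
    }
}

/** Stand-in for the policy interface named in the text; the method name is assumed. */
interface BalancingPolicySketch {
    HostPool pick(List<HostPool> pools);
}

/** Hands out pools in a fixed rotation, regardless of their load. */
final class RoundRobinSketch implements BalancingPolicySketch {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public HostPool pick(List<HostPool> pools) {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), pools.size());
        return pools.get(index);
    }
}

/** Picks the pool with the fewest active connections; ties go to the earliest pool. */
final class LeastActiveSketch implements BalancingPolicySketch {
    @Override
    public HostPool pick(List<HostPool> pools) {
        HostPool best = pools.get(0);
        for (HostPool candidate : pools) {
            if (candidate.active.get() < best.active.get()) {
                best = candidate;
            }
        }
        return best;
    }
}
```

Round-robin needs no knowledge of pool state, which makes it a cheap default. The least-active variant reads a counter per pool on every request, in exchange for steering traffic away from busy hosts.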

Pooling

The ExhaustedPolicy determines how the underlying client connection pools are controlled. Currently, three options are available:
    Type                    Comment
    WHEN_EXHAUSTED_FAIL     Fails acquisition when no more clients are available
    WHEN_EXHAUSTED_GROW     The pool is automatically increased to react to load increases
    WHEN_EXHAUSTED_BLOCK    Blocks on acquisition until a client becomes available
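
The three options in the table can be modelled with a counting semaphore. The following is a simplified sketch under assumptions: ExhaustedPolicySketch and ClientPoolSketch are hypothetical names, the pool tracks slots rather than real connections, and growth is modelled as one extra slot per exhausted acquisition, which may differ from how Hector sizes its pools.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical mirror of the three options in the table. */
enum ExhaustedPolicySketch {
    WHEN_EXHAUSTED_FAIL,
    WHEN_EXHAUSTED_GROW,
    WHEN_EXHAUSTED_BLOCK
}

/** Tracks client slots only; a real pool would also hold the connections themselves. */
final class ClientPoolSketch {
    private final Semaphore permits;
    private final ExhaustedPolicySketch policy;
    private final AtomicInteger borrowed = new AtomicInteger();

    ClientPoolSketch(int maxActive, ExhaustedPolicySketch policy) {
        this.permits = new Semaphore(maxActive);
        this.policy = policy;
    }

    /** Borrows one client slot, applying the exhausted policy when none is free. */
    void acquire() throws InterruptedException {
        if (!permits.tryAcquire()) {
            switch (policy) {
                case WHEN_EXHAUSTED_FAIL:
                    throw new NoSuchElementException("pool exhausted");
                case WHEN_EXHAUSTED_GROW:
                    // Proceed without a permit. The matching release() later adds
                    // a permit, so the pool's capacity has grown by one slot.
                    break;
                case WHEN_EXHAUSTED_BLOCK:
                    permits.acquire(); // wait until another caller releases a slot
                    break;
            }
        }
        borrowed.incrementAndGet();
    }

    /** Returns a previously borrowed slot to the pool. */
    void release() {
        borrowed.decrementAndGet();
        permits.release();
    }

    int borrowed() {
        return borrowed.get();
    }
}
```

The trade-off between the options is visible in acquire: failing surfaces overload to the caller immediately, growing trades sockets and memory for throughput, and blocking applies back-pressure at the cost of latency.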

Code examples

As an example, the method signatures of a simple distributed hashtable implemented over Cassandra are listed below.

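/**
 * Insert a new value keyed by key.
 * @param key Key for the value (a String key is assumed here)
 * @param value the String value to insert
 */
public void insert(String key, String value) throws Exception;

/**
 * Get a string value.
 * @param key Key to look up (a String key is assumed here)
 * @return The string value; null if no value exists for the given key.
 */
public String get(String key) throws Exception;

/**
 * Delete a key from Cassandra.
 * @param key Key to delete (a String key is assumed here)
 */
public void delete(String key) throws Exception;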