Bullet Cache - The C API tutorial

This post in the Bullet Cache series introduces the primary API to the cache server, implemented in the form of a C library. Using C matters because it allows for maximum portability across different application environments while retaining maximum efficiency and performance. All of the standard benchmarks were done using this exact library without any tweaking. The standard distribution also includes the PHP API, which is an almost exact wrapper around the C API. Both of these APIs are also comprehensively described in the Bullet Cache User Guide.



The C API is designed to be easily used from client applications and to be consistent in its nomenclature, definitions and function prototypes. It supports all of the capabilities of the Bullet Cache server. The C API is also the only official starting point for writing client libraries or wrappers to the Bullet Cache protocol, as the protocol itself is designed to be compact, efficient and machine-dependent (in other words, it's a pure binary protocol). The API is described in the Bullet Cache User Guide, and this blog post supplements the Guide with a short tutorial.

The first step in accessing the Bullet Cache through this library is establishing the connection to the server. Depending on the system setup, this may be through a local Unix socket or via TCP to a remote computer system. The choice between the two is less obvious than it seems since it involves operating system scalability considerations, but as a rule of thumb, the local Unix socket can be orders of magnitude faster than TCP.

In any case, the server connection is encapsulated in a mc_connection structure, which is created by calling either mc_connect_local() for a Unix socket connection or mc_connect_tcp() for a remote TCP connection. The code usually looks something like this:

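#include <stdio.h>      /* printf() */
#include <stdlib.h>     /* free() */

#include <mc_protocol.h>
#include <mc_client.h>

struct mc_connection *conn;
char *err;

/* Pick the transport: local Unix socket or TCP to a remote host */
if (use_unix_socket)
    conn = mc_connect_local(MC_DEFAULT_UNIX_SOCKET, TRUE, &err);
else
    conn = mc_connect_tcp(server_host, MC_DEFAULT_INET_PORT, TRUE, &err);

if (conn == NULL) {
    /* The error string is allocated by the library and freed by the caller */
    printf("Connection failed: %s\n", err);
    free(err);
}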

The returned pointer (conn) will either be a valid mc_connection structure or NULL if an error is encountered. This example also introduces the error reporting convention used by the library, which usually involves two values. The first is returned by the called function; it is most commonly an integer containing one of the MC_RESULT_* constants. The second is the error description string, which contains a human-readable description of what happened. This error string is allocated by the API function itself and should be freed by the caller. The next-to-last argument to the mc_connect_* functions specifies whether the client will also perform a simple handshake operation with the server to test that both speak the same protocol version. This step may be omitted (speeding up the process) if it is certain that both are in sync.
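To make the convention concrete without depending on the library, here is a minimal stand-alone sketch of the same pattern. Everything in it (demo_operation(), demo_call_and_report() and the DEMO_RESULT_* codes) is a hypothetical stand-in, not part of the Bullet Cache API, and setting the error pointer to NULL on success is an assumption of this sketch rather than documented library behaviour.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes standing in for the library's MC_RESULT_* constants */
enum { DEMO_RESULT_OK = 0, DEMO_RESULT_ERROR = 1 };

/*
 * Hypothetical operation that follows the convention: it returns a result
 * code and, on failure, stores a heap-allocated message in *err.
 * The caller owns that message and must free() it.
 */
static int demo_operation(int should_fail, char **err)
{
    *err = NULL;    /* assumption of this sketch: no message on success */
    if (should_fail) {
        const char *msg = "simulated failure";

        *err = malloc(strlen(msg) + 1);
        if (*err != NULL)
            strcpy(*err, msg);
        return DEMO_RESULT_ERROR;
    }
    return DEMO_RESULT_OK;
}

/* Caller-side pattern: check the code, report the message, then free it */
static int demo_call_and_report(int should_fail)
{
    char *err = NULL;
    int res = demo_operation(should_fail, &err);

    if (res != DEMO_RESULT_OK) {
        fprintf(stderr, "operation failed: %s\n",
            err != NULL ? err : "(no message)");
        free(err);
    }
    return res;
}
```

The point of the sketch is ownership: the callee allocates the message, and the caller releases it exactly once after reporting it.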

The next most useful APIs are the ones which put and retrieve a simple record to and from the Bullet Cache server. In an example, they look like this:

#include <err.h>        /* errx() */

int res;

/* Store the record with an expiry time of 120 seconds */
res = mc_put_simple(conn, key_string, key_len, value_string, value_len, 120, &err);
if (res != MC_RESULT_OK)
    errx(1, "Error putting record: %s", err);

/* Read the record back */
res = mc_get_simple(conn, key_string, key_len, &value_string, &value_len, &err);
if (res != MC_RESULT_OK)
    errx(1, "Error getting record: %s", err);

/* Delete the record */
mc_del_simple(conn, key_string, key_len, &err);

Bullet Cache records contain (among other things) a key string, its length, a value string, its length, the record expiry time and record tags (discussed later).

Both the key and the value in the Bullet Cache are completely opaque binary strings. They can contain any data pertinent to the application which uses them, and there are no arbitrary rules such as NUL-termination for strings. If a string needs to be NUL-terminated, it must include the \0 character and its length must account for this additional byte. The string length limits are liberal: the key string may be up to 64 KiB in length and the value string may be up to 2 GiB in length. However, both should be kept short for reasons of simple efficiency. The expiry time is given in seconds (0 means the record never expires), and it is the numeric argument (120) in the mc_put_simple() example call.
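Since the lengths are always explicit, it helps to be deliberate about whether the terminating NUL is counted. The following is a small sketch in plain C with no Bullet Cache calls; the helper names and the sample binary key are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Length to pass when the stored string should include its terminating NUL */
static size_t len_with_nul(const char *s)
{
    return strlen(s) + 1;
}

/* Length to pass when only the visible characters should be stored */
static size_t len_without_nul(const char *s)
{
    return strlen(s);
}

/*
 * A binary key with an embedded zero byte. strlen() would stop at the
 * first zero, so the length has to come from sizeof (or be tracked
 * separately by the application).
 */
static const unsigned char binary_key[] = { 0x01, 0x00, 0x02, 0x03 };
static const size_t binary_key_len = sizeof(binary_key);
```

Storing the NUL costs one byte per string but lets the retrieved value be printed directly with %s, which is why the multiple-record example later in this post passes "rec0" with a length of 5 and "data0" with a length of 6.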

The next most useful APIs are the Multiple record APIs. They are very similar to the Simple record APIs but operate on multiple records in a single operation, i.e. multiple records may be put, retrieved or deleted all at once. This is a good opportunity to introduce the Bullet Cache consistency model.

A real-world constraint is that certain data must be operated on as a group. The records may be related and e.g. need to be put and retrieved together, or there may be a need to operate on batches of records for efficiency reasons. Additionally, there may be a constraint that these operations be "atomic" with regard to bulk processing of the records: that either all records are processed or none are, and that the records will not change arbitrarily during the operation. Bullet Cache supports this, and all multiple record operations can optionally be performed atomically.

Since C is a relatively low-level language which doesn't directly support dynamic data structures, composing the input data to these operations (as well as parsing the results) is a bit more convoluted, as the example shows:

int i;

/* Build a multiple-record structure; the lengths include the trailing NUL */
struct mc_data_entry *records[2];
mc_build_data_entry(&records[0], "rec0", 5, "data0", 6, NULL, 0, 10, &mc_client_seq);
mc_build_data_entry(&records[1], "rec1", 5, "data1", 6, NULL, 0, 10, &mc_client_seq);

/* Execute MPUT */
if (mc_mput(conn, 2, records, 0, &err) != MC_RESULT_OK)
    errx(1, "mc_mput: %s", err);
/* The records were allocated by mc_build_data_entry(), so free them */
for (i = 0; i < 2; i++)
    free(records[i]);

/* Build multiple-key retrieval structures */
uint16_t key_lengths[2];
char *keys[2];
struct mc_multidata_result *mdres;

key_lengths[0] = key_lengths[1] = 5;
keys[0] = "rec0";
keys[1] = "rec1";

/* Execute MGET */
if (mc_mget(conn, 2, key_lengths, keys, 0, &mdres, &err) != MC_RESULT_OK)
    errx(1, "mc_mget: %s", err);
if (mdres->n_records != 2)
    errx(1, "n_records after mc_mget mismatch (%d)", mdres->n_records);

/* Parse the result; %s is safe here because the stored strings include the NUL */
struct mc_data_entry *rp;

for (i = 0; i < mdres->n_records; i++) {
    rp = mdres->records[i];
    printf("key: %s, value: %s\n", mc_data_entry_key(rp), mc_data_entry_value(rp));
}
mc_multidata_result_free(mdres);

/* Execute MDEL - delete the records */
mc_mdel(conn, 2, key_lengths, keys, 0, &i, &err);

The key here is to use the helper functions provided by the API to hide the structure complexity. The mc_build_data_entry() function will compose a record from the given arguments (allocating it on the go), and can be used to create multiple records which will be passed to mc_mput() together.

The mc_mget() API accepts an array of strings and an array of string lengths, which together form the keys of the records to be requested from the server. The function initializes and returns a mc_multidata_result structure (which should be freed by the mc_multidata_result_free() API) containing the retrieved records. This result may or may not contain all requested records, depending on whether they are present on the server, and it can return them in any order. Finally, the mc_data_entry_*() functions can inspect a mc_data_entry structure and return pointers to specific elements of the structure. All of the multiple record functions accept a flags argument (just in front of the result argument) which may contain the MCMD_FLAG_ATOMICALL flag if the operations are to be performed atomically.
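Because the result can be partial and unordered, client code usually has to match returned records back to the keys it asked for. Below is a rough sketch of that matching step. It uses a hypothetical, simplified struct demo_record in place of mc_data_entry, and it assumes the key pointer and key length of each returned record have already been extracted (in real code, via the mc_data_entry_*() accessors or the structure fields described in the User Guide).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of one returned record (not a library type) */
struct demo_record {
    const char *key;
    uint16_t key_len;
};

/*
 * Return the index of the returned record whose key equals the requested
 * key, or -1 if the server did not return it. Keys are compared as opaque
 * binary strings: same length, same bytes.
 */
static int demo_find_record(const struct demo_record *recs, size_t n_recs,
    const char *key, uint16_t key_len)
{
    size_t i;

    for (i = 0; i < n_recs; i++) {
        if (recs[i].key_len == key_len &&
            memcmp(recs[i].key, key, key_len) == 0)
            return (int)i;
    }
    return -1;
}

/* Count how many of the requested keys are absent from the result */
static size_t demo_count_missing(const struct demo_record *recs, size_t n_recs,
    char **keys, const uint16_t *key_lengths, size_t n_keys)
{
    size_t i, missing = 0;

    for (i = 0; i < n_keys; i++) {
        if (demo_find_record(recs, n_recs, keys[i], key_lengths[i]) < 0)
            missing++;
    }
    return missing;
}
```

The linear scan is fine for a handful of keys; for large batches an application would more likely index the returned records in a hash table first.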

These are the most basic functions implemented by the Bullet Cache client API - and the ones most similar to other cache servers out there - but there are whole other sections of the API which are more or less unique to Bullet Cache: the Tag API and the Atomic record API - which require additional discussion of Bullet Cache capabilities and will be described in future posts.

 
