Cassandra Tutorial: History

Apache Cassandra is an open-source, distributed database system designed for high performance, linear scalability, and fault tolerance. It was initially developed at Facebook, which released it as open source in 2008; it is now an Apache Software Foundation project.

History

Cassandra was initially developed by Avinash Lakshman and Prashant Malik at Facebook to power the Inbox Search feature, which had to handle large volumes of inbound messaging data. They designed it as a distributed system whose capacity scales linearly as nodes are added to a cluster. In 2008, Facebook released Cassandra as an open-source project under the Apache License. It became an Apache Incubator project in 2009 and graduated to a top-level Apache project in 2010.

Syntax

Cassandra has its own query language, CQL (Cassandra Query Language), whose syntax resembles SQL. The basic statements take the following form:

CREATE KEYSPACE keyspace_name
WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};

USE keyspace_name;

CREATE TABLE table_name(
   column1 datatype PRIMARY KEY,
   column2 datatype,
   column3 datatype,
   ...
);

INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);

SELECT * FROM table_name;

Example

Suppose we want to create a keyspace called example with a table called employee that contains data about employees in a company. We can use the following CQL commands to accomplish this:

CREATE KEYSPACE example
   WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};

USE example;

CREATE TABLE employee (
   id int PRIMARY KEY,
   name text,
   age int,
   salary double
);

INSERT INTO employee (id, name, age, salary)
VALUES (1, 'John Doe', 35, 75000.00);

SELECT * FROM employee;

Output

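The SELECT statement in the example above returns the employee data as a table. In the command-line shell cqlsh (the Cassandra Query Language shell), or in a GUI-based query tool, the result looks similar to the following, although the exact formatting of numeric values can vary between versions: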

 id | age | name     | salary
----+-----+----------+---------
  1 |  35 | John Doe | 75000.00

Explanation

The example above demonstrates how to create a keyspace, create a table, insert data into the table, and query the data using CQL commands.

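First, we create a keyspace called example using the CREATE KEYSPACE command. A keyspace is the top-level container for tables, comparable to a database in a relational system. We specify SimpleStrategy as the replication strategy with a replication factor of 3, which means each row is stored on three nodes. Next, we use the USE command to switch to the example keyspace, so that later statements can name tables without a keyspace prefix.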
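
As a minimal sketch of the same step on a single-node test cluster: the keyspace name example_dev and the replication factor of 1 are assumptions chosen for illustration and are not part of the example above. The DESCRIBE command is meant to be run from cqlsh.

```cql
-- Hypothetical keyspace for a single-node test cluster.
-- IF NOT EXISTS makes the statement safe to run more than once.
CREATE KEYSPACE IF NOT EXISTS example_dev
   WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Print the keyspace definition, including its replication settings.
DESCRIBE KEYSPACE example_dev;
```

A replication factor larger than the number of nodes can still be declared, but the extra copies cannot be placed until more nodes join the cluster.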

Then, we create a table called employee with columns for id, name, age, and salary. We declare id as the primary key using the PRIMARY KEY keyword, so each id value identifies exactly one row.
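
The primary key can also be declared in a separate clause at the end of the column list. The sketch below should be equivalent to the employee definition above; IF NOT EXISTS is added here only so that the statement does not fail if the table is already present.

```cql
-- Same employee table, with the primary key declared in its own clause.
CREATE TABLE IF NOT EXISTS employee (
   id int,
   name text,
   age int,
   salary double,
   PRIMARY KEY (id)
);
```

The separate clause is the form to use when the primary key consists of more than one column.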

After creating the table, we use the INSERT INTO command to insert data into the employee table. Finally, we use the SELECT command to query all the data from the employee table.
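
A few follow-up statements illustrate how the primary key is used when reading and writing. This is a sketch that assumes the example keyspace is in use and the employee table already holds the row inserted above; the new age and salary values are made up for illustration.

```cql
-- Read selected columns of a single row by its primary key.
SELECT name, salary FROM employee WHERE id = 1;

-- An INSERT with an existing primary key overwrites the listed columns
-- instead of failing (CQL inserts behave as upserts).
INSERT INTO employee (id, name, age, salary)
VALUES (1, 'John Doe', 36, 78000.00);

-- Change a single column of an existing row.
UPDATE employee SET salary = 80000.00 WHERE id = 1;
```

Filtering on the primary key lets Cassandra locate the row directly, which is why lookups by id are the natural access pattern for this table.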

Use

Cassandra is used by many companies to handle large amounts of data, including user data, social media data, and machine-generated data. Cassandra's scalable architecture makes it useful for big data applications where horizontal scalability is key. It is also used as a primary or backup database for transactional and analytical applications.

Important Points

  • Cassandra is a highly scalable, high-performance, open-source distributed database system
  • Cassandra was initially developed at Facebook for handling large amounts of inbound messaging data
  • CQL (Cassandra Query Language) is used to query the database
  • Cassandra is fault-tolerant and provides linear scalability
  • Because data is replicated across several nodes, Cassandra can tolerate node failures and network issues without losing data

Summary

In this tutorial, we learned about the history of Apache Cassandra and the basics of the CQL query language: how to create a keyspace and a table, insert data into the table, and query the data. We also covered Cassandra's common use cases and key points.
