Cassandra Tutorial: History
Apache Cassandra is an open-source distributed database system designed for high performance, linear scalability, and fault tolerance. It was initially developed at Facebook, which released it as open source before it became an Apache project.
History
Cassandra was initially developed by Avinash Lakshman and Prashant Malik at Facebook to power the Inbox Search feature, which had to handle large amounts of messaging data. They designed Cassandra as a distributed system that could scale linearly as nodes are added to a cluster. In 2008, Facebook released Cassandra as an open-source project under the Apache License. It became an Apache Incubator project in 2009 and graduated to a top-level Apache project in 2010.
Syntax
Cassandra has its own query language, CQL (Cassandra Query Language). The basic statements for creating a keyspace and a table, inserting a row, and reading rows back look like this:
CREATE KEYSPACE keyspace_name
WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};
USE keyspace_name;
CREATE TABLE table_name (
    column1 datatype PRIMARY KEY,
    column2 datatype,
    column3 datatype,
    ...
);
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
SELECT * FROM table_name;
Example
Suppose we want to create a keyspace called example with a table called employee that contains data about employees in a company. We can use the following CQL commands to accomplish this:
CREATE KEYSPACE example
WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};
USE example;
CREATE TABLE employee (
    id int PRIMARY KEY,
    name text,
    age int,
    salary double
);
INSERT INTO employee (id, name, age, salary)
VALUES (1, 'John Doe', 35, 75000.00);
SELECT * FROM employee;
Output
When the commands above are run in the Cassandra Query Language shell (cqlsh), the final SELECT prints the stored row as a table. The primary key column comes first, followed by the remaining columns in alphabetical order:
 id | age | name     | salary
----+-----+----------+--------
  1 |  35 | John Doe |  75000
Explanation
The example above demonstrates how to create a keyspace, create a table, insert data into the table, and query the data using CQL commands.
First, we create a keyspace called example using the CREATE KEYSPACE command. We specify that the replication strategy is SimpleStrategy with a replication factor of 3, which means each row is stored on three nodes. SimpleStrategy is intended for single-datacenter or test clusters. Next, we use the USE command to switch to the example keyspace.
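To make the replication factor more concrete, here is a minimal Python sketch of how SimpleStrategy picks replicas: the first replica goes to the node that owns the key's position on the ring, and the remaining replicas go to the next nodes clockwise. This is a toy model, not Cassandra's code. The four-node ring, the node names, the 32-bit token space, and the MD5-based token function are all assumptions made for illustration (real clusters use a partitioner such as Murmur3 and usually many tokens per node).

```python
import bisect
import hashlib

RING_SIZE = 2**32  # toy token space (assumption); Cassandra's real range is far larger


def token(key: str) -> int:
    """Map a partition key to a position on the toy ring (stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def replicas(key: str, ring: list[tuple[int, str]], replication_factor: int) -> list[str]:
    """Return the nodes holding `key`: the owner of its token, then the next nodes clockwise."""
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    # First node whose token is at or after the key's token, wrapping past the end.
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    # A cluster cannot hold more copies than it has nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node cluster with evenly spaced tokens, one token per node.
ring = [(i * RING_SIZE // 4, f"node{i + 1}") for i in range(4)]

# The employee row with id 1, stored with replication_factor 3.
owners = replicas("1", ring, replication_factor=3)
print(owners)
```

In this model, any single node can fail and two copies of every row still remain on other nodes, which is the idea behind choosing a replication factor above 1.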
Then, we create a table called employee with columns for id, name, age, and salary. We specify that the id column is the primary key using the PRIMARY KEY keyword.
After creating the table, we use the INSERT INTO command to insert data into the employee table. Finally, we use the SELECT command to query all the data from the employee table.
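One consequence of the primary key is worth a small illustration: in CQL, an INSERT that reuses an existing primary key does not fail (unless IF NOT EXISTS is used). It overwrites the columns it names and leaves the others as they were. The Python sketch below models that behavior with a plain dictionary keyed by id. It is only a mental model, not how Cassandra stores data, and the second employee and the new salary are made-up values.

```python
# Toy model of the employee table: primary key (id) -> column values.
table: dict[int, dict[str, object]] = {}


def insert(row_id: int, **columns: object) -> None:
    """Write the given columns for row_id, creating the row or overwriting those columns."""
    table.setdefault(row_id, {}).update(columns)


insert(1, name="John Doe", age=35, salary=75000.00)
insert(1, salary=80000.00)  # same primary key: no error, only salary changes
insert(2, name="Jane Roe", age=41, salary=91000.00)  # hypothetical second row

# Rough equivalent of SELECT * FROM employee (row order here is just sorted by id).
for row_id in sorted(table):
    print(row_id, table[row_id])
```

After these three inserts the model holds two rows, and the row with id 1 keeps its original name and age alongside the updated salary.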
Use
Cassandra is used by many companies to handle large amounts of data, including user data, social media data, and machine-generated data. Cassandra's scalable architecture makes it useful for big data applications where horizontal scalability is key. It is also used as a primary or backup database for transactional and analytical applications.
Important Points
- Cassandra is a highly scalable, high-performance, open-source distributed database system
- Cassandra was initially developed at Facebook to handle large amounts of messaging data
- CQL (Cassandra Query Language) is used to query the database
- Cassandra is fault-tolerant and provides linear scalability
- Because data is replicated across nodes, Cassandra can tolerate node failures and network issues without losing data, provided the replication factor is greater than 1
Summary
In this tutorial, we learned about the history of Apache Cassandra and the basics of CQL: how to create a keyspace and a table, insert data into the table, and query the data. We also covered Cassandra's common use cases and key points.