Curriculum
1 Introduction to Cassandra
Learning Objective:
Get introduced to Apache Cassandra, its design considerations and components, and its common use cases.
-
Differences between NoSQL and RDBMS
-
Replication in RDBMS
-
Key Challenges with RDBMS
-
Schema
-
Advantages & Limitations
-
Key Characteristics of NoSQL Databases
-
Advantages of Cassandra
-
Where and when to use it?
-
Brewer’s CAP Theorem
-
Cassandra Key Features
-
Distributed and Decentralised
-
Elastic Scalability
-
High Availability and Fault Tolerance
-
Tuneable Consistency
-
Strict Consistency
-
Causal Consistency
-
Weak (Eventual) Consistency
-
Column Orientation
-
Introduction to Cassandra
-
Use Cases for Cassandra
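As a small illustration of the tuneable consistency topic listed above, the following Python sketch computes the quorum size for a given replication factor and checks whether a read/write pair of consistency levels overlaps (R + W > RF), which is the usual condition for a read to see the latest acknowledged write. The function names are hypothetical and chosen for illustration only; this is plain arithmetic, not driver code.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def replica_sets_overlap(read_replicas: int, write_replicas: int,
                         replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM reads with QUORUM writes overlap; ONE with ONE does not.
    print(q, replica_sets_overlap(q, q, rf), replica_sets_overlap(1, 1, rf))
```

With a replication factor of 3, reading and writing at quorum touches two replicas each, so the two sets always share at least one node.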
2 Getting Started with Cassandra
Learning Objective:
Install and configure Cassandra and build your own local, single-node cluster. Learn about the Cassandra Cluster Manager (CCM) and some basic commands of Cassandra’s nodetool.
-
Installation
-
Configuration
-
Starting Cassandra
-
Cassandra Cluster Manager
-
Introduction to the data model
-
Shutting down Cassandra
Hands-on:
-
Installation and configuration
-
Starting up and shutting down Cassandra
3 Cassandra Data Model
Learning Objective:
Learn to run the command-line client interface and connect to a server. Also learn about the relational data model and the design differences between RDBMS and Cassandra.
-
Installation
-
Running the Command-Line Client Interface
-
Basic CLI Commands, Help
-
Connecting to a Server, Describing the Environment
-
Creating a Keyspace and Column Family
-
Writing and Reading Data
-
The Relational Data Model
-
Cluster
-
Keyspaces
-
What Is a Column-Oriented Database?
-
Column Families
-
Column Family Options
-
Columns
-
Wide Rows
-
Skinny Rows
-
Column Sorting
-
Super Columns
-
Composite Keys
-
Design Differences between RDBMS and Cassandra
-
Query Language
-
Referential Integrity
-
Secondary Indexes
-
Sorting, Denormalisation
-
Design Patterns
-
Materialized Views
Hands-on:
-
Run the Command-Line Client Interface. Read and write data.
4 Steps in Configuration
Learning Objective:
Learn to configure a data model.
-
Token calculation
-
Configuration overview
-
nodetool
-
Validators
-
Comparators
-
Expiring column
Hands-on:
-
Configure a data model using token calculation, nodetool, validators, and comparators.
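The token calculation topic in this module can be illustrated with a minimal Python sketch. It assumes the Murmur3Partitioner token range of -2^63 to 2^63 - 1 and one token per node (virtual nodes disabled), and spaces the tokens evenly around the ring. The function name `initial_tokens` is hypothetical; the resulting values are the kind you would assign to each node's `initial_token` setting.

```python
from typing import List


def initial_tokens(node_count: int) -> List[int]:
    """Evenly spaced tokens across the Murmur3 range [-2**63, 2**63 - 1]."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 64          # total number of token values on the ring
    lowest_token = -(2 ** 63)    # smallest Murmur3 token
    return [lowest_token + (i * ring_size) // node_count
            for i in range(node_count)]


if __name__ == "__main__":
    # One token per line, in ring order, for a four-node cluster.
    for token in initial_tokens(4):
        print(token)
```

Integer arithmetic is used throughout because tokens exceed the range that floating-point numbers can represent exactly.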
5 Cassandra Architecture
Learning Objective:
Learn about the concepts that influenced Cassandra’s design and use. Understand Brewer’s CAP theorem; data distribution and partitioning; Cassandra’s read and write paths; how data is stored on disk; the inner workings of components such as the snitch, tombstones, and failure detection; and the delivered security features.
-
Cassandra’s ring architecture
-
Cassandra’s write path
-
Cassandra’s read path
-
On-disk storage
-
Additional components of Cassandra
Hands-on:
-
Problems that Cassandra was designed to solve
Cassandra’s read and write paths
The role that horizontal scaling plays
How data is stored on-disk
How Cassandra handles failure scenarios
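To make the ring architecture and data-distribution topics above more concrete, here is a deliberately simplified Python sketch of replica placement on a token ring. It assumes one token per node and a simple strategy that walks clockwise from the owning node and ignores racks and data centers. The names `toy_token` and `replicas_for` are hypothetical, and MD5 is only a stand-in hash: Cassandra's default partitioner uses Murmur3.

```python
import bisect
import hashlib
from typing import Dict, List


def toy_token(key: str) -> int:
    """Map a partition key to a signed 64-bit token (MD5 stand-in, not Murmur3)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def replicas_for(key_token: int, ring: Dict[int, str],
                 replication_factor: int) -> List[str]:
    """Return the nodes holding a key: the owner, then the next nodes clockwise."""
    tokens = sorted(ring)
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the lowest token at the end of the ring.
    start = bisect.bisect_left(tokens, key_token) % len(tokens)
    replicas: List[str] = []
    for step in range(len(tokens)):
        node = ring[tokens[(start + step) % len(tokens)]]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


if __name__ == "__main__":
    # Hypothetical three-node ring with made-up tokens.
    ring = {-(2 ** 62): "node-a", 0: "node-b", 2 ** 62: "node-c"}
    print(replicas_for(toy_token("user:42"), ring, 2))
```

Because placement depends only on the key's token and the ring layout, any node can work out where a key lives without consulting a central coordinator.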
6 Cassandra Query Language (CQL)
Learning Objective:
Learn about CQL: its syntax, usage, and evolution as a language, and how some of its capabilities compare to the well-known SQL of the relational database world.
-
Overview of Cassandra Data Modeling
-
cqlsh
-
Getting started with CQL
Hands-on:
-
Build primary keys that facilitate high-performing data models at scale
Use CQL syntax and solve different types of problems using it
7 Configuring a Cluster
Learning Objective:
Learn to start the cluster, examine its performance, make an adjustment, and test.
-
Evaluating instance requirements
-
Operating system optimization
-
Configuring the JVM
-
Configuring Cassandra
Hands-on:
-
Sizing hardware and computer resources for Cassandra deployments
Operating system optimizations
Configuring the JVM
Configuring Cassandra
8 Performance Tuning
Learning Objective:
Learn about Cassandra-Stress and how to establish a performance baseline for a specific data model. Evaluate factors that can influence write performance. Understand read performance, and the different configuration properties that can help Apache Cassandra perform well during read-heavy and mixed workloads.
-
Cassandra Stress
-
Write performance
-
Read performance
-
Other performance considerations
Hands-on:
-
Using the Cassandra-Stress tool to discover opportunities for improvement
Looking into situations to apply different table-compaction strategies
Examining Apache Cassandra’s cache and compression options
Improving upon the efficiency of the JVM
Optimizing network settings and configuration to avoid performance bottlenecks.
9 Managing a Cluster
Learning Objective:
Learn to scale your cluster horizontally, as well as to remove and replace failed nodes.
-
Add/Remove Nodes
-
Scaling Up
-
Scaling Down
-
Backing up and restoring data
-
Maintenance
Hands-on:
-
Adding and removing nodes
Working with logical data centers
Backups
Techniques for ensuring data consistency.
10 Monitoring
Learning Objective:
Learn about the wide variety of options available for monitoring and logging for Apache Cassandra, which will help in identifying issues proactively.
-
JMX interface
-
nodetool utility
-
Metric stack
-
Log stack
-
Troubleshooting
Hands-on:
Understand different monitoring and logging tools, and how they provide more insight for problem solving on your cluster.
Make decisions using reliable out-of-the-box applications from the open source community, including installing, configuring, analyzing, and setting up alerting.
11 Application Development
Learning Objective:
Learn to identify the correct use cases and select the right database. Discover the DataStax Java driver, its behaviors and configurations, and how it interacts with Apache Cassandra.
-
Common mistakes made at the application and data model levels
-
Driver selection
-
Appropriate connection properties
-
Handling simple and complex result sets in Java
-
Loading data without overwhelming your nodes
Hands-on:
-
Select Driver
-
Appropriate connection properties
-
Handling simple and complex result sets in Java
-
Loading data without overwhelming your nodes.
12 Integrating Cassandra with Apache Spark
Learning Objective:
Learn about the Spark architecture. Spark stands out among the available tools for its ease of installation, its large community, and its support for Hadoop-backed data warehousing. Get to know the different ways of installing it, including a custom all-in-one Docker image that bundles Apache Cassandra, a monitoring stack, and Spark with PySpark, SparkR, and Jupyter, along with their dependencies.
-
Spark (architecture, installation, and configuration)
-
PySpark
-
SparkR
-
Read, transform, and write
-
The Jupyter web interface
Hands-on:
-
Read, transform, and write
-
Work with Jupyter web interface.