Apache Druid: Complete Guide
Learn Druid Architecture, Kafka Ingestion, Schema Evolution, Tuning and Druid Hive Integration with Twitter example
Lectures: 21
Resources: 8
Duration: 2 hours
Lifetime Access
Course Description
What do you learn from this course?
In this course, we cover Apache Druid's key features end to end, along with its integration with Apache Hive. We start by building theoretical knowledge of Druid and its key features.
Next, we move to the practical part, where we install Druid locally and walk through its user portal. To strengthen the setup, we switch Druid's metadata storage to MySQL and its deep storage to S3. After that, we write our own Twitter Producer app, which pulls tweets from Twitter in real time and pushes them to Apache Kafka.
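The producer step can be pictured with a minimal Python sketch. It is not the course's Twitter Producer app: the tweet fields (`id`, `text`, `lang`), the topic name `tweets`, and the injected `send` callable are all hypothetical, chosen so the example runs without a Kafka client or Twitter credentials.

```python
import json
from typing import Callable, Iterable, Tuple


def to_kafka_record(tweet: dict) -> Tuple[bytes, bytes]:
    """Turn one tweet dict into a (key, value) pair of UTF-8 bytes.

    The tweet id is used as the key (an assumption) so that records for
    the same tweet would land in the same partition.
    """
    key = str(tweet["id"]).encode("utf-8")
    value = json.dumps(tweet, ensure_ascii=False, sort_keys=True).encode("utf-8")
    return key, value


def publish(
    tweets: Iterable[dict],
    send: Callable[[str, bytes, bytes], None],
    topic: str = "tweets",  # hypothetical topic name
) -> int:
    """Serialize each tweet and hand it to `send(topic, key, value)`.

    `send` stands in for a real Kafka client call. Returns the number
    of records handed over.
    """
    count = 0
    for tweet in tweets:
        key, value = to_kafka_record(tweet)
        send(topic, key, value)
        count += 1
    return count
```

In a real pipeline, `send` would wrap a Kafka client's produce call, and `tweets` would be an iterator over a live stream rather than an in-memory list.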
We create a Kafka ingestion task in Druid that pulls tweets from Kafka and stores them in Apache Druid. We also learn how to apply transformations, filters, and schema configuration during Kafka ingestion.
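As a rough illustration of what such an ingestion task involves, the sketch below builds a Kafka supervisor spec as a Python dictionary. The datasource, topic, broker address, and column names (`timestamp`, `lang`, `user`, `text`) are assumptions made for this example and are not taken from the course; the `timestamp` column is assumed to hold ISO 8601 or epoch-millisecond values.

```python
import json


def build_kafka_supervisor_spec(
    datasource: str = "tweets",                  # hypothetical datasource name
    topic: str = "tweets",                       # hypothetical Kafka topic
    bootstrap_servers: str = "localhost:9092",   # assumed local Kafka broker
) -> dict:
    """Return a Kafka supervisor spec with schema, transform, and filter."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Schema configuration: which column carries the event time.
                "timestampSpec": {"column": "timestamp", "format": "auto"},
                "dimensionsSpec": {
                    "dimensions": [
                        "lang",
                        "user",
                        {"name": "text_length", "type": "long"},
                    ]
                },
                "transformSpec": {
                    # Transformation: derive a new column at ingestion time.
                    "transforms": [
                        {
                            "type": "expression",
                            "name": "text_length",
                            "expression": 'strlen("text")',
                        }
                    ],
                    # Filter: keep only English-language tweets.
                    "filter": {"type": "selector", "dimension": "lang", "value": "en"},
                },
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_kafka_supervisor_spec(), indent=2))
```

The resulting JSON is the kind of document that is posted to Druid's supervisor endpoint (`/druid/indexer/v1/supervisor`) or pasted into the console's ingestion view; the exact fields the course uses may differ.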
With that practical knowledge in hand, we return to theory and dig deeper into how Druid works internally. We learn how data is distributed across the data nodes and retrieved in real time. Next, we tune our ingestion pipeline for better throughput. Lastly, we explore features such as accessing Druid through JDBC and schema evolution.
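The course accesses Druid through JDBC. As a lighter-weight stand-in, and not the JDBC method itself, the sketch below sends a SQL statement to Druid's HTTP SQL endpoint (`/druid/v2/sql`) using only the Python standard library. The base URL `http://localhost:8888` and the `tweets` datasource are assumptions for a local setup.

```python
import json
import urllib.request


def build_sql_request(sql: str, base_url: str = "http://localhost:8888") -> urllib.request.Request:
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(sql: str, base_url: str = "http://localhost:8888") -> list:
    """Send the query and return the decoded JSON rows.

    Requires a reachable Druid instance at `base_url` (assumed here).
    """
    with urllib.request.urlopen(build_sql_request(sql, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical query against an assumed "tweets" datasource.
    request = build_sql_request('SELECT lang, COUNT(*) AS n FROM "tweets" GROUP BY lang')
    print(request.get_method(), request.full_url)
```

Splitting request construction from sending keeps the example checkable without a running cluster; a JDBC client would issue the same SQL through the Avatica driver instead.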
In the second module, we cover Druid Hive integration. First, we learn what this integration is. Next, we provision a VM on AWS and install Apache Druid on it. After that, we launch a Hive cluster on AWS EMR and configure it so that it can communicate with Druid. Lastly, we run the same Druid queries on Hive and learn how computation is pushed down to Druid for better performance.
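To make the Hive side more concrete, the sketch below generates the kind of HiveQL statements typically used to expose an existing Druid datasource as a Hive table through Hive's Druid storage handler. The broker address, table name, and datasource name are hypothetical, and the course's actual configuration steps may differ.

```python
from typing import List


def hive_druid_statements(
    broker_address: str,     # host:port of the Druid Broker (assumed value)
    hive_table: str,         # hypothetical Hive table name
    druid_datasource: str,   # hypothetical existing Druid datasource
) -> List[str]:
    """Return HiveQL statements that map a Druid datasource into Hive."""
    return [
        # Tell Hive where the Druid Broker lives.
        f"SET hive.druid.broker.address.default={broker_address};",
        # Columns are discovered from Druid, so none are declared here.
        f"CREATE EXTERNAL TABLE {hive_table}\n"
        "STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'\n"
        f'TBLPROPERTIES ("druid.datasource" = "{druid_datasource}");',
        # A simple aggregate that Hive can hand off to Druid.
        f"SELECT COUNT(*) FROM {hive_table};",
    ]


if __name__ == "__main__":
    for statement in hive_druid_statements("10.0.0.5:8082", "druid_tweets", "tweets"):
        print(statement)
        print()
```

Running `EXPLAIN` on the final query in Hive is one way to check whether the aggregation is delegated to Druid rather than executed by Hive itself.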
Overall, this course combines theory and practical sessions. Throughout the course, we use the latest Druid and Hive versions. By the end of the course, you will be able to work confidently with Apache Druid.
Goals
- In-depth knowledge of Druid components and architecture
- Real-time data ingestion from Apache Kafka using the Twitter Producer application
- Tuning Apache Druid for better throughput
- Accessing Apache Druid tables through the Avatica JDBC driver
- Learning Schema Evolution
- Complete Druid Hive Integration with hands-on experience
Prerequisites
- Basics of Apache Kafka and Apache Hive
- Practical experience with MySQL and AWS

Curriculum
Check out the detailed breakdown of what’s inside the course
Introduction
1 Lecture
- Introduction 03:03
Apache Druid
15 Lectures

Druid Hive Integration
5 Lectures

Instructor Details

Ganesh Dhareshwar
Course Certificate
Use your certificate to make a career change or to advance in your current career.
