1.2 Billion Records Per Hour High Performance Kafka and Spark - End to End Data Engineering Project
Author: CodeWithYu
Uploaded: 2024-12-03
Views: 18939
Description:
PART 2: • End to End Monitoring of High Performance ...
Ever wondered how to process more than 1 billion records per hour seamlessly? This video breaks down the architecture and tools that make it happen:
✅ Apache Kafka: The backbone of real-time data streaming.
✅ Apache Spark: Lightning-fast processing for massive data pipelines.
✅ ELK Stack: Gain visibility with Elasticsearch, Logstash, and Kibana.
✅ Grafana & Prometheus: Real-time monitoring and performance insights.
✅ Kafka Schema Registry & Control Center: Streamlined management and schema validation.
🎯 What You'll Learn:
✅ How to design a robust architecture for high-throughput data pipelines.
✅ Insights into Python vs. Java Kafka Producers: Which one performs better?
✅ Real-time logging, monitoring, and debugging strategies.
🔥 Why This Matters: If you're in data engineering or want to level up your skills, this video showcases everything you need to build, monitor, and scale an ultra-high-performance streaming platform.
Timestamps:
0:00 Introduction
2:31 High Level Architecture Whiteboard
12:55 Data Storage Estimation with workings!
29:33 Clean Architecture
30:39 System Architecture
36:27 System Architecture Setup and Coding
58:21 Python Producer 😩
1:29:27 Java Producer (yay! 😁)
1:33:17 300,000 records per second!
1:36:21 Apache Spark Consumer
2:03:50 Spark Job Optimisation and Statistics
2:15:26 Cluster Health issues
2:15:38 Part 1 Outro
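To put the throughput figures in perspective, here is a minimal sketch of the arithmetic that relates a per-second rate to a per-hour total. The 300,000 records per second figure comes from the timestamp list above and the 1.2 billion records per hour figure from the video title; the function names are illustrative and are not taken from the video's source code, and how the video itself connects the two numbers is not stated in this description.

```python
# Back-of-the-envelope throughput conversion (illustrative helper names).
SECONDS_PER_HOUR = 3600


def records_per_hour(records_per_second: int) -> int:
    """Convert a sustained per-second rate into an hourly total."""
    return records_per_second * SECONDS_PER_HOUR


def required_records_per_second(hourly_target: int) -> float:
    """Sustained per-second rate needed to reach an hourly target."""
    return hourly_target / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Rate quoted in the timestamp list: 300,000 records per second.
    print(f"{records_per_hour(300_000):,} records/hour")
    # Target quoted in the video title: 1.2 billion records per hour.
    print(f"{required_records_per_second(1_200_000_000):,.0f} records/second")
```

A sustained 300,000 records per second works out to 1.08 billion records per hour, while reaching 1.2 billion per hour needs roughly 333,000 records per second.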
👀 Don't just watch, build it! 🚧
👍 Like, Comment, & Subscribe for more cutting-edge data engineering content!
Resources:
Full Source Code:
https://buymeacoffee.com/yusuf.ganiyu...
Kafka Documentation: https://kafka.apache.org/documentation/
Apache Spark Documentation: https://spark.apache.org/documentatio...
#ApacheKafka, #ApacheSpark, #DataEngineering, #BigData, #RealTimeProcessing, #ELKStack, #Grafana, #Prometheus, #KafkaStreams, #BigDataAnalytics, #DataPipeline, #StreamingData, #KafkaMonitoring, #SparkStreaming, #DataArchitecture, #HighPerformanceComputing