Apache Kafka

Last updated 09 Mar 11:45

TL;DR

Apache Kafka is an open-source event streaming platform that was originally developed at LinkedIn and open-sourced in 2011. It initially served as a message queue.

What Is Apache Kafka

Apache Kafka is used as a highly available message queue. It receives messages from some services in the environment and makes them available to others.

Kafka is commonly deployed as a cluster of three or more brokers (nodes) so that replicas (backup copies) of the data can be kept on other brokers.
Kafka receives messages from producers and provides them to consumers. Each message is written to a named topic, which acts as a category for messages.
A message can be text, a number, or an object, depending on the implementation.
Producers write messages to topics and consumers read messages from topics. Kafka retains messages for a configured retention period, and consumers are responsible for tracking
their own position in each partition. Each topic is divided into a number of partitions, each of which holds messages in an ordered, immutable sequence.
A partition is kept separate from the other partitions of the topic and lets users divide the data into logical sections. Each message in a partition is identified by its offset.
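
The relationship between topics, partitions, and offsets can be sketched with a small in-memory model. This is only an illustration, not how Kafka is implemented: the `Topic` and `Consumer` classes are hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keyed messages.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy model of a topic: a named set of append-only partitions."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Messages with the same key always land in the same partition.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[partition]
        log.append(value)
        # The offset is simply the message's position in its partition.
        return partition, len(log) - 1

    def read(self, partition, offset, max_messages=10):
        # Reading never removes anything; the log itself is unchanged.
        return self.partitions[partition][offset:offset + max_messages]


class Consumer:
    """Toy consumer that tracks its own next offset per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)  # partition -> next offset to read

    def poll(self, partition, max_messages=10):
        start = self.offsets[partition]
        batch = self.topic.read(partition, start, max_messages)
        self.offsets[partition] = start + len(batch)
        return batch
```

Because the consumer, not the topic, remembers the offset, several independent consumers can read the same partition at their own pace, and a consumer can re-read older messages by resetting its offset.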

Apache Kafka
Source: Apache Kafka

Kafka uses ZooKeeper as a centralized service for maintaining configuration information,
naming, providing distributed synchronization, and providing group services. When new brokers are added to the cluster, they register with ZooKeeper, and the cluster can then start placing topics and partitions on them.

Why You Might Want to Implement Apache Kafka

Kafka helps you move large amounts of data reliably and is a very flexible tool for communication between services. It scales easily by adding brokers and partitions, and within a consumer group each message is delivered to only one consumer.
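
Messages are not read twice within a group because each partition is owned by exactly one consumer of that group at a time. The sketch below illustrates the idea with a simple round-robin assignment. It is a simplified illustration: the function `assign_partitions` is hypothetical, and real Kafka clients ship several assignment strategies with more involved logic.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer of the group (round-robin)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by two consumers of the same group.
print(assign_partitions(range(4), ["consumer-a", "consumer-b"]))
```

Note that this only describes who reads which partition. A consumer that fails before recording its offset may still see a message again after a restart, so consumers are usually written to tolerate occasional duplicates.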

Advantages of Kafka:

  • High throughput
  • Fault tolerance
  • Durability
  • High concurrency
  • Real-time handling
  • Scalability
  • Low latency
  • Persistent by default

Problems Apache Kafka Helps to Solve

Microservices architecture without Kafka

Kafka Microservices Mashup
Source: Confluent: Apache Kafka vs. Enterprise Service Bus (ESB) – Friends, Enemies or Frenemies?

Microservices architecture with Kafka

Kafka Microservices
Source: Confluent: Apache Kafka vs. Enterprise Service Bus (ESB) – Friends, Enemies or Frenemies?

How to Implement Apache Kafka

It's necessary to have an Apache Kafka cluster deployed, including a ZooKeeper cluster to manage the Kafka nodes. Client libraries are available for many programming languages, which makes it easy to connect to Kafka.
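
As a minimal sketch of what connecting from a service might look like, the example below uses the kafka-python client library (one of several available Python clients). The broker address `localhost:9092`, the JSON encoding, and the helper names `publish` and `consume` are assumptions chosen for illustration; adapt them to your own cluster and message format.

```python
import json


def encode(event):
    """Serialize an event (any JSON-compatible object) to bytes."""
    return json.dumps(event).encode("utf-8")


def decode(raw):
    """Deserialize bytes received from Kafka back into an object."""
    return json.loads(raw.decode("utf-8"))


def publish(topic, key, event, bootstrap_servers="localhost:9092"):
    """Send one event to a topic. The broker address is an assumption."""
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        key_serializer=str.encode,
        value_serializer=encode,
    )
    producer.send(topic, key=key, value=event)
    producer.flush()  # block until the message has actually been sent
    producer.close()


def consume(topic, group_id, bootstrap_servers="localhost:9092"):
    """Yield (partition, offset, event) for each message in the topic."""
    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        group_id=group_id,
        auto_offset_reset="earliest",  # start from the oldest retained message
        value_deserializer=decode,
    )
    for message in consumer:
        yield message.partition, message.offset, message.value
```

With a broker running, `publish("orders", "customer-1", {"status": "created"})` would write one event, and iterating over `consume("orders", "billing")` would read events as a member of the hypothetical `billing` consumer group.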

Common Pitfalls of Apache Kafka

  • Keeping too much data
  • Old data in topics not being deleted
  • Not balancing topics
  • Not accounting for long-term storage
  • No disaster recovery
  • No API enforcement

Resources for Apache Kafka