

topic - a category or feed name to which messages are published
broker - a Kafka server ( a Kafka cluster is made up of one or more brokers )

each topic has a partitioned log

 - an ordered, immutable sequence of messages that is continually appended to ( a commit log )
     - each message is assigned a sequential id number called the offset

- data retention is configurable ( by time or size ), not based on consumption
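As a sketch of what that looks like in `config/server.properties` (property names from the standard Kafka broker config; values shown are the usual defaults):

```properties
# keep data for 7 days whether or not any consumer has read it
log.retention.hours=168
# optionally also cap retention by size per partition ( -1 = no size limit )
log.retention.bytes=-1
# roll to a new log segment file once the current one reaches ~1 GB
log.segment.bytes=1073741824
```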

- each partition:
          - leader and followers
          - distributed and replicated
- leaders can be elected
- each server can be a leader for some and follower for other partitions
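The partitioned-log idea above can be sketched as a toy model in Python (this is an illustration, not Kafka's implementation): each partition is an append-only list, and a message's offset is simply its position in that list.

```python
class PartitionedLog:
    """Toy model of a topic: one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1          # the offset: position within the partition

    def read(self, partition, offset):
        # reading never removes data; retention is time/size based instead
        return self.partitions[partition][offset]

log = PartitionedLog(2)
o0 = log.append(0, "a")              # offset 0 in partition 0
o1 = log.append(0, "b")              # offset 1 in partition 0
```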

producers choose which topic and partition each message goes to  ( round-robin, hash of a key, etc. )
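A minimal Python sketch of those two selection strategies (the partition count and hash function here are assumptions for illustration, not Kafka's exact partitioner): round-robin when no key is given, a stable hash of the key otherwise, so all messages with the same key land in the same partition.

```python
import itertools
import zlib

NUM_PARTITIONS = 3                       # assumed partition count for the sketch
_round_robin = itertools.cycle(range(NUM_PARTITIONS))

def pick_partition(key=None):
    """Producer-side partition selection, simplified."""
    if key is None:
        return next(_round_robin)                         # no key: round-robin
    return zlib.crc32(key.encode()) % NUM_PARTITIONS      # key: stable hash

# messages with the same key always map to the same partition
assert pick_partition("user-42") == pick_partition("user-42")
```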

consumer groups 
    - like a hybrid of queue/topic
    - message goes to one consumer instance in a consumer group
    - like a queue if: consumer instances have the same consumer group  ( balanced )
    - like a topic if: each instance has different group

- data is delivered to consumers in order within each partition
- can't usefully have more consumer instances in a group than partitions  ( extras sit idle )

- for total ordering: 1 partition and 1 consumer
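The group semantics above can be sketched in Python (a simplified picture of partition assignment, not Kafka's actual rebalancing protocol): each partition is owned by exactly one consumer in the group, so one group behaves like a queue, while separate groups each see every message, like a topic. With more consumers than partitions, the extras get nothing.

```python
def assign(partitions, consumers):
    """Spread partitions across the consumers of one group, round-robin."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 4 partitions shared by a 2-consumer group: each partition has one owner
print(assign([0, 1, 2, 3], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1, 3]}
```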

use cases:
    Website Activity Tracking
    Log Aggregation
    Stream Processing
    Event Sourcing
    Commit Log

tar -xzf kafka_2.10-
cd kafka_2.10-

bin/zookeeper-server-start.sh config/zookeeper.properties             # quick and dirty zookeeper packaged with kafka
bin/kafka-server-start.sh config/server.properties                    # start kafka

# create topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test 
# list topics
bin/kafka-topics.sh --list --zookeeper localhost:2181

# run producer, send messages
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
# run consumer, dump messages
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

Cluster Setup:

cp config/server.properties config/server-1.properties 
cp config/server.properties config/server-2.properties

config/server-1.properties           # edit each copy: broker ids must always be unique; port and log dir change because the brokers share one host
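Concretely, the edits would look roughly like this (property names from the broker config of the older Kafka these notes target, which still uses `--zookeeper`; newer releases replace `port` with a `listeners` setting):

```properties
# config/server-1.properties -- unique id, plus a unique port and
# log directory since all brokers run on the same machine
broker.id=1
port=9093
log.dirs=/tmp/kafka-logs-1
```

server-2.properties would get `broker.id=2`, another port, and another log dir.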

bin/kafka-server-start.sh config/server-1.properties &                 # start more brokers in the cluster
bin/kafka-server-start.sh config/server-2.properties &

# new replicated topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic    # check new topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test                   # check old topic

leader - the node responsible for all reads and writes for the partition
replicas - the list of nodes the partition is replicated onto
isr - the set of "in-sync" replicas

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic                 # produce
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic  # consume
kill -9 7564                                                                                           # kill leader
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic                  # check
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic  # can still consume

NOTE - killing 2 out of 3 cluster nodes breaks the cluster
     - starting 1 of them back up ( 2 out of 3 again ) gets it working again
     - messages produced while the cluster was down will be lost