

http://kafka.apache.org/documentation.html
================================================


topic - category or feed
producer
consumer
broker ( a server; a group of brokers forms the Kafka cluster )


each topic has a partitioned log

 ordered, immutable sequence of messages that is continually appended to - a commit log
     - each message is assigned a sequential id number called the offset

- data retention is configurable and not tied to consumption ( messages are kept whether or not they have been consumed )
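The partitioned-log idea above can be sketched in Python ( a toy model for illustration, not Kafka's on-disk format ):

```python
# Hypothetical model: a partition as an append-only log where each
# message gets a sequential id ( the offset ).
class PartitionLog:
    def __init__(self):
        self.messages = []  # append-only; real retention trims by time/size

    def append(self, message):
        offset = len(self.messages)   # sequential id number: the offset
        self.messages.append(message)
        return offset

    def read(self, offset):
        # reading does NOT remove messages - each consumer tracks its own offset
        return self.messages[offset]

log = PartitionLog()
log.append("a")   # -> offset 0
log.append("b")   # -> offset 1
```

Note that `read` never deletes: this is why retention can be time/size based rather than consumption based.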

- each partition:
          - leader and followers
          - distributed and replicated
- if a leader fails, a new leader is elected from the followers
- each server can be a leader for some and follower for other partitions
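The failover behavior can be sketched as a toy election from the in-sync replica set ( illustrative only - not Kafka's actual controller logic ):

```python
# Hypothetical sketch: when a partition's leader dies, pick a new
# leader from the remaining "in-sync" replicas ( ISR ).
def elect_leader(failed_leader, isr):
    candidates = [r for r in isr if r != failed_leader]
    if not candidates:
        raise RuntimeError("no in-sync replica available")
    return candidates[0]

isr = ["broker-0", "broker-1", "broker-2"]
new_leader = elect_leader("broker-0", isr)   # broker-0 died
```

This is also why each server can be leader for some partitions and follower for others: every partition runs its own election independently.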


producers select the topic and partition  ( round-robin, etc. )
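Two common selection strategies can be sketched like this ( illustrative - the real Kafka producer ships its own partitioner classes ):

```python
# Sketch of partition selection on the producer side:
#   - keyed messages hash to a stable partition ( preserves per-key order )
#   - keyless messages can just round-robin across partitions
import itertools

def hash_partitioner(key, num_partitions):
    # same key -> same partition within a run
    return hash(key) % num_partitions

def round_robin(num_partitions):
    return itertools.cycle(range(num_partitions))

rr = round_robin(3)
picks = [next(rr) for _ in range(4)]   # cycles 0, 1, 2, 0
```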


consumer groups
    - a hybrid of queue and topic semantics
    - each message goes to one consumer instance within a consumer group
    - like a queue if: all consumer instances share the same group  ( load balanced )
    - like a topic if: each instance has its own group  ( broadcast )
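A toy model of how a group splits partitions among its instances ( illustrative, not the client API ):

```python
# Within one group, partitions are divided among the instances, so each
# message is seen by exactly one instance in that group. A second group
# gets its own independent assignment - topic/broadcast behavior.
def assign_partitions(partitions, instances):
    """Round-robin partition assignment within one consumer group."""
    assignment = {i: [] for i in instances}
    for n, p in enumerate(partitions):
        assignment[instances[n % len(instances)]].append(p)
    return assignment

# queue-like: 2 instances in one group split 4 partitions
group_a = assign_partitions([0, 1, 2, 3], ["c1", "c2"])

# topic-like: another group independently receives all partitions
group_b = assign_partitions([0, 1, 2, 3], ["other"])
```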


- data is delivered to consumers in order within each partition
- with more consumer instances in a group than partitions, the extra instances receive nothing

- for total ordering across a topic: 1 partition ( which implies at most 1 active consumer per group )


use cases:
    Messaging
    Website Activity Tracking
    Metrics
    Log Aggregation
    Stream Processing
    Event Sourcing
    Commit Log




tar -xzf kafka_2.10-0.8.2.0.tgz
cd kafka_2.10-0.8.2.0

bin/zookeeper-server-start.sh config/zookeeper.properties             # quick and dirty zookeeper packaged with kafka
bin/kafka-server-start.sh config/server.properties                    # start kafka

# create topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test 
# list topics
bin/kafka-topics.sh --list --zookeeper localhost:2181


# run producer, send messages
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
# run consumer, dump messages
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning


Cluster Setup:
======================

cp config/server.properties config/server-1.properties 
cp config/server.properties config/server-2.properties



config/server-1.properties           # ids always unique.  port and log change because of shared host.
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1
 
config/server-2.properties
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2


bin/kafka-server-start.sh config/server-1.properties &                 # start more brokers in the cluster
bin/kafka-server-start.sh config/server-2.properties &

# new replicated topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic


bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic    # check new topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test                   # check old topic


leader - the node handling all reads and writes for the partition
replicas - the nodes that replicate the partition's log ( in sync or not )
isr - the set of "in-sync" replicas


bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic                 # produce
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic  # consume
kill -9 7564                                                                                           # kill leader
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic                  # check
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic  # can still consume



NOTE - killing 2 of the 3 brokers breaks the cluster
     - restarting 1 broker ( back to 2 of 3 ) restores service
     - messages produced while the cluster was down are lost