In this lecture we will look at the types of stream groupings. One of the most important things you need to do when designing a topology is to define how data is exchanged between the components, that is, how streams are consumed by the bolts. A stream grouping specifies which streams are consumed by each bolt and how those streams will be consumed. In other words, a stream grouping tells a topology how to send tuples between two components.

Remember, spouts and bolts execute in parallel as many tasks across the cluster. If you look at how a topology executes at the task level, it looks something like what you can see in the diagrams here. So we have different types of groupings.

The first one is shuffle grouping. Shuffle grouping is the most commonly used grouping. It distributes tuples uniformly at random across the tasks, so that each task processes an equal share of the tuples. This grouping is ideal when you want to distribute your processing load uniformly across the tasks and there is no requirement for any data-driven partitioning.

The next one is all grouping. All grouping is a special grouping that does not partition the tuples but sends a copy of each tuple to all instances of the receiving bolt. This kind of grouping is used to send signals to bolts, for example if you need to refresh a cache. Another example: if you are doing some kind of filtering on the streams, then you have to pass the filter parameters to all the bolts. This can be achieved by sending those parameters over a stream that all the bolts subscribe to.

The next one is fields grouping. Fields grouping allows you to control how tuples are sent to bolts based on one or more fields of the tuple. It guarantees that a given set of values for a combination of fields is always sent to the same bolt task. For example, if you want all the tweets from a particular user to go to a single task, then you can partition the tweet stream using fields grouping on the user name field, so that all messages from that user will go to one bolt task. In such scenarios fields grouping is used.

The next one is global grouping. Global grouping sends the tuples generated by all instances of the source to a single target instance, specifically the task with the lowest ID. So all the messages will be sent to the task with the lowest ID. A general use case for this type is when there needs to be a reduce phase in your topology, where you want to combine results from previous steps of the topology in a single bolt. It is like collecting data from all the bolts: if we want to sum it up, in such scenarios we will use global grouping.
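To make the difference between the four groupings concrete, here is a minimal sketch in plain Java that models which task (or tasks) would receive a tuple under each grouping. It is only a simulation for illustration, not Storm's own implementation: the class name `GroupingSketch`, the task IDs, and the hash-modulo choice for fields grouping are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Plain-Java model of the four groupings described above; not Storm's own code.
class GroupingSketch {

    // Shuffle grouping: choose one target task at random.
    static int shuffle(List<Integer> taskIds, Random rng) {
        return taskIds.get(rng.nextInt(taskIds.size()));
    }

    // All grouping: every task of the receiving bolt gets a copy of the tuple.
    static List<Integer> all(List<Integer> taskIds) {
        return taskIds;
    }

    // Fields grouping: hash the chosen field values so that equal values
    // always map to the same task. (Hash-modulo is an assumption of this sketch.)
    static int fields(List<Integer> taskIds, Object... fieldValues) {
        int index = Math.floorMod(Arrays.hashCode(fieldValues), taskIds.size());
        return taskIds.get(index);
    }

    // Global grouping: everything goes to the task with the lowest ID.
    static int global(List<Integer> taskIds) {
        return Collections.min(taskIds);
    }

    public static void main(String[] args) {
        List<Integer> tasks = List.of(4, 5, 6); // hypothetical task IDs of one bolt
        Random rng = new Random();
        System.out.println("shuffle -> task " + shuffle(tasks, rng));
        System.out.println("all     -> tasks " + all(tasks));
        System.out.println("fields  -> task " + fields(tasks, "alice"));
        System.out.println("global  -> task " + global(tasks));
    }
}
```

In Storm itself you do not write this routing logic; you pick a grouping when wiring a bolt into the topology, using the `shuffleGrouping`, `allGrouping`, `fieldsGrouping` and `globalGrouping` methods on the declarer returned by `TopologyBuilder.setBolt`.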
