Distributed Event Processing Rule Engine with Storm, Spring and Groovy

This post is about how to scale a business rule engine to Big Data volumes, processing a huge number of events in a distributed, scalable fashion. But first, let's understand what a business rule is, what a rule engine is, and when we want to use one.
A business rule is a declarative statement that is managed separately from the code. It allows business users to change rules according to business needs without relying on the engineering team to code each rule. A rule-based system usually consists of facts, a rule engine, and (of course) rules. The facts are merely data about the world: a user profile, user preferences, and a history of events such as pages visited on the site are all facts. The rules are conditional statements that tell us what to do when the data satisfies certain conditions; in other words, they’re equivalent to if-then programming clauses. The rule engine is responsible for executing the rules against the facts.
In a nutshell, rules can be viewed as

IF
CONDITIONS
THEN 
ACTION 

Example:
IF FlightPlan.distance_to_destination > 10 miles
AND FlightPlan.crewMembers < 3
AND Airport.name = JFK
THEN FlightPlan.approval = required
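
The IF/THEN shape above can be sketched in plain Java as a condition paired with an action. The FlightPlan, Airport and FlightRule classes and their fields are hypothetical names chosen to mirror the example; this is a minimal illustration, not how any particular rule engine represents rules.

```java
import java.util.function.BiConsumer;
import java.util.function.BiPredicate;

// Hypothetical fact types mirroring the example rule above.
class Airport {
    final String name;
    Airport(String name) { this.name = name; }
}

class FlightPlan {
    final double distanceToDestinationMiles;
    final int crewMembers;
    boolean approvalRequired = false;

    FlightPlan(double distanceToDestinationMiles, int crewMembers) {
        this.distanceToDestinationMiles = distanceToDestinationMiles;
        this.crewMembers = crewMembers;
    }
}

// A rule is a condition over the facts plus an action to run when it holds.
class FlightRule {
    private final BiPredicate<FlightPlan, Airport> condition;
    private final BiConsumer<FlightPlan, Airport> action;

    FlightRule(BiPredicate<FlightPlan, Airport> condition,
               BiConsumer<FlightPlan, Airport> action) {
        this.condition = condition;
        this.action = action;
    }

    // Runs the action if the condition holds; returns true if the rule fired.
    boolean apply(FlightPlan plan, Airport airport) {
        if (condition.test(plan, airport)) {
            action.accept(plan, airport);
            return true;
        }
        return false;
    }

    // IF distance > 10 miles AND crew < 3 AND airport = JFK THEN approval required.
    static FlightRule approvalRule() {
        return new FlightRule(
            (plan, airport) -> plan.distanceToDestinationMiles > 10
                    && plan.crewMembers < 3
                    && "JFK".equals(airport.name),
            (plan, airport) -> plan.approvalRequired = true);
    }
}
```

Keeping the condition and the action as separate values is what lets a rule engine evaluate many rules against the same facts without knowing what each rule does.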

In our systems, rules are stored in a central repository that is accessible to the rule engine. Rule authoring usually consists of the following steps:

  1. The IT staff develops a vocabulary of facts that provides basic building blocks to the business. Use of a DSL (domain-specific language) is highly encouraged in this step. Some rule engines provide elaborate rule authoring modules.
  2. Equipped with this vocabulary, business analysts can reference facts and actions and leverage their domain-specific knowledge for expressing the business rules.
  3. Rules are deployed to the runtime system, either with or without the involvement of IT staff.

When to Use a Rule Engine
The following criteria should be considered when deciding whether to use a rule engine:

  • Declarative– Rules are well-suited for expressing business logic in a human understandable language or graphical notation. Thus, it is easy for business analysts to capture the logic underlying the business’ operation. Changes can be implemented by non-IT staff.
  • Agility– Because rules are defined declaratively rather than compiled, they can be modified dynamically at runtime. This allows business stakeholders to define, adapt and control the business rules. As a result, business analysts can concentrate on their core competencies, take ownership of the fundamental decision steps and respond faster to changing business requirements or other unforeseen events.
  • Logic and data separation– Business logic is abstracted into rules that are triggered by data (incoming facts) and potentially produce new data (actions).
  • Cost reduction– When business rules are highly dynamic, a business rule system lowers the cost of creating or updating the parts of applications that implement business policies. Because rules can be updated by business analysts, the human resources required to bring business policies into production can be reduced.
  • Versioning of rules– Rule management tools usually support rule versioning, which tracks changes to business rules and allows a rollback to older versions.
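
To make the versioning point concrete, here is a minimal in-memory sketch of a rule that keeps every published version and can roll back. The VersionedRule class and its methods are hypothetical; real rule management tools persist versions in a repository and track far more metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory store that keeps every published version of one rule.
class VersionedRule<T> {
    private final List<Predicate<T>> versions = new ArrayList<>();
    private int active = -1; // index of the version currently in force

    // Publishes a new version, makes it active, and returns its 1-based number.
    int publish(Predicate<T> condition) {
        versions.add(condition);
        active = versions.size() - 1;
        return versions.size();
    }

    // Re-activates an earlier version without discarding later ones.
    void rollbackTo(int version) {
        if (version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("unknown version " + version);
        }
        active = version - 1;
    }

    int activeVersion() {
        return active + 1;
    }

    // Evaluates the active version of the rule against a fact.
    boolean matches(T fact) {
        if (active < 0) {
            throw new IllegalStateException("no version published");
        }
        return versions.get(active).test(fact);
    }
}
```

Because older versions are kept rather than overwritten, a rollback is just a change of the active index.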

There are several open source and commercial rule engines; one of the most prominent today is Drools. It implements the Rete algorithm, which allows it to perform quite well, but it is a complex piece of software that needs to be understood and adapted.

Implementing Rule Engine with Groovy and Spring’s refreshable beans

You can implement a rule engine using Java, Groovy and the Spring framework. Spring is not necessary, but it provides a lot of benefits, including a nice IoC container that makes the development effort much easier, integration with REST APIs and databases via Spring templates, and good support for JPA, MongoDB and Redis via Spring Data.

One can implement a rule engine using Java's support for scripting languages.

For example, if you are using Spring: since version 2.0, Spring has provided special capabilities for beans written in dynamic languages like Groovy. One particularly interesting option is to deploy what are known as refreshable beans. For refreshable beans, rather than compiling classes as usual, you deploy the actual source code and tell Spring where to find it and how often to check whether it has changed.

Spring checks the source code once the refresh interval has elapsed, and if the file has been modified, it reloads the bean. This gives you the opportunity to change deployed classes while the system is still running. In this case, the Groovy source code implements our rules. Another option is to use the more sophisticated GroovyScriptEngine class to load rules and to use Groovy's metaprogramming capabilities to build a special DSL for your rule engine.
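
The refresh check described above can be illustrated without Spring. The sketch below is a stdlib-only analogue, not Spring's implementation: a hypothetical ReloadingSource class rereads a file only when the refresh interval has elapsed and the file's last-modified timestamp has changed. In a real deployment the reloaded text would be Groovy source compiled by Spring or a Groovy class loader; here it is simply kept as a string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical holder that reloads a source file when it changes on disk.
class ReloadingSource {
    private final Path file;
    private final long refreshIntervalMillis;
    private long lastCheckMillis;
    private FileTime loadedVersion;
    private String source;
    private int reloadCount = 0;

    ReloadingSource(Path file, long refreshIntervalMillis) throws IOException {
        this.file = file;
        this.refreshIntervalMillis = refreshIntervalMillis;
        reload();
        lastCheckMillis = System.currentTimeMillis();
    }

    // Returns the current source text, rereading the file if it has changed
    // and at least one refresh interval has passed since the last check.
    synchronized String get() throws IOException {
        long now = System.currentTimeMillis();
        if (now - lastCheckMillis >= refreshIntervalMillis) {
            lastCheckMillis = now;
            if (!Files.getLastModifiedTime(file).equals(loadedVersion)) {
                reload();
            }
        }
        return source;
    }

    // Number of times the file has been read, including the initial load.
    synchronized int reloadCount() {
        return reloadCount;
    }

    private void reload() throws IOException {
        loadedVersion = Files.getLastModifiedTime(file);
        source = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        reloadCount++;
    }
}
```

Checking the timestamp only after the interval has elapsed keeps the cost of the check off the hot path, which matters when the rule is applied to every incoming event.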

Deploying Rule Engine as Storm bolt

Now the problem is how to deploy this rule engine so that it can scale. One approach is to use the Storm framework, which allows easy parallelization of rule execution across a cluster of machines. For a Spring application to work inside a Storm component, we have to instantiate the Spring IoC container inside a bolt.

In the Storm bolt lifecycle, the prepare method is called when the bolt is initialized, so we can put the Spring initialization code there. Assuming Spring 3.2 without XML configuration and with everything annotated, we can do the following:

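import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.base.BaseRichBolt;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import java.util.Map;

class RuleEngineBolt extends BaseRichBolt {
    // Both fields are created in prepare() on the worker, so they are not serialized with the bolt.
    private transient CampaignService campaignService;
    private transient OutputCollector outputCollector;

    @Override
    public void prepare(Map map, TopologyContext topologyContext, OutputCollector outputCollector) {
        this.outputCollector = outputCollector;
        // Initialize the Spring IoC container by scanning the annotated classes.
        AnnotationConfigApplicationContext ctx = new AnnotationConfigApplicationContext();
        ctx.scan("com.foo");
        ctx.refresh();
        campaignService = ctx.getBean(CampaignService.class);
    }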

Now, as events arrive in tuples, we want to execute our rule engine; applyRuleOnEvent(event) calls our Groovy file and executes the rules.
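    @Override
    public void execute(backtype.storm.tuple.Tuple tuple) {
        Event event = (Event) tuple.getValueByField("event");
        campaignService.applyRuleOnEvent(event);
        outputCollector.ack(tuple); // BaseRichBolt does not ack automatically
    }
    @Override // required by BaseRichBolt; this bolt emits no tuples
    public void declareOutputFields(backtype.storm.topology.OutputFieldsDeclarer declarer) {}
}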
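
Storm serializes each bolt instance when the topology is submitted and calls prepare on the deserialized copy inside the worker, which is why heavyweight collaborators such as the Spring context are created in prepare rather than in the constructor. The stdlib-only sketch below imitates that lifecycle with plain Java serialization. LifecycleDemoBolt and FakeContext are hypothetical stand-ins and involve no Storm or Spring classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class LifecycleDemoBolt implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stand-in for a non-serializable resource such as a Spring context.
    static class FakeContext {
        final String description;
        FakeContext(String description) { this.description = description; }
    }

    // Serializable configuration travels with the bolt.
    private final String basePackage;

    // Transient state is dropped on serialization and rebuilt in prepare().
    private transient FakeContext context;

    LifecycleDemoBolt(String basePackage) {
        this.basePackage = basePackage;
    }

    // Called on the worker after deserialization, like a bolt's prepare method.
    void prepare() {
        context = new FakeContext("context for " + basePackage);
    }

    boolean isPrepared() {
        return context != null;
    }

    String describe() {
        return context.description;
    }

    // Serializes and deserializes the bolt, as happens when it is shipped to a worker.
    static LifecycleDemoBolt shipToWorker(LifecycleDemoBolt bolt)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bolt);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LifecycleDemoBolt) in.readObject();
        }
    }
}
```

After the round trip the transient field is null until prepare runs again, which is the same reason the rule engine bolt builds its Spring context inside prepare.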

[Figure: overall architecture]

This allows the rule engine to run inside the Storm distributed framework.
