Spring Boot Scheduler for Distributed Systems: Using ShedLock

When we want to run something on a routine or scheduled basis, we need a mechanism that triggers it automatically for as long as our app is running on the server. There are many ways to achieve that, but I prefer the Spring scheduler for my scheduled jobs because it is simple to use.

For Spring Boot, it is very easy to set up a scheduled method. Let's configure one. Please note that, for brevity, I will only show the vital code snippets in this post. You will find the whole code linked on GitHub at the end of this post.

So, what do we need for this?

So far, we don't need anything other than the basic libraries for Spring Boot app development (the snippets also use Lombok for the @Log4j2 logger).

Let's configure a scheduler class, Schedular.java:

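 import lombok.extern.log4j.Log4j2;
 import org.springframework.scheduling.annotation.EnableScheduling;
 import org.springframework.scheduling.annotation.Scheduled;
 import org.springframework.stereotype.Component;
 import java.util.Calendar;

 @Component
 @Log4j2
 @EnableScheduling
 public class Schedular {
   // Both delays are read from application.properties, in milliseconds
   @Scheduled(initialDelayString = "${initial.delay}", fixedDelayString = "${fixed.delay}")
   public void scheduledJob()
   {
     log.info("*** my schedular goes here...: {}", Calendar.getInstance().getTime());
     // our to-do operation(s) go here
   }
 }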

In our application.properties file, we have set the initial delay and the fixed delay for the scheduler:

    initial.delay=10000
    fixed.delay=3000

Both values are in milliseconds.

But the question is, what do they do?

initial.delay is how long the scheduler waits before its first execution, counted from when the whole application is up and running. fixed.delay is the gap between the end of one execution and the start of the next. The Spring Javadoc describes the two attributes as follows:
initialDelayString

    Number of milliseconds to delay before the first execution of a fixedRate or fixedDelay task.

    Returns: the initial delay in milliseconds as a String value, e.g. a placeholder or a java.time.Duration compliant value

and

fixedDelayString

    Execute the annotated method with a fixed period in milliseconds between the end of the last invocation and the start of the next.

    Returns: the delay in milliseconds as a String value, e.g. a placeholder or a java.time.Duration compliant value
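
To see these two settings in action without starting Spring, the timing can be reproduced with the JDK's ScheduledExecutorService. Its scheduleWithFixedDelay method also measures the delay from the end of one run to the start of the next. This is only a sketch under that assumption: the class FixedDelayDemo and its record helper are made up for illustration and are not part of the project code, and the delays are scaled down so the demo finishes quickly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the original project.
class FixedDelayDemo {

    // Runs a simulated job `runs` times and returns {startMs, endMs} for each run,
    // measured from the moment the scheduler was created.
    static List<long[]> record(long initialDelayMs, long fixedDelayMs, long workMs, int runs) {
        List<long[]> recorded = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(runs);
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        long origin = System.nanoTime();

        executor.scheduleWithFixedDelay(() -> {
            if (done.getCount() == 0) {
                return; // enough runs recorded, ignore any extra trigger
            }
            long start = elapsedMs(origin);
            try {
                Thread.sleep(workMs); // simulate the job taking some time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // the executor is shutting down
            }
            recorded.add(new long[] {start, elapsedMs(origin)});
            done.countDown();
        }, initialDelayMs, fixedDelayMs, TimeUnit.MILLISECONDS);

        try {
            done.await(); // block until the requested number of runs has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the runs", e);
        } finally {
            executor.shutdownNow();
        }
        return new ArrayList<>(recorded);
    }

    private static long elapsedMs(long originNanos) {
        return (System.nanoTime() - originNanos) / 1_000_000;
    }

    public static void main(String[] args) {
        // The values from application.properties scaled down by 100:
        // 100 ms initial delay, 30 ms fixed delay, and a job that takes about 20 ms.
        for (long[] run : record(100, 30, 20, 3)) {
            System.out.println("started at " + run[0] + " ms, finished at " + run[1] + " ms");
        }
    }
}
```

The first run should never start before the initial delay has passed, and each later run should start at least one fixed delay after the previous run finished, no matter how long the job itself took.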


Let's run the app and see the output:



Check the timestamps of the scheduled runs in the log; they should clear up any misconceptions you have about how the scheduling works.

So, our first task is done, right? We can now schedule our job. 
-Great!

But what if we want to scale our system? What if we deploy the app across a cluster? The same code will be deployed on multiple servers, so every machine in the cluster will have its own scheduler, each running on its own clock. For example, if we have 5 machines in our cluster, we will get 5 schedulers, each firing on its own time. But did we want that at all? We only wanted one scheduler to run the job.
-No, we didn't! So, what is the solution then?

Well, I faced the same issue and found a solution.
Using ShedLock (a third-party library), we can easily overcome this problem. You can check the original source on GitHub here.

So, how do we solve our problem?
The ShedLock documentation describes various ways to do it using various tools. I have chosen a Postgres database here.

Let's create a database table:


 CREATE TABLE shedlock(
   name VARCHAR(64) NOT NULL,
   lock_until TIMESTAMP NOT NULL,
   locked_at TIMESTAMP NOT NULL,
   locked_by VARCHAR(255) NOT NULL,
   PRIMARY KEY (name)
 );

Add these libraries:

 // https://mvnrepository.com/artifact/net.javacrumbs.shedlock/shedlock-spring
 implementation group: 'net.javacrumbs.shedlock', name: 'shedlock-spring', version: '4.19.1'
 // https://mvnrepository.com/artifact/net.javacrumbs.shedlock/shedlock-provider-jdbc-template
 implementation group: 'net.javacrumbs.shedlock', name: 'shedlock-provider-jdbc-template', version: '4.19.1'

Let's configure our code. In our Postgres configuration, add this bean:

 // Tells ShedLock to keep its locks in the shedlock table through a JdbcTemplate
 @Bean
 public LockProvider lockProvider(DataSource dataSource)
 {
   return new JdbcTemplateLockProvider(
       JdbcTemplateLockProvider.Configuration.builder()
           .withJdbcTemplate(new JdbcTemplate(dataSource))
           .usingDbTime() // Works on Postgres, MySQL, MariaDb, MS SQL, Oracle, DB2, HSQL and H2
           .build()
   );
 }

Schedular.java:

 package com.example.pocshedlock;

 import lombok.extern.log4j.Log4j2;
 import net.javacrumbs.shedlock.core.LockAssert;
 import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
 import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
 import org.springframework.scheduling.annotation.EnableScheduling;
 import org.springframework.scheduling.annotation.Scheduled;
 import org.springframework.stereotype.Component;
 import java.util.Calendar;

 /**
  * Created by DIPU on 3/9/21
  */
 @Component
 @Log4j2
 @EnableScheduling
 // A lock is kept for at most 5 minutes, even if the node holding it dies
 @EnableSchedulerLock(defaultLockAtMostFor = "5m")
 public class Schedular {
   @Scheduled(initialDelayString = "${initial.delay}", fixedDelayString = "${fixed.delay}")
   @SchedulerLock(name = "lock-user-name")
   public void scheduledJob()
   {
     // Fails if the method is running without the lock, e.g. when ShedLock is misconfigured
     LockAssert.assertLocked();
     log.info("*** my schedular goes here...: {}", Calendar.getInstance().getTime());
   }
 }

Look at the name attribute of @SchedulerLock. ShedLock creates a row in the table we created above under this name and uses it as follows:
  1. While a node is executing the job, the row's lock_until value lies in the future, meaning that this lock (scheduler name) is currently taken.
  2. Since our code is deployed to multiple machines, every scheduler with the same lock name checks that row before executing. If the lock is still held, it does not proceed; otherwise, it takes the lock and executes the job.
  3. A scheduler that finds the lock held does not wait for it. It simply skips that run and tries again at its next scheduled time.
This is how only a single scheduler executes the job at each fixed delay, which solves our problem.
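
The steps above can be sketched in plain Java. This is a simplified, in-memory illustration of the idea and not ShedLock's actual implementation: the class LockTableSketch and its methods are hypothetical, a HashMap stands in for the shedlock table, and only the lock_until column is modelled.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ShedLock code: a map plays the role of the shedlock table.
class LockTableSketch {

    // lock name -> lock_until
    private final Map<String, Instant> lockUntilByName = new HashMap<>();

    // Returns true if the caller got the lock and may run the job.
    // Returns false if another node still holds it, in which case the caller skips this run.
    public synchronized boolean tryLock(String name, Instant now, Duration lockAtMostFor) {
        Instant lockUntil = lockUntilByName.get(name);
        if (lockUntil != null && lockUntil.isAfter(now)) {
            return false; // lock is still held: skip, do not wait
        }
        // No row yet, or the previous lock has ended: take the lock.
        lockUntilByName.put(name, now.plus(lockAtMostFor));
        return true;
    }

    // Releases the lock by moving lock_until to the current time; the row itself stays.
    public synchronized void unlock(String name, Instant now) {
        lockUntilByName.put(name, now);
    }

    public static void main(String[] args) {
        LockTableSketch table = new LockTableSketch();
        Instant start = Instant.now();
        Duration fiveMinutes = Duration.ofMinutes(5);

        boolean nodeA = table.tryLock("lock-user-name", start, fiveMinutes);
        boolean nodeB = table.tryLock("lock-user-name", start.plusSeconds(1), fiveMinutes);
        System.out.println("node A runs the job: " + nodeA);
        System.out.println("node B runs the job: " + nodeB);
    }
}
```

In this sketch node A takes the lock and node B, arriving one second later, is turned away. The lockAtMostFor value matters when a node dies without unlocking: once lock_until has passed, another node can take the lock again.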

Check the entry that ShedLock created automatically in the database:



This is how we solve this problem.

You will find the source code on GitHub.
Please let me know your feedback!
