Tuesday, 25 October 2016

Spring Cloud Microservice : Architectural Overview


This blogpost explains a common Microservice Architecture using Spring Cloud Microservice. We will primarily focus on different components Spring Cloud Microservice uses for implementing a Microservice architecture using Spring Boot. The code examples will be covered in a separate blogpost.

Spring Cloud Microservice Architecture
                   

Client : The clients are any application client interested to consume the exposed microservices.

API Gateway: The API Gateway can handle a request by invoking multiple microservices and aggregating the results. It can translate between web protocols such as HTTP and WebSocket and web‑unfriendly protocols that are used internally. Zuul is a JVM based router and server side load balancer by Netflix, Spring Cloud has a nice integration with an embedded Zuul proxy, which can accept request from client and route it to different server and aggregate the result.

Config Service:  Spring Cloud Config provides server and client-side support for externalised configuration in a distributed system. With the Config Server we have a central place to manage external properties for applications across all environments. The concepts on both client and server map identically to the Spring Environment and PropertySource abstractions, so they fit very well with Spring applications, but can be used with any application running in any language. As an application moves through the deployment pipeline from dev to test and into production we can manage the configuration between those environments and be certain that applications have everything they need to run when they migrate.

Service Registration and Discovery (Netflix Eureka): The microservice style of architecture is not so much about building individual services so much as it is making the interactions between services reliable and failure-tolerant.A service registry is a phone book for your microservices. Each service registers itself with the service registry and tells the registry where it lives (host, port, node name) and perhaps other service-specific metadata - things that other services can use to make informed decisions about it.

There are several popular options for service registries. Netflix built and then open-sourced their own service registry, Eureka. Another new, but increasingly popular option is Consul Spring Cloud supports both Eureka and Consul.

Circuit Breaker (Hystrix): Circuit breaker is a design pattern in modern software development. Circuit breaker is used to detect failures and encapsulates logic of preventing a failure to reoccur constantly (during maintenance, temporary external system failure or unexpected system difficulties).

Netflix’s Hystrix library provides an implementation of the Circuit Breaker pattern: when we apply a circuit breaker to a method, Hystrix watches for failing calls to that method, and if failures build up to a threshold, Hystrix opens the circuit so that subsequent calls automatically fail. While the circuit is open, Hystrix redirects calls to the method, and they’re passed on to our specified fallback method.

Ribbon: Ribbon is a client side IPC library that is battle-tested in cloud. It provides the following features
  • Load balancing
  • Fault tolerance
  • Multiple protocol (HTTP, TCP, UDP) support in an asynchronous and reactive model
  • Caching and batching

We build a microservice application that uses Netflix Ribbon and Spring Cloud Netflix to provide client-side load balancing in calls to another microservice.

We will see some code examples using Spring Boot to explore and integrate all these components  in subsequent posts.

[Code Example is available here :- https://github.com/badalb/spring-cloud-microservices]

Spring Cloud Config : Externalise your configuration and avoid server restart !!! code example

Spring Cloud Config provides server and client-side support for externalised configuration in a distributed system. With the Config Server we have a central place to manage external properties for applications across all environments. The concepts on both client and server map identically to the Spring Environment and PropertySource abstractions, so they fit very well with Spring applications, but can be used with any application running in any language. As an application moves through the deployment pipeline from dev to test and into production we can manage the configuration between those environments and be certain that applications have everything they need to run when they migrate.

In the example below we will see how we can externalise and manage configuration from github repository and file system.

Spring Cloud Config Server

The Server provides an HTTP, resource-based API for external configuration. We can create/start a config server with @EnableConfigServer annotation at the Spring Boot application startup class.

Code:

@EnableConfigServer
@SpringBootApplication
public class ConfigServiceApplication {
    
    public static void main(String[] args) {
        SpringApplication.run(ConfigServiceApplication.class, args);
    }
}

application.properties file:
server.port=8888
# if you want to manage config files from local file system uncomment the property bellow 
#spring.cloud.config.server.git.uri=file://<your_local_system_directory>

# if you want to manage config files from github repository then uncomment the property below
# alongwith user name and password
#spring.cloud.config.server.git.uri=<github_url>
spring.cloud.config.server.git.searchPaths=<some-name>
#spring.cloud.config.server.git.username=
#spring.cloud.config.server.git.password=

finally in maven dependencies include 
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-config</artifactId>
    <version>1.2.0.RELEASE</version>
</dependency>

Lets assume that spring.cloud.config.server.git.searchPaths attribute value is sample-config and we are supporting default, dev and production environment, so your github repo/local file system will have three different files like

sample-config.properties
sample-config-dev.properties
sample-config-production.properties

Now start the application and access 

http://localhost:8888/sample-config/defaut

output: {"name":"sample-config","profiles":["defaut"],"label":"master","version":"f57f675e23c02a9e2f8422b01f07302d41d9774f","propertySources":[{"name":"<github-default-url>","source":{"message":"some message for default"}}]}

http://localhost:8888/sample-config/dev

{"name":"sample-config","profiles":["dev"],"label":"master","version":"f57f675e23c02a9e2f8422b01f07302d41d9774f","propertySources":[{"name":"<github-default-url>","source":{"message":"some message for dev"}}, ........... ]}

http://localhost:8888/sample-config/production

{"name":"sample-config","profiles":["production"],"label":"master","version":"f57f675e23c02a9e2f8422b01f07302d41d9774f","propertySources":[{"name":"<github-default-url>","source":{"message":"some message for production"}}, ........ ]}

we are assuming here that we have only one property with key message in the property files.

So our spring cloud config server is now ready to support environment specific configuration properties for the client.

we can also health check with http://localhost:8888/sample-config/health

Spring Cloud Config Client

A Spring Boot application can take immediate advantage of the Spring Config Server. Lets see how -

@SpringBootApplication
@RefreshScope
public class ConfigClientApplication {

    public static void main(String[] args) {
        SpringApplication.run(ConfigClientApplication.class, args);
    }
}

@RestController
class MessageRestController {

    @Value("${message:Hello default}")
    private String message;

    @RequestMapping("/message")
    String getMessage() {
        return this.message;
    }
}

So we have a REST endpoint /message which returns the value of the key message from property files.

application.properties file

server.port=8080
spring.application.name=sample-config # this is the name we used in config server

#URL for config server
spring.cloud.config.uri=http://localhost:8888

# Active profile
spring.profiles.active=dev

finally add the dependency below in pom.xml

<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-config</artifactId>
<version>1.2.0.RELEASE</version>
</dependency>

Lets start the server and access http://localhost:8080/message  [the output will be some message for dev

If we change the spring.profiles.active value the output will be changed accordingly. 

@RefreshScope

If we change the value of any of the key (value of the property files attribute key for any environment) we don't even need to restart the Config Server and Config Client, Just send HTTP POST request to http://localhost:8080/refresh the updated values will be reflected automatically. 

So we can use Spring Cloud config server to - 

1. Externalise the configuration properties (property files for applications) either to file system or github repository.
2. We can get the support of different profiles like dev, production seamlessly.
3. If at runtime we need to change any configuration attribute value we don't even need to restart the server.

Tuesday, 13 September 2016

JAVA In memory cache with time and size based eviction

In this blogpost we will see how we can create an in-memory cache with time and size based eviction. Time or time to live based eviction will check cache objects with last access time greater than a predefined value where as size based eviction will remove objects with LRU algorithm

Key for cache: 

public class CacheKey<T> {

private T key;

public CacheKey(T key) {
this.key = key;
}

public T getKey() {
return key;
}

public void setKey(T key) {
this.key = key;
}

}

Object as value for cache:

public class CachedObject<T> {

private long lastAccessedTime;

private T value;

public CachedObject(T value) {
this.lastAccessedTime = System.currentTimeMillis();
this.value = value;
}

public long getLastAccessedTime() {
return lastAccessedTime;
}

public void setLastAccessedTime(long lastAccessedTime) {
this.lastAccessedTime = lastAccessedTime;
}

public T getValue() {
return value;
}

public void setValue(T value) {
this.value = value;
}

LRU based cache storage :


public class LRUCache<K, V> {

private final int maxSize;

private ConcurrentHashMap<K, V> map;

private ConcurrentLinkedQueue<K> queue;

public LRUCache(final int maxSize) {
this.maxSize = maxSize;
map = new ConcurrentHashMap<K, V>(maxSize);
queue = new ConcurrentLinkedQueue<K>();
}

public void put(final K key, final V value) {
if (map.containsKey(key)) {
// remove the key from the FIFO queue
queue.remove(key);
}

while (queue.size() >= maxSize) {
K oldestKey = queue.poll();
System.out.println("Key to remove : " + oldestKey);
if (null != oldestKey) {
map.remove(oldestKey);
}
}
queue.add(key);
map.put(key, value);
}

public V get(final K key) {

if (map.containsKey(key)) {
// remove from queue and add it again in FIFO queue
queue.remove(key);
queue.add(key);
}
return map.get(key);
}

public void remove(final K key) {
if (map.containsKey(key)) {
// remove from queue and add it again in FIFO queue
queue.remove(key);
map.remove(key);
}
}

public ConcurrentHashMap<K, V> getMap() {
return map;
}

public void setMap(ConcurrentHashMap<K, V> map) {
this.map = map;
}

public ConcurrentLinkedQueue<K> getQueue() {
return queue;
}

public void setQueue(ConcurrentLinkedQueue<K> queue) {
this.queue = queue;
}

public int getMaxSize() {
return maxSize;
}

}

In memory cache implementation: 

public class InMemoryCache<K,V> {

private long timeToLive;

private LRUCache<CacheKey<?>, CachedObject<?>> cache;

public InMemoryCache(final long lifeTime, final long timerInterval, final int maxItems) {

this.timeToLive = lifeTime * 1000;

cache = new LRUCache<CacheKey<?>, CachedObject<?>>(maxItems);

if (timeToLive > 0 && timerInterval > 0) {

Thread t = new Thread(new Runnable() {
public void run() {
while (true) {
try {
Thread.sleep(timeToLive);
} catch (InterruptedException ex) {
}
cleanup();
}
}

private void cleanup() {

long now = System.currentTimeMillis();
List<CacheKey<?>> deleteKey = new ArrayList<CacheKey<?>>();

CacheKey<?> key = null;
CachedObject<?> cObject = null;

for (Map.Entry<CacheKey<?>, CachedObject<?>> entry : cache.getMap().entrySet()) {
key = entry.getKey();
cObject = entry.getValue();
if (cObject != null && (now > (timeToLive + cObject.getLastAccessedTime()))) {
deleteKey.add(key);
}
}

for (CacheKey<?> cacheKey : deleteKey) {
cache.remove(cacheKey);
Thread.yield();
}
}
});

t.setDaemon(true);
t.start();
}
}

public void put(CacheKey<?> key, CachedObject<?> value) {
cache.put(key, value);
}

public CachedObject<?> get(CacheKey<?> key) {

CachedObject<?> c = (CachedObject<?>) cache.get(key);

if (c == null)
return null;
else {
c.setLastAccessedTime(System.currentTimeMillis());
return c;
}
}

public void remove(CacheKey<?> key) {
cache.remove(key);
}

public int size() {
return cache.getMap().size();
}
}

So, in LRUCache storage implementation, once the predefined storage capacity is reached, it will check objects fetched earliest and remove that from storage. The key access order is maintained in a concurrent linked queue.

InMemoryCache keeps on running a background thread to check objects which has reached the time to live (idle time) and remove them from the cache.

Monday, 12 September 2016

Implement LRU cache in Java

In this blogpost we will see how we can create a LRU cache using JAVA collection APIs.

LRU (Least Recently Used) cache discards least recently used element from the cache when we need to free up some space from cache. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such an implementation, every time a cache-line is used, the age of all other cache-lines changes.

We will use ConcurrentHashMap as cache storage and ConcurrentLinkedQueue to keep track of element access order from the cache. Lets see how the code looks like - 

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class LRUCache<K, V> {

private final int maxSize;

private ConcurrentHashMap<K, V> map;

private ConcurrentLinkedQueue<K> queue;

public LRUCache(final int maxSize) {
this.maxSize = maxSize;
map = new ConcurrentHashMap<K, V>(maxSize);
queue = new ConcurrentLinkedQueue<K>();
}

public void put(final K key, final V value) {
if (map.containsKey(key)) {
// remove the key from the FIFO queue
queue.remove(key);
}

while (queue.size() >= maxSize) {
K oldestKey = queue.poll();
if (null != oldestKey) {
map.remove(oldestKey);
}
}
queue.add(key);
map.put(key, value);
}

public V get(final K key) {

if (map.containsKey(key)) {
// remove from queue and add it again in FIFO queue
queue.remove(key);
queue.add(key);
}
return map.get(key);
}
}

Lets see the main class 

public class ValidateLRUCache {

public static void main(String[] args) {
LRUCache<Integer, String> cache = new LRUCache<>(5);
cache.put(1, "A");
cache.put(2, "B");
cache.put(3, "C");
cache.put(4, "D");
cache.put(5, "E");
// key 5 moved ahead
System.out.println(cache.get(5));
// put new element to cache. this will evict key 1 from cache
cache.put(6, "F");
//this will print null
System.out.println(cache.get(1));

}

}

In this example we are using ConcurrentLinkedQueue to maintain the access order. The poll method retrieves and removes the head of this queue, or returns null if this queue is empty. So while putting elements to the cache storage if we exceed the size limit, least recently used keys will be removed from the queue and element for that key will be removed from the cache. Similarly if we fetch some element from cache we have to change the access order. 

Monday, 30 May 2016

Distributed SolrCloud setup with external ZooKeeper ensemble

We all know that Solr search performs better than database queries because of "inverse index" rather than database queries with a full table scan. Databases and Solr have complementary strengths and weaknesses though. In this blogpost we will set up a SolrCloud just like a production system.

SolrCloud or Solr Master Slave
SolrCloud and master-slave both address four particular issues:
  • Sharding
  • Near Real Time (NRT) search and incremental indexing
  • Query distribution and load balancing
  • High Availability (HA) 
If our application just reads data from Solr and need high availability on reading data from Solr then a simple one master to many slave hierarchy is more than sufficient. But if you are looking out for high availability on writing to Solr too, then SolrCloud is a right option

Is SolrCloud is better? Maintaining SolrCloud needs a good infrastructure and have to look out the availability of ZooKeepers, and nodes health, high performance disk for better replication speed etc. But, other than this we don't need to worry about Data consistency among nodes as this will be taken care by SolrCloud.

In which cases is better to coose SolrCloud? 
When we need high availability on Solr Writes as well as reads, we have to go for SolrCloud. Also, if we cannot afford bigger machines to have one single node, then we can split index to shards and keep it under smaller config machines.

In which cases is better to choose Solr Replication? 
When our application does not write in real time to SOLR, Replication is enough and no need to get complicated with SolrCloud. Also, its comparatively easy to setup Master Slave than SolrCloud

Sharding and Data consistency, automatic rebalancing of shards are better in SolrCloud. Query distribution and load balancing is automatic for SolrCloud, in sharded environment for master slave we need to use distributed query.

To setup a distributed SolrCloud we must have the following - 
  1. At least 6 serves (3 for ZooKeeper cluster setup and 3 for SolrCloud setup)
  2. Zookeper
  3. Solr 6
For this blogpost we will be using a single system with three different ZooKeeper and three different Solr6 instances running on different port.

ZooKeeper Cloud Setup 

1. Download Apache ZooKeeper 3.4.6 is the version I am using
2. Create a directory, lets assume zk_cluster under $<home> .
3. Create three instances for ZooKeeper under zk_cluster, lets assume they are zookeeper-3.4.6_1,           zookeeper-3.4.6_2, zookeeper-3.4.6_3.
4. Create data and logs directory under zk_cluster directory.
5. Create ZooKeeper Server ID, basically this file reside in the ZooKeeper data directory.

At this point of time your ZooKeeper cluster will look like -

zk_cluster
|
    |-data
|---zookeeper-3.4.6_1
|--myid (content numeric 1)
|---zookeeper-3.4.6_2
|--myid (content numeric 2)
|---zookeeper-3.4.6_3
|--myid (content numeric 3)
|-log
|---zookeeper-3.4.6_1
|---zookeeper-3.4.6_2
|---zookeeper-3.4.6_3
|-zookeeper-3.4.6_1
|-zookeeper-3.4.6_2
|-zookeeper-3.4.6_3

6. Preparing ZooKeeper configuration called zoo.cfg at $<home>/zk_cluster/{zookeeper-3.4.6_1}/conf/zoo.cfg.  Here I will show you for Server 1. We have to perform same steps with appropriate values (clientPort, dataDir, dataLogDir) for respective ZooKeeper server.


# The number of milliseconds of each tick
tickTime=2000

# The number of ticks that the initial synchronization phase can take
initLimit=10

# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5

# the directory where the snapshot is stored.
# Choose appropriately for your environment
dataDir=$<home>/zk-cluster/data/zookeeper-3.4.6_1

# the port at which the clients will connect
clientPort=2181 <change for other instances>

# the directory where transaction log is stored.
# this parameter provides dedicated log device for ZooKeeper
dataLogDir=$<home>/zk-cluster/logs/zookeeper-3.4.6_1

# ZooKeeper server and its port no.
# ZooKeeper ensemble should know about every other machine in the ensemble
# specify server id by creating 'myid' file in the dataDir
# use hostname instead of IP address for convenient maintenance
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

7. Once zoo.cfg created for all the server then we can start the ZooKeeper Servers. ZooKeeper supports the following commands

  • start
  • start-foreground
  • stop
  • restart
  • status
  • upgrade
  • print-cmd

SolrCloud Setp

Before we create the Solr instances, we'll need to create a configset in order to create a collection to shard and replicate across multiple instances.  Creating a configset is very specific to our  collection. We can use the pre-built configsets that come with Solr 6, they are located in solr-6.0.0/server/solr/configsets and we don't have to do anything.

A custom configset requires taking care of path and third party libraries defined in solrconfig.xml file.We also have to create/update the schema.xml as necessary to map data from the source to a Solr document

Lets assume we are creating a configset named solr_cloud_example simply copying the content of  basic_configs. Additional libraries and schema can be updated before we actually start creating indexes.

Uploading a configset to Zookeeper

This is relevant if we want to upload  configuration ahead of time instead of specifying the configuration to use in the "create" command or if we are using the Collections API to issue a "create" command via the REST interface.


To upload the configset, we have to use zkcli.sh which is in <BASE_INSTALL_DIR>/solr-6.0.0/server/scripts/cloud-scripts.  Lets go to that directory and issue the following command:

./zkcli.sh -zkhost localhost:2181,localhost:2182,localhost:2183 -cmd upconfig -confname < solr_cloud_example > -confdir <base_installation_dir>/solr-6.0.0/sever/solr/configsets/< solr_cloud_example >/conf

This will upload the config directory in ZooKeeper cluster we have setup earlier.

Creating Solr Instances

Under the <base_directory> create a directory <solr_cluster> , download and copy three solr 6 installations. So the directory structure looks like  -

<base_dir>/<solr_cluster>
                            |
                            |-- solr-6.0.0_1
                                    | -- server
                                           |--solr
                                                 |--configsets
                                                       |-- <solr_cloud_example>
                                   
                            |-- solr-6.0.0_2
                            |-- solr-6.0.0_3


Now we are ready with the setup.

Start SolrCloud

At this point of time we have all the setup ready, before we start solr instances make sure the zookeeper cluster is up and running.

Goto
<base_dir>/solr-cluster/solr-6.0.1 and execute
bin/solr start -cloud  -p 8983 -z localhost:2181,localhost:2182,localhost:2183 -noprompt

Goto
<base_dir>/solr-cluster/solr-6.0.2 and execute
bin/solr start -cloud  -p 8984 -z localhost:2181,localhost:2182,localhost:2183 -noprompt

Goto
<base_dir>/solr-cluster/solr-6.0.3 and execute
bin/solr start -cloud  -p 8985 -z localhost:2181,localhost:2182,localhost:2183 -noprompt


once all the instances are running just type
http://localhost:8983/solr/admin/collections?action=CREATE&name=test_solr_cloud&numShards=2&replicationFactor=2&maxShardsPerNode=2
&collection.configName= solr_cloud_example to create a collection named solr_cloud_example

Now go to http://localhost:8983/solr/#/~cloud  you will see the collection along with shards and replications in different nodes