
The BigBoards Fabric, a major milestone

Starting with a golden circle and the question "Why?"

Why? How? What? The golden circle

Let’s have a look at the following golden circle to give you some insights into the core of what BigBoards is.

Why a Big Data Fabric?

Provisioning a big data cluster is an elaborate undertaking. You have to install not only the hardware, but also a lot of software:

  • Basic operating system to actually have servers running.
  • Monitoring tools to gain insight into how your cluster is behaving at runtime.
  • Big data technology to actually combine all the single servers into a data cluster. Most likely, you need to combine various technologies. That's because the Swiss Army knife of big data doesn't exist yet.
  • Finally, all the extra libraries that you need in support of your specific solution.

So, you can imagine that it's already a best practice to automate the installation. A repeatable process is key! Various tools were built for that purpose. Chef, Puppet and Ansible are the most popular.

But, big data clusters are by definition a bunch of almost identical servers. So, provisioning a big data cluster is like writing sentences in grade school. You have to write the same sentences over and over again. You have to execute almost the same install sequence on each node of the cluster.

Those tools for infrastructure automation behave like an orchestra's conductor. They connect to each node to pass it the score and tell it what to play at any given time. The conductor tool monitors the nodes and intervenes in case of problems.

The firmware on the BigBoards Hexes is at version 3.0 already! We learned that the conductor approach to provisioning is neither the fastest nor the most stable. When an installation fails, it is also not the easiest to debug, fix and relaunch!


How to build a better fabric?

Each node is an autonomous agent. It knows what to do at any given cue. In case of issues, the node knows how to recover and relaunch the failed step before continuing the play. Such an approach is more efficient, has fewer moving parts and is more resilient to errors.
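
To make the agent idea a bit more concrete, here is a minimal sketch in Python. It is not the actual Fabric code: the names `Step`, `NodeAgent` and `make_flaky`, the step names and the retry limit are all made up for illustration. It only shows the principle described above: a node walks through its own install steps and relaunches a failed step before moving on, without a central conductor telling it what to do.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One install step; the name and action are hypothetical."""
    name: str
    action: Callable[[], None]


@dataclass
class NodeAgent:
    """Hypothetical agent that runs its own steps and retries failures locally."""
    node: str
    steps: List[Step]
    max_attempts: int = 3  # assumed retry limit, not taken from the Fabric
    log: List[str] = field(default_factory=list)

    def run(self) -> bool:
        for step in self.steps:
            for attempt in range(1, self.max_attempts + 1):
                try:
                    step.action()
                    self.log.append(f"{step.name}: ok (attempt {attempt})")
                    break  # step succeeded, continue with the next one
                except Exception as exc:
                    self.log.append(f"{step.name}: failed (attempt {attempt}): {exc}")
            else:
                # every attempt failed, so the agent stops and reports failure
                return False
        return True


def make_flaky(failures: int) -> Callable[[], None]:
    """Build an action that fails `failures` times and then succeeds."""
    remaining = {"n": failures}

    def action() -> None:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("transient error")

    return action


def demo():
    agent = NodeAgent(
        node="node-1",
        steps=[
            Step("install-os-packages", lambda: None),
            Step("install-monitoring", make_flaky(1)),  # fails once, then works
            Step("install-tint", lambda: None),
        ],
    )
    ok = agent.run()
    return ok, agent.log


if __name__ == "__main__":
    ok, log = demo()
    print("success:", ok)
    for line in log:
        print(line)
```

In this sketch the retry logic lives on the node itself, so a transient failure in one step costs one local retry instead of a round trip to a central tool.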

What is the BigBoards Fabric?

The BigBoards Fabric is the latest version of the firmware of our Hexes. It implements the agent principle which we outlined above. The Fabric slashes the installation time for a Tint in our Library from minutes to seconds! You can see how big an improvement this new approach is!

Moreover, the BigBoards Fabric is a simple application that you can just install on any Linux server! The Fabric is your entry point into your cluster. You can manage the whole cluster from a single node … any of the nodes! It’s a simple interface (REST API) to manage:

  • the status of a node
  • the cluster membership of a node
  • the cluster itself
  • the installed tints on the cluster

The BigBoards Fabric looks like a very interesting component in a software-defined architecture

Toni Verbeiren

This is a huge milestone for BigBoards!

On 1 April 2016, we released the BigBoards Fabric as an open source project (Apache 2.0 license)! Nope, not April Fools. Have a look at http://github.com/bigboards/bigboards-fabric.

This is a major milestone for BigBoards! Not only is the BigBoards Fabric our first open source project, but the Fabric also allows you to manage clusters on your own servers! These servers can be physical servers that sit in your datacenter, but they can also be virtual servers that you rent from your favourite cloud provider!

The BigBoards Fabric is a cornerstone in our solution strategy. We aim to cover the full application life-cycle management for cluster applications. That’s what the BigBoards Hive will become. But that’s for another post.

How do you currently manage the life cycle of your big data projects?
Please share it with everyone in a comment below — we’d love to hear about it too.

Wim Van Leuven
Big Data Enthusiast, Technology Leader, Hands-on. I am passionate about software and solutions. Organizer of BigData.be, the Belgian community on big data and NoSQL. Co-founder of BigBoards.io.
