Provisioning a big data cluster is an elaborate undertaking. You have to install not only the hardware, but also a lot of software:
So, you can imagine that it's already a best practice to automate the installation. A repeatable process is key! Various tools were built for that purpose. Chef, Puppet and Ansible are the most popular.
But, big data clusters are by definition a bunch of almost identical servers. So, provisioning a big data cluster is like writing sentences in grade school. You have to write the same sentences over and over again. You have to execute almost the same install sequence on each node of the cluster.
Those tools for infrastructure automation behave like an orchestra's conductor. They connect to each node to pass the score and tell it what to play at any given time. The conductor tool monitors the nodes and intervenes in case of problems.
The firmware on the BigBoards Hexes is at version 3.0 already! We learned that the conductor approach to provisioning is neither the fastest nor the most stable. When an installation fails, it is also not the easiest to debug, fix and relaunch!
In the agent approach, each node is an autonomous agent. It knows what to do at any given cue. In case of issues, the node knows how to recover and relaunch the failed step before continuing the play. Such an approach is more efficient, has fewer moving parts and is more resilient to errors.
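To make the agent idea a bit more concrete, here is a minimal sketch in Python. It is not the Fabric's actual code: the `NodeAgent` class, the `on_cue` method and the `make_flaky` helper are hypothetical names invented for this illustration. It only shows the principle that a node runs its own steps and retries a failed step before it continues.

```python
from typing import Callable, List, Tuple

# A step is a name plus an action that raises an exception on failure.
Step = Tuple[str, Callable[[], None]]


class NodeAgent:
    """Hypothetical agent: runs its own install steps and retries failures."""

    def __init__(self, name: str, steps: List[Step], max_retries: int = 3):
        self.name = name
        self.steps = steps
        self.max_retries = max_retries
        self.log: List[str] = []

    def on_cue(self) -> bool:
        """Run every step in order; retry a failing step before moving on."""
        for step_name, action in self.steps:
            for attempt in range(1, self.max_retries + 1):
                try:
                    action()
                    self.log.append(f"{step_name}: ok (attempt {attempt})")
                    break  # step succeeded, continue with the next one
                except Exception as exc:
                    self.log.append(
                        f"{step_name}: failed (attempt {attempt}): {exc}"
                    )
            else:
                # The step never succeeded within max_retries attempts.
                return False
        return True


def make_flaky(fail_times: int) -> Callable[[], None]:
    """Return an action that fails `fail_times` times, then succeeds."""
    state = {"left": fail_times}

    def action() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient error")

    return action


if __name__ == "__main__":
    agent = NodeAgent(
        "node1",
        [("install", make_flaky(0)), ("configure", make_flaky(2))],
    )
    print(agent.on_cue())
    print("\n".join(agent.log))
```

In this sketch the retry logic lives on the node itself, so no central conductor has to notice the failure and reconnect to relaunch the step.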
The BigBoards Fabric is the latest version of the firmware for our Hexes. It implements the agent principle outlined above. The Fabric slashes the installation time for a Tint in our Library from minutes to seconds! You can see how big an improvement this new approach is!
Moreover, the BigBoards Fabric is a simple application that you can install on any Linux server! The Fabric is your entry point into your cluster. You can manage the whole cluster from a single node … any of the nodes! It’s a simple interface (REST API) to manage:
The BigBoards Fabric looks like a very interesting component in a software-defined architecture.
This is a major milestone for BigBoards! Not only is the BigBoards Fabric our first open source project, but the Fabric also allows you to manage clusters on your own servers! These can be physical servers that sit in your datacenter, or virtual servers that you rent from your favourite cloud provider!
The BigBoards Fabric is a cornerstone in our solution strategy. We aim to cover the full application life-cycle management for cluster applications. That’s what the BigBoards Hive will become. But that’s for another post.
How do you currently manage the life cycle of your big data projects?
Please share it with everyone in a comment below. We’d love to hear about it too.