Concepts

A ForestFlow Cluster

Servable

The smallest deployable entity in ForestFlow is called a Servable. This closely follows nomenclature used in TensorFlow Serving. A Servable represents an instance of a model loaded into memory for serving inference requests. Deploying a model means creating a Servable. Each Servable is uniquely identified by its Fully Qualified Release Version, or FQRV for short.

Fully Qualified Release Version (FQRV)

The FQRV is one of the most important concepts within ForestFlow because routing and scoring (inference) are based on the FQRV.

The FQRV consists of two fields: contract (of type Contract) and release_version (a String).

Release Version

The release_version along with the Contract MUST uniquely identify a deployed Servable/Model. The release version is used to distinguish between Servables with the same features that serve the same use case. The release version has no format requirements; it’s a simple string. A good candidate, however, might combine the date a model was trained and the type of model deployed. This is how ForestFlow allows multiple versions of a model to co-exist and serve the same use case.
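As a small illustration of the naming suggestion above, the following Python sketch builds a release version string from a model type and a training date. The helper name make_release_version, the underscore separator, and the sample date are assumptions made for this example only; ForestFlow itself imposes no format on the string.

```python
from datetime import date


def make_release_version(model_type: str, trained_on: date) -> str:
    """Combine the model type and training date into one string.

    Hypothetical convention for illustration; any string that is unique
    within its Contract would work just as well.
    """
    return f"{model_type}_{trained_on.isoformat()}"


# Illustrative values only: the date is made up for this example.
release_version = make_release_version("h2o_regression", date(2019, 6, 1))
print(release_version)
```

A date-based suffix keeps release versions sortable and makes it obvious which model is newer, but uniqueness within the Contract is the only hard requirement.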

The ability to produce and simultaneously deploy multiple versions of a model for the same use case allows for

In addition to the deployment policies (Validity and Phase-In), a Contract (details below) defines expiration policies for the group of Servables defined within it.

This approach allows the user to define a myriad of scenarios. A few examples include:

Contract

A Contract is a struct of 3 elements: organization, project, and contract_number.

The relationship between Contract and Servable is 1-to-Many: a Contract can have many Servables. The identifier of each Servable within a Contract is the aforementioned release version, so there’s a parent-child relationship between Contracts and Servables. The relationship is defined by the values of the FQRV.

Example of 2 Servables under the same Contract

FQRV for a Servable named “h2o_regression_v1”:

{
  "fqrv" : {
    "contract" : {
      "organization": "DreamWorks",
      "project": "schedule",
      "contract_number": 0
    },
    "release_version": "h2o_regression_v1"
  }
}

FQRV for a Servable named “h2o_regression_v2”:

{
  "fqrv" : {
    "contract" : {
      "organization": "DreamWorks",
      "project": "schedule",
      "contract_number": 0
    },
    "release_version": "h2o_regression_v2"
  }
}

These 2 Servables share the same Contract defined by the organization, project and contract_number.
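The parent-child relationship can be sketched in Python as follows. This is a minimal model for illustration only: the class names Contract and FQRV mirror the concepts above, but the dataclasses and the grouping dictionary are assumptions of this sketch, not ForestFlow's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    organization: str
    project: str
    contract_number: int


@dataclass(frozen=True)
class FQRV:
    contract: Contract
    release_version: str


# The two Servables from the JSON examples above.
contract = Contract("DreamWorks", "schedule", 0)
v1 = FQRV(contract, "h2o_regression_v1")
v2 = FQRV(contract, "h2o_regression_v2")

# Group release versions under their parent Contract (1-to-Many).
servables_by_contract = {}
for fqrv in (v1, v2):
    servables_by_contract.setdefault(fqrv.contract, []).append(fqrv.release_version)

print(servables_by_contract[contract])
```

Because the dataclasses are frozen, they are hashable and compare by value, so two FQRVs with equal Contract fields land under the same key while still being distinct Servables.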

When we define a router and expiration policy for the Contract, the Contract uses these settings to determine how to route requests between the deployed Servables within it. Similarly, the expiration policy on the Contract applies to these 2 Servables.

FQRV Extraction

The FQRV is defined at the time of model deployment. ForestFlow supports automatic FQRV extraction for some protocols when fetching a model; Git is a good example. FQRV extraction is supported if a certain tagging convention is used; otherwise, an FQRV must be supplied with the Serve deployment request. See Servable (Model) Deployment for more details.

Servable (Model) Deployment

Deploying a model and creating a Servable in ForestFlow is a simple REST API call with parameters to configure policies for the Contract and Servable. If this is a new use case/Contract, the general recommendation is to first define and create the Contract and Contract Settings for the use case the Servable is meant to be deployed to.

Creating a Contract

Recall that a Contract consists of potentially more than one Servable. The Contract Settings determine how a Contract routes traffic between its underlying Servables in addition to when it considers a Servable expired (expiration policy) and removes it.

The payload is a JSON document representing the Contract Settings, which define an expiration policy and a router. The Expiration Policy governs how and when Servables are marked as expired and removed from ForestFlow and the Contract. The Router controls how traffic is distributed across the different Servables, if any, within a Contract.

Example:

http POST https://forestflow.com/contract/DreamWorks/schedule/1
{
  "expirationPolicy": {
    "KeepLatest": {
      "servablesToKeep": 2
    }
  },
  "router": {
    "LatestPhaseInPctBasedRouter": {}
  }
}

In this example we define a new Contract with Organization = DreamWorks, Project = schedule, Contract Number = 1. This Contract has a Keep Latest expiration policy set to keep the 2 most recently active Servables; anything beyond that is expired and removed. Additionally, the Contract sets up the Router such that traffic only goes to the most recently active Servable, assuming it has been fully phased-in based on the Servable’s own Phase-In policy.
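The same request could be assembled from Python using only the standard library, as sketched below. The URL layout and payload are taken from the example above; the helper name build_contract_request and the Content-Type header are assumptions of this sketch. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Contract Settings payload, copied from the example above.
contract_settings = {
    "expirationPolicy": {"KeepLatest": {"servablesToKeep": 2}},
    "router": {"LatestPhaseInPctBasedRouter": {}},
}


def build_contract_request(base_url, organization, project, contract_number, settings):
    """Build (but do not send) a POST request that creates a Contract.

    Hypothetical helper; the JSON Content-Type header is an assumption.
    """
    url = f"{base_url}/contract/{organization}/{project}/{contract_number}"
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_contract_request(
    "https://forestflow.com", "DreamWorks", "schedule", 1, contract_settings
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the URL and payload easy to inspect or log before anything reaches the cluster.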

The effect of this setup is that once a Servable is fully phased-in and made active (the active state is based on the Servable’s Validity Policy), it completely takes over the previous Servable’s inference requests. However, because the expiration policy keeps the last 2 active Servables, the previous Servable remains active in Shadow Mode: it replicates the work and logs its results but does not respond directly to user inference requests.

The following diagram illustrates this scenario with 2 Servables (FOO, and BAR) under the same Contract.

[Diagram: two Servables, FOO and BAR, under the same Contract]
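The interplay between the Keep Latest expiration policy and latest-only routing can be sketched with a toy model. This is an illustration of the behavior described above under simplifying assumptions (phase-in and validity are treated as already complete), not ForestFlow's implementation; the class ToyContract, its methods, and the third Servable BAZ are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContract:
    servables_to_keep: int = 2
    active: list = field(default_factory=list)  # oldest first, newest last

    def __post_init__(self):
        if self.servables_to_keep < 1:
            raise ValueError("servables_to_keep must be at least 1")

    def activate(self, release_version: str) -> list:
        """Mark a fully phased-in Servable active; return the expired ones."""
        self.active.append(release_version)
        expired = self.active[:-self.servables_to_keep]
        self.active = self.active[-self.servables_to_keep:]
        return expired

    def route(self) -> dict:
        """The latest Servable responds; older kept ones run in shadow mode."""
        if not self.active:
            return {"responder": None, "shadow": []}
        return {"responder": self.active[-1], "shadow": self.active[:-1]}


toy = ToyContract(servables_to_keep=2)
toy.activate("FOO")
toy.activate("BAR")
print(toy.route())          # BAR responds, FOO shadows
print(toy.activate("BAZ"))  # FOO falls outside the 2 kept Servables
```

With servablesToKeep set to 2, exactly one Servable sits in shadow mode at any time once two or more have been activated; raising the value would keep more shadow Servables around for comparison.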

Currently available Expiration Policy implementations are:

Currently available Router implementations are:

Creating a Servable

After setting up a Contract, deploying a model and creating a Servable is a simple REST call. The REST call can either reference the location of the MLmodel YAML file if using MLFlow. In ForestFlow this is referred to as the MLFlowServeRequest.

OR

Simply provide all necessary information as part of the request itself. In ForestFlow this is referred to as the BasicServeRequest. We start off with the BasicServeRequest and then expand on how the MLFlowServeRequest differs.