A ForestFlow Cluster


The smallest deployable entity in ForestFlow is called a Servable. This closely follows nomenclature used in TensorFlow Serving. A Servable represents an instance of a model loaded into memory for serving inference requests. Deploying a model means creating a Servable. Each Servable is uniquely identified by its Fully Qualified Release Version or FQRV for short.

Fully Qualified Release Version (FQRV)

The FQRV is one of the most important concepts within ForestFlow because routing and scoring (inference) are based on the FQRV.

The FQRV consists of two fields: contract (of type Contract) and release_version (of type String).

Release Version

The release_version along with the Contract MUST uniquely identify a deployed Servable/Model. The release version is used to distinguish between Servables with the same features that serve the same use case. The release version has no format requirements; it is a simple string. A good candidate might be the date a model was trained combined with the type of model deployed. This is how ForestFlow allows multiple versions of a model to co-exist and serve the same use case.

The ability to produce and simultaneously deploy multiple versions of a model for the same use case allows for

In addition to deployment policies (Validity and Phase-In), a Contract (details below) defines expiration policies for the group of Servables defined within it.

This approach allows the user to define a myriad of scenarios. A few examples include:


A Contract is a struct of 3 elements: organization, project, and contract_number.

The relationship between Contract and Servable is 1-to-many: a Contract can have many Servables. The identifier of each Servable within a Contract is the aforementioned release version, so there is a parent-child relationship between Contracts and Servables. The relationship is defined by the values of the FQRV.
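The parent-child relationship can be sketched with two small immutable types. This is only an illustration of the structure described above, not ForestFlow's own classes: the Python class names and the `group_by_contract` helper are hypothetical, while the field names come from the FQRV definition.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    organization: str
    project: str
    contract_number: int


@dataclass(frozen=True)
class FQRV:
    contract: Contract
    release_version: str


def group_by_contract(fqrvs):
    """Map each Contract to the release versions deployed under it."""
    groups = defaultdict(list)
    for fqrv in fqrvs:
        groups[fqrv.contract].append(fqrv.release_version)
    return dict(groups)


# Two Servables under one Contract: same parent, different release versions.
contract = Contract("DreamWorks", "schedule", 0)
servables = [
    FQRV(contract, "h2o_regression_v1"),
    FQRV(contract, "h2o_regression_v2"),
]
print(group_by_contract(servables)[contract])
```

Because both types are frozen dataclasses, two FQRVs compare equal only when the Contract and the release version both match, which mirrors the uniqueness requirement on the FQRV.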

Example of 2 Servables under the same Contract

FQRV for a Servable named “h2o_regression_v1”:

  "fqrv" : {
    "contract" : {
      "organization": "DreamWorks",
      "project": "schedule",
      "contract_number": 0
    },
    "release_version": "h2o_regression_v1"
  }

FQRV for a Servable named “h2o_regression_v2”:

  "fqrv" : {
    "contract" : {
      "organization": "DreamWorks",
      "project": "schedule",
      "contract_number": 0
    },
    "release_version": "h2o_regression_v2"
  }

These 2 Servables share the same Contract defined by the organization, project and contract_number.

When we define a router and expiration policy for the Contract, the Contract uses these settings to determine how to route requests between the Servables deployed within it. Similarly, the expiration policy on the Contract applies to both of these Servables.

FQRV Extraction

The FQRV is defined at the time of model deployment. ForestFlow supports automatic FQRV extraction for some protocols when fetching a model; Git is a good example. FQRV extraction is supported if a certain tagging convention is used. Otherwise, an FQRV must be supplied with the Serve deployment request. See Servable (Model) Deployment for more details.

Servable (Model) Deployment

Deploying a model and creating a Servable in ForestFlow is a simple REST API call with parameters to configure policies for the Contract and Servable. If this is a new use case/Contract, the general recommendation is to first define and create the Contract and Contract Settings for the use case the Servable is meant to be deployed to.

Creating a Contract

Recall that a Contract consists of potentially more than one Servable. The Contract Settings determine how a Contract routes traffic between its underlying Servables in addition to when it considers a Servable expired (expiration policy) and removes it.

The payload is a JSON document representing the Contract Settings, which define an expiration policy and a router. The Expiration Policy governs how and when Servables are marked as expired and removed from ForestFlow and the Contract. The Router controls how traffic is distributed across the different Servables, if any, within a Contract.


http POST
{
  "expirationPolicy": {
    "KeepLatest": {
      "servablesToKeep": 2
    }
  },
  "router": {
    "LatestPhaseInPctBasedRouter": {}
  }
}

In this example we define a new Contract with Organization = DreamWorks, Project = schedule, Contract Number = 1. This Contract has a Keep Latest expiration policy set to keep the 2 most recently active Servables. Anything beyond that is expired and removed. Additionally, the Contract sets up the Router such that traffic only goes to the most recently active Servable, assuming it has been fully phased in based on the Servable’s own Phase-In policy.
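For callers that assemble the Contract Settings programmatically, the payload can be built as a plain dictionary and serialized. This is a minimal sketch: the key names are taken from the example payload, while the `contract_settings` helper and its lower-bound check are assumptions made for illustration and are not part of ForestFlow.

```python
import json


def contract_settings(servables_to_keep: int) -> dict:
    """Build Contract Settings with a KeepLatest policy and the latest-phase-in router."""
    # Assumption: keeping fewer than one Servable is not meaningful.
    if servables_to_keep < 1:
        raise ValueError("servables_to_keep must be at least 1")
    return {
        "expirationPolicy": {"KeepLatest": {"servablesToKeep": servables_to_keep}},
        "router": {"LatestPhaseInPctBasedRouter": {}},
    }


payload = json.dumps(contract_settings(2), indent=2)
print(payload)
```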

The effect of this setup is that once a Servable is fully phased in and made active (the active state is based on the Servable’s Validity Policy), it completely takes over the previous Servable’s inference requests. However, because the expiration policy keeps the last 2 active Servables, the previous Servable remains active in Shadow Mode: it replicates the work and logs its results but does not respond directly to user inference requests.
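The combined effect of the two settings can be simulated in a few lines. This sketch models only the end state described above and assumes every Servable in the list is active and fully phased in; the function names and the role labels are hypothetical and do not reflect ForestFlow's internals.

```python
from typing import Dict, List, Tuple


def apply_keep_latest(servables: List[str], servables_to_keep: int) -> Tuple[List[str], List[str]]:
    """Split Servables (oldest first) into (kept, expired)."""
    if servables_to_keep < 1:
        raise ValueError("servables_to_keep must be at least 1")
    kept = servables[-servables_to_keep:]
    expired = servables[:-servables_to_keep]
    return kept, expired


def assign_roles(kept: List[str]) -> Dict[str, str]:
    """The most recent kept Servable responds; earlier ones run in shadow mode."""
    if not kept:
        return {}
    roles = {name: "shadow" for name in kept[:-1]}
    roles[kept[-1]] = "responding"
    return roles


# Three Servables became active in this order under a KeepLatest(2) policy.
kept, expired = apply_keep_latest(["v1", "v2", "v3"], servables_to_keep=2)
print(expired)
print(assign_roles(kept))
```

With three Servables and `servablesToKeep` set to 2, the oldest one is expired, the middle one keeps scoring in shadow mode, and the newest one answers requests.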

The following diagram illustrates this scenario with 2 Servables (FOO and BAR) under the same Contract.


Currently available Expiration Policy implementations are:

Currently available Router implementations are:

Creating a Servable

After setting up a Contract, deploying a model and creating a Servable is a simple REST call. The REST call can either:

1) Reference where the MLmodel yaml file is, if using MLFlow. In ForestFlow this is referred to as the MLFlowServeRequest.
2) Provide all necessary information as part of the request itself. In ForestFlow this is referred to as the BasicServeRequest.

We start off with the BasicServeRequest and then expand on how the MLFlowServeRequest differs.

Using S3 Storage

To use S3 as a storage backend, two requirements should be satisfied:

1) s3 protocol: The s3 protocol requires that you specify the s3 endpoint, bucket name, and optionally region in the format described below. Note that the s3 protocol is what becomes the path variable in a serve request:

`endpoint<:port> bucket=<s3-bucket-name> [region=<s3-region-name>]`

2) credentials: ForestFlow looks for the S3 credentials, i.e. access_key_id and secret, which are made accessible to ForestFlow through either the typesafe config or environment variables as described in this section.
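Reading the path format from the first requirement literally, the value can be assembled from its parts as below. This is a sketch under that literal reading only: the `s3_path` helper, its parameter names, and the `s3.example.com` endpoint are hypothetical, and any scheme prefix or escaping that ForestFlow may expect is not covered.

```python
from typing import Optional


def s3_path(endpoint: str, bucket: str,
            port: Optional[int] = None, region: Optional[str] = None) -> str:
    """Join endpoint, optional port, bucket, and optional region into one path string."""
    authority = endpoint if port is None else f"{endpoint}:{port}"
    parts = [authority, f"bucket={bucket}"]
    if region is not None:
        parts.append(f"region={region}")
    return " ".join(parts)


print(s3_path("s3.example.com", "models"))
print(s3_path("s3.example.com", "models", port=9000, region="us-west-2"))
```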

The S3 credentials lookup follows a 3-layered approach that falls back to the higher (less specific) levels. The 3 layers are:

 1) `<s3-url-authority>_<s3-url-bucket>_<s3-bucket-key>` (most specific, lowest level)
 2) `<s3-url-authority>_<s3-url-bucket>`
 3) `<s3-url-authority>` (least specific, highest level)

This layered approach is used to represent the idea of credential inheritance, in that the less specific (higher level) credentials are inherited by all matching lower level S3 layers unless a more specific credential is supplied.

For instance, if only `<s3-url-authority>_<s3-url-bucket>` credentials are supplied, then it is inferred that all the keys available under that specific bucket would inherit these credentials. Similarly, if only `<s3-url-authority>` credentials are supplied, then it is inferred that all the buckets and keys available under that specific domain would inherit these domain credentials.
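The fallback order can be sketched as a most-specific-first search over the three layer names. This is an illustration of the inheritance idea only: the in-memory `store` dictionary stands in for the typesafe config or environment variables, the helper names and sample values are hypothetical, and any name normalization ForestFlow applies (for example to dots or slashes) is not modeled.

```python
from typing import List, Mapping, Optional, Tuple

Credentials = Tuple[str, str]  # (access_key_id, secret)


def candidate_names(authority: str, bucket: str, key: str) -> List[str]:
    """Layer names ordered from most specific to least specific."""
    return [f"{authority}_{bucket}_{key}", f"{authority}_{bucket}", authority]


def resolve_credentials(store: Mapping[str, Credentials],
                        authority: str, bucket: str, key: str) -> Optional[Credentials]:
    """Return the first credentials found while walking up the layers."""
    for name in candidate_names(authority, bucket, key):
        if name in store:
            return store[name]
    return None


store = {
    "s3.example.com": ("domain-id", "domain-secret"),
    "s3.example.com_models": ("bucket-id", "bucket-secret"),
}
print(resolve_credentials(store, "s3.example.com", "models", "regression.zip"))
print(resolve_credentials(store, "s3.example.com", "archive", "old.zip"))
```

The first lookup finds no key-level entry and falls back to the bucket-level credentials; the second has no bucket-level entry either and inherits the domain-level credentials.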