Quick Start
Let’s bring up a single-node local instance of ForestFlow
-
Build ForestFlow
# Create a "local" build
chmod +x ./build/build-local.sh
./build/build-local.sh
-
Set application environment to local
export APPLICATION_ENVIRONMENT_CONFIG=local
-
Define local working directory
This is where ForestFlow stores its state when using the default persistence plugin for a local install.
mkdir -p ./ff-temp
export LOCAL_WORKING_DIR_CONFIG=./ff-temp
-
Bring up ForestFlow
Check the version number defined in the property forestflow-latest.version in pom.xml. Assuming it’s 0.2.3, run the following:
export FORESTFLOW_VERSION=0.2.3
chmod +x ./build/run-local.sh
./build/run-local.sh
This should bring up ForestFlow on port 8090.
If you get any port conflicts, feel free to change the ports used for JMX or for ForestFlow. Supplying the environment variable APPLICATION_HTTP_PORT_CONFIG overrides the default port 8090.
-
Let’s verify ForestFlow is running and we can access the API
curl 127.0.0.1:8090/contracts/list
You should see something like this:
{ "Contracts": { "contracts": [] } }
To pretty print from curl, you can pipe to python -m json.tool like so:
curl 127.0.0.1:8090/contracts/list | python -m json.tool
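If you would rather stay in Python than pipe through a shell tool, the same check can be sketched with the standard library. This is a minimal sketch, assuming ForestFlow listens on 127.0.0.1:8090 as above; the helper names `pretty` and `fetch_contracts` are illustrative and not part of ForestFlow.

```python
import json
import urllib.request


def pretty(raw: str) -> str:
    """Re-indent a JSON string, similar to what `python -m json.tool` prints."""
    return json.dumps(json.loads(raw), indent=4)


def fetch_contracts(base_url: str = "http://127.0.0.1:8090") -> str:
    """GET /contracts/list and return the raw response body as text.

    Assumes a ForestFlow instance is reachable at base_url.
    """
    with urllib.request.urlopen(f"{base_url}/contracts/list") as resp:
        return resp.read().decode("utf-8")


# Offline demonstration using the response body shown above.
print(pretty('{ "Contracts": { "contracts": [] } }'))
```

Against a running instance, `print(pretty(fetch_contracts()))` performs the request and formats the result in one step.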
You can continue using curl but I prefer using HTTPie from the command line as it simplifies making REST API calls.
See installation for HTTPie
Using HTTPie
http 127.0.0.1:8090/contracts/list
-
Let’s deploy a model to ForestFlow
The git repo comes with a simple H2O model that we can use to test with. This sample H2O model uses the Combined Cycle Power Plant dataset and was created based on an online H2O tutorial that you can find here.
As the tutorial explains, the goal is to predict the energy output given some input features.
Have a look at the servable deployment definition https://github.com/ForestFlow/ForestFlow/tree/master/tests/basicapi-local-h2o.json
{
    "path": "file://<local path for repo>/tests",
    "fqrv": {
        "contract": {
            "organization": "samples",
            "project": "energy_output",
            "contract_number": 0
        },
        "release_version": "StackedEnsemble_AllModels_AutoML_20191002_103122"
    },
    "flavor": {
        "H2OMojo": {
            "mojoPath": "StackedEnsemble_AllModels_AutoML_20191002_103122.zip",
            "version": "3.22.0.2"
        }
    },
    "servableSettings": {
        "policySettings": {
            "validityPolicy": [
                { "ImmediatelyValid": {} }
            ],
            "phaseInPolicy": { "ImmediatePhaseIn": {} }
        },
        "loggingSettings": {
            "logLevel": "FULL",
            "keyFeatures": [ "reqid" ]
        }
    }
}
This JSON payload is an example of what a model/servable deployment definition looks like. Notice how we specify validity, phase-in, and logging settings for the servable we’re about to create.
Let’s go ahead and create this Servable in ForestFlow, again using HTTPie as a command-line HTTP client. We need to replace the `<local path for repo>` placeholder in the file with the local path of the repository. We do this using `sed` and passing the result to HTTPie’s `http` command. Alternatively, edit the file and supply the correct path.
export IP=127.0.0.1
export PORT=8090
# Deploy a model as a Servable to ForestFlow
echo $(sed 's:<local path for repo>:'`pwd`':' ./tests/basicapi-local-h2o.json) | http POST http://${IP}:${PORT}/servable
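The placeholder substitution can also be done in Python before posting the definition. The following is a minimal sketch, assuming the definition stores the placeholder in its top-level `path` field as in the sample file; `render_definition` is a hypothetical helper name, not a ForestFlow API.

```python
import json
from pathlib import Path

PLACEHOLDER = "<local path for repo>"


def render_definition(template_text: str, repo_path: str) -> dict:
    """Parse a servable definition and fill the repo path into its "path" field."""
    definition = json.loads(template_text)
    if PLACEHOLDER not in definition.get("path", ""):
        raise ValueError(f"placeholder {PLACEHOLDER!r} not found in 'path'")
    definition["path"] = definition["path"].replace(PLACEHOLDER, repo_path)
    return definition


# Small inline stand-in for tests/basicapi-local-h2o.json.
sample = '{"path": "file://<local path for repo>/tests"}'
rendered = render_definition(sample, Path.cwd().as_posix())
print(json.dumps(rendered, indent=4))
```

Parsing first and replacing only inside the `path` value keeps the result valid JSON even if the directory name contains characters that would need escaping.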
If all goes well, you should see a ForestFlow response indicating the Servable was created successfully.
{
    "ServableCreatedSuccessfully": {
        "fqrv": {
            "contract": {
                "contractNumber": 0,
                "organization": "samples",
                "project": "energy_output"
            },
            "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"
        }
    }
}
Now if we inspect the list of Contracts again we see the Servable and Contract we just deployed
http http://${IP}:${PORT}/contracts/list
{
    "Contracts": {
        "contracts": [
            {
                "contractNumber": 0,
                "organization": "samples",
                "project": "energy_output"
            }
        ]
    }
}
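If you script this check, the response can be inspected programmatically. This is a minimal sketch that assumes the response shape shown above (`Contracts` wrapping a `contracts` list); the helper names are illustrative only.

```python
import json


def list_contracts(response_text: str) -> list:
    """Return the contract entries from a /contracts/list response body."""
    return json.loads(response_text)["Contracts"]["contracts"]


def has_contract(response_text: str, organization: str, project: str,
                 contract_number: int) -> bool:
    """Check whether the given contract appears in the response."""
    wanted = {
        "organization": organization,
        "project": project,
        "contractNumber": contract_number,
    }
    return any(
        all(entry.get(key) == value for key, value in wanted.items())
        for entry in list_contracts(response_text)
    )


# Response body copied from the example above.
body = ('{"Contracts": {"contracts": [{"contractNumber": 0, '
        '"organization": "samples", "project": "energy_output"}]}}')
print(has_contract(body, "samples", "energy_output", 0))
```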
We didn’t explicitly deploy a Contract here as is typically recommended. ForestFlow will automatically set up a Contract with default routing and expiration policy settings if you don’t provide one. Relying on this behavior is not recommended as it sets you up with ForestFlow defaults which might change in future releases.
It’s best to set up the Contract first and then deploy the Servable. Nonetheless, we have one set up and we can use it to score against.
You can also inspect the Servables under a Contract. In this case, we’ll see a single Servable deployed
http http://${IP}:${PORT}/samples/energy_output/0/list
[
    {
        "FQRV": {
            "contract": {
                "contractNumber": 0,
                "organization": "samples",
                "project": "energy_output"
            },
            "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"
        }
    }
]
-
Let’s score against the Servable we’ve deployed
Let’s get some metadata about the Servable we have deployed
http http://${IP}:${PORT}/samples/energy_output/0/StackedEnsemble_AllModels_AutoML_20191002_103122/metadata
{
    "BasicMetaData": {
        "description": "ServableSettings(PolicySettings(List(ImmediatelyValid()),ImmediatePhaseIn()),Some(LoggingSettings(FULL,List(reqid),None)))",
        "fqrv": {
            "contract": {
                "contractNumber": 0,
                "organization": "samples",
                "project": "energy_output"
            },
            "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"
        },
        "inputs": [
            {
                "description": "[0:TemperatureCelcius,1:ExhaustVacuumHg,2:AmbientPressureMillibar,3:RelativeHumidity]",
                "name": "numeric",
                "shape": [ "4" ],
                "type": "Float64"
            }
        ],
        "name": "prediction",
        "server": "ForestFlow"
    }
}
Notice the Servable takes an input Tensor of type Float64 with 4 fields: TemperatureCelcius, ExhaustVacuumHg, AmbientPressureMillibar, and RelativeHumidity
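The ordered field names can be pulled out of the input `description` string if you want to build requests from metadata. This is a rough sketch that assumes the `[index:name,...]` layout seen in this one sample response; that layout is not a documented guarantee, and `input_field_names` is a hypothetical helper.

```python
import re


def input_field_names(description: str) -> list:
    """Parse '[0:A,1:B]' into ['A', 'B'], ordered by the numeric index."""
    pairs = re.findall(r"(\d+):([^,\]]+)", description)
    return [name for _, name in sorted(pairs, key=lambda pair: int(pair[0]))]


# Description string copied from the metadata response above.
desc = ("[0:TemperatureCelcius,1:ExhaustVacuumHg,"
        "2:AmbientPressureMillibar,3:RelativeHumidity]")
print(input_field_names(desc))
```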
Scoring against the model represented by the Servable we deployed is fairly simple.
Have a look at the contents of tests/basicapi-score-1.json
{
    "schema": [
        {
            "type": "Float64",
            "fields": ["TemperatureCelcius", "ExhaustVacuumHg", "AmbientPressureMillibar", "RelativeHumidity"]
        }
    ],
    "datum": [
        { "tensors": [ { "Float64": { "data": [33.2, 60.5, 1010.7, 30] } } ] },
        { "tensors": [ { "Float64": { "data": [29.6, 62, 998.5, 40.62] } } ] }
    ],
    "configs": { "reqid": "123-456-789" }
}
We define the schema, which tells ForestFlow the order in which to expect the provided features. This particular request then provides 2 datum records for inference, making it a batch inference request of 2 rows. The first row supplies the Float64 tensor with the data points for each field, starting with 33.2 for the temperature in Celsius. The second row does the same, supplying 29.6 for the temperature, and so on.
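A request body with this layout can be assembled from plain rows of numbers. The sketch below mirrors the structure of tests/basicapi-score-1.json as shown above; `build_score_request` is a hypothetical helper, and it assumes a single tensor type per request, as in the sample.

```python
import json


def build_score_request(fields, rows, reqid, tensor_type="Float64"):
    """Build a score payload: one schema entry plus one datum record per row."""
    for row in rows:
        if len(row) != len(fields):
            raise ValueError(
                f"row {row!r} has {len(row)} values, expected {len(fields)}"
            )
    return {
        "schema": [{"type": tensor_type, "fields": list(fields)}],
        "datum": [
            {"tensors": [{tensor_type: {"data": list(row)}}]} for row in rows
        ],
        "configs": {"reqid": reqid},
    }


fields = ["TemperatureCelcius", "ExhaustVacuumHg",
          "AmbientPressureMillibar", "RelativeHumidity"]
rows = [[33.2, 60.5, 1010.7, 30], [29.6, 62, 998.5, 40.62]]
print(json.dumps(build_score_request(fields, rows, "123-456-789"), indent=4))
```

Checking each row length against the schema up front catches mismatched feature counts before the request is sent.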
Scoring against the model is as simple as passing this as the body to the score API
http http://${IP}:${PORT}/samples/energy_output/0/score < tests/basicapi-score-1.json
{
    "Prediction": {
        "datum": [
            { "tensors": [ { "Float64": { "data": [ 438.18238719825354 ] } } ] },
            { "tensors": [ { "Float64": { "data": [ 437.5805895467613 ] } } ] }
        ],
        "fqrv": {
            "contract": {
                "contractNumber": 0,
                "organization": "samples",
                "project": "energy_output"
            },
            "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"
        },
        "schema": [
            { "fields": [ "Regression" ], "type": "Float64" }
        ]
    }
}
ForestFlow returns a prediction for each row in addition to the FQRV of the Servable that responded with this prediction.
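To use the predictions downstream, you can unpack the response into a list of values plus the release version that answered. This is a minimal sketch, assuming each datum carries a single Float64 tensor with one value, as in the response above; `extract_predictions` is an illustrative name.

```python
import json


def extract_predictions(response_text: str):
    """Return (per-row prediction values, release version of the responder)."""
    prediction = json.loads(response_text)["Prediction"]
    values = [
        datum["tensors"][0]["Float64"]["data"][0]
        for datum in prediction["datum"]
    ]
    return values, prediction["fqrv"]["releaseVersion"]


# Trimmed copy of the response shown above.
body = json.dumps({
    "Prediction": {
        "datum": [
            {"tensors": [{"Float64": {"data": [438.18238719825354]}}]},
            {"tensors": [{"Float64": {"data": [437.5805895467613]}}]},
        ],
        "fqrv": {
            "contract": {"contractNumber": 0, "organization": "samples",
                         "project": "energy_output"},
            "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122",
        },
        "schema": [{"fields": ["Regression"], "type": "Float64"}],
    }
})
values, release = extract_predictions(body)
print(values, release)
```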
ForestFlow allows multiple Servables to be deployed under the same Contract. A routing strategy determines which Servable responds to user requests and whether the remaining Servables shadow the inference request for logging and performance monitoring purposes. See the section on Creating a Contract and routing for more details.
-
We can also inspect some stats ForestFlow collects about the use of Servables within a Contract
http http://${IP}:${PORT}/samples/energy_output/0/stats
[
    {
        "ServableMetrics": {
            "becameValidAtMS": "1570054677902",
            "createdAtMS": "1570054677899",
            "currentPhaseInPct": 100,
            "fqrv": {
                "contract": {
                    "contractNumber": 0,
                    "organization": "samples",
                    "project": "energy_output"
                },
                "releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"
            },
            "scoreCount": "1",
            "shadeCount": "0"
        }
    }
]
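Note that the timestamps and counters in this response arrive as strings. The sketch below converts them into native types, assuming the `*AtMS` fields are epoch milliseconds (as the names suggest) and the field names match the sample above; `summarize_stats` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def summarize_stats(response_text: str) -> list:
    """Convert each ServableMetrics entry into typed Python values."""
    summaries = []
    for entry in json.loads(response_text):
        metrics = entry["ServableMetrics"]
        created_ms = int(metrics["createdAtMS"])
        valid_ms = int(metrics["becameValidAtMS"])
        summaries.append({
            "releaseVersion": metrics["fqrv"]["releaseVersion"],
            # Assumption: the value is milliseconds since the Unix epoch.
            "createdAt": datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc),
            "msUntilValid": valid_ms - created_ms,
            "scoreCount": int(metrics["scoreCount"]),
            "phaseInPct": metrics["currentPhaseInPct"],
        })
    return summaries


# Trimmed copy of the stats response shown above.
body = json.dumps([{
    "ServableMetrics": {
        "becameValidAtMS": "1570054677902",
        "createdAtMS": "1570054677899",
        "currentPhaseInPct": 100,
        "fqrv": {"releaseVersion": "StackedEnsemble_AllModels_AutoML_20191002_103122"},
        "scoreCount": "1",
        "shadeCount": "0",
    }
}])
for summary in summarize_stats(body):
    print(summary)
```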
Finally, you can also build and run ForestFlow in a container (docker, podman) and we provide scripts to help with this process. See Creating an OCI-compliant Image