Run your Keras models in C++ Tensorflow

So you’ve built an awesome machine learning model in Keras and now you want to run it natively through Tensorflow. This tutorial will show you how. All of the code in this tutorial is available in the accompanying git repo, which you may want to clone to follow along.

Keras is a wonderful high-level framework for building machine learning models, and it can use multiple backends such as Tensorflow or Theano. When you save a Keras model with the .save method, it is serialized to the HDF5 format. Tensorflow works with Protocol Buffers, and therefore loads and saves .pb files. This tutorial demonstrates how to:

  • build a SIMPLE Convolutional Neural Network in Keras for image classification
  • save the Keras model as an HDF5 model
  • verify the Keras model
  • convert the HDF5 model to a Protocol Buffer
  • build a Tensorflow C++ shared library
  • utilize the .pb in a pure Tensorflow app
    • We will utilize Tensorflow’s own example code for this

I am conducting this tutorial on Linux Mint 18.1, using GPU accelerated Tensorflow version 1.1.0 and Keras version 2.0.4. I have run this on Tensorflow v.1.3.0 as well.


A NOTE ABOUT WINDOWS: Everything here SHOULD work on Windows as well until we reach C++. Building Tensorflow on Windows is a bit different (and to this point a bit more challenging) and I haven’t fully vetted the C++ portion of this tutorial on Windows yet.  I will update this post upon vetting Windows.


Assumptions

  • You are familiar with Python (and C++ if you’re interested in the C++ portion of this tutorial)
  • You are familiar with Keras and Tensorflow and already have your dev environment set up
  • The example code is written for Python 3.5; if you are using 2.7 you may have to make modifications

Get a dataset

I’m assuming that if you’re interested in this topic you probably already have some image classification data. You may use that or follow along with this tutorial, where we use the flowers data from the Tensorflow examples. It’s a download of about 218 MB.

After extracting the data you should see one folder per category. There are 5 categories and the data is pre-sorted into test and train.
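
As a quick sanity check on the extracted data, a small sketch like the one below can count the images in each category folder. It assumes a layout of one sub-folder per category inside each split directory (for example train/ and test/); the function name and the list of extensions are illustrative choices, not part of the original code.

```python
import os

# Assumed set of image extensions; adjust for your own data.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(split_dir):
    """Return {category_name: image_count} for one split directory."""
    counts = {}
    for category in sorted(os.listdir(split_dir)):
        category_dir = os.path.join(split_dir, category)
        if not os.path.isdir(category_dir):
            continue  # skip stray files such as a license or labels file
        counts[category] = sum(
            1
            for name in os.listdir(category_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_images.py path/to/train path/to/test
    for split in sys.argv[1:]:
        print(split, count_images(split))
```

If the counts come back empty or a category is missing, fix the folder layout before training.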

Train your model

I will use a VERY simple CNN for this example, however the techniques to port the models work equally well with the built-in Keras models such as Inception and ResNet. I have no illusions that this model will win any awards, but it will serve our purpose.

There are a few things to note from the code listed below:

  • Label your input and output layer(s) – this will make it easier to debug when the model is converted.
  • I’m relying on the Model Checkpoint to save my .h5 files – you could also just call the model’s .save method after the training is complete.
  • Make note of the shape parameter you use; we will need it when we run the model later.
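
To make those notes concrete, here is a minimal sketch of a small Keras 2.x model with a named first layer and a checkpoint callback. The layer sizes, the 3-channel input, the output layer name and the file name model.h5 are assumptions for illustration, not the architecture from the repo; only the layer name “firstConv2D”, the 80×80 input size and the 5 categories come from this tutorial.

```python
# Assumed values: 80x80 matches the size used in this tutorial,
# 3 channels (RGB) is an assumption.
INPUT_SHAPE = (80, 80, 3)  # height, width, channels
NUM_CLASSES = 5            # the flowers set has 5 categories


def build_model(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES):
    """Build a deliberately tiny CNN with explicitly named layers."""
    # Imported lazily so the constants can be reused without Keras installed.
    from keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
    from keras.models import Sequential

    model = Sequential()
    # Naming the first layer makes the graph input show up as
    # "firstConv2D_input" after conversion.
    model.add(Conv2D(32, (3, 3), activation="relu",
                     input_shape=input_shape, name="firstConv2D"))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    # "predictions" is a hypothetical name for the output layer.
    model.add(Dense(num_classes, activation="softmax", name="predictions"))
    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model


def make_checkpoint(path="model.h5"):
    """Save the best .h5 seen so far, judged by validation accuracy."""
    from keras.callbacks import ModelCheckpoint

    return ModelCheckpoint(path, monitor="val_acc", save_best_only=True)
```

The callback returned by make_checkpoint would be passed in the callbacks list when fitting the model; the input shape used here is the one to remember for the later steps.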

I down-sampled the imagery significantly and ran the model more than I needed to, but here was the command I ran (NOTE: I ran this on some old hardware using GPU acceleration on an NVIDIA GTX-660 – you can probably increase the batch size significantly assuming you have better hardware):

A few runs of this yielded val_acc in the 83-86% range, and while it’s no Inception, it’s good enough for this exercise.

Test your model

So now let’s just do a quick gut-check on our model – here’s a small script to load your model, image, shape and indices (especially if you didn’t use the flowers set):
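
A minimal sketch of such a check is shown below. It assumes the class indices are available as a label-to-index dictionary (the form Keras exposes as class_indices when reading images from directories) and that training rescaled pixels to the 0-1 range; both are assumptions, so match them to your own training code.

```python
def top_predictions(probabilities, class_indices, k=3):
    """Pair each label with its score and return the k best, highest first.

    class_indices maps label -> output column index.
    """
    index_to_label = {index: label for label, index in class_indices.items()}
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1],
                    reverse=True)
    return [(index_to_label[index], score) for index, score in ranked[:k]]


def classify(model_path, image_path, target_size, class_indices):
    """Load an .h5 model, run one image through it, and rank the labels."""
    # Lazy imports: these need Keras and NumPy installed.
    import numpy as np
    from keras.models import load_model
    from keras.preprocessing import image

    model = load_model(model_path)
    img = image.load_img(image_path, target_size=target_size)  # (height, width)
    # Assumption: training divided pixel values by 255.
    batch = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
    return top_predictions(model.predict(batch)[0], class_indices)
```

The ranking helper is kept separate so it can be checked without a trained model.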

Here are a few examples of my runs for reference:

Alright – not too bad, now for the fun part.

Convert from HDF5 to .pb

Attribution: This script was adapted from

I adapted the notebook from the link above to a script we can run from the command line. The code is almost identical except for the argument parsing. This code does the following:

  • Loads your .h5 file
  • Replaces your output tensor(s) with a named Identity Tensor – this can be helpful if you are using a model you didn’t build and don’t know all of the output names (of course you could go digging, but this avoids that).
  • Saves an ASCII representation of the graph definition. I use this to verify my input and output names for Tensorflow. This can be useful in debugging.
  • Converts all variables within the graph to constants.
  • Writes the resulting graph to the output name you specify in the script.

With that said here’s the code:
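
What follows is a condensed sketch of the five steps listed above, not the adapted script itself. It assumes Keras 2.x on a Tensorflow 1.x backend (the graph utilities used here belong to the 1.x API), the “.ascii” suffix for the text dump is my own choice, and the function names are hypothetical.

```python
def output_node_names(prefix, num_outputs):
    """Names for the Identity ops, e.g. k2tfout_0, k2tfout_1, ..."""
    return ["{}_{}".format(prefix, i) for i in range(num_outputs)]


def freeze_keras_model(h5_path, out_dir, pb_name, prefix="k2tfout"):
    """Convert a Keras .h5 model into a frozen Tensorflow .pb graph."""
    # Lazy imports: requires Keras 2.x with a Tensorflow 1.x backend.
    import tensorflow as tf
    from keras import backend as K
    from keras.models import load_model
    from tensorflow.python.framework import graph_io, graph_util

    K.set_learning_phase(0)      # inference mode (disables dropout and friends)
    model = load_model(h5_path)  # step 1: load the .h5 file

    # Step 2: hang a named Identity op off every model output.
    names = output_node_names(prefix, len(model.outputs))
    for name, tensor in zip(names, model.outputs):
        tf.identity(tensor, name=name)

    sess = K.get_session()
    graph_def = sess.graph.as_graph_def()

    # Step 3: text dump of the graph, handy for checking node names.
    graph_io.write_graph(graph_def, out_dir, pb_name + ".ascii", as_text=True)

    # Step 4: replace variables with constants holding the trained weights.
    frozen = graph_util.convert_variables_to_constants(sess, graph_def, names)

    # Step 5: write the binary .pb.
    graph_io.write_graph(frozen, out_dir, pb_name, as_text=False)
    return names
```

Returning the generated output names makes it easy to print them for use in the later steps.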

When you run the code, it’s important that you make the prefix name unique to the graph. If you didn’t build the graph, using something like “output” or some other generic name has the potential of colliding with a node of the same name within the graph. I recommend making the prefix name uniquely identifiable, and for that reason this script defaults the prefix to “k2tfout” though you can override that with whatever you prefer.

And now let’s run this little guy on our trained model.

As you can see, two files were written out: an ASCII file and a .pb file. Let’s look at the graph structure. Notice the input node name “firstConv2D_input” and the output name “k2tfout_0”; we will use those in the next section:
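
If you would rather not scroll through the ASCII file by hand, a rough sketch like this can pull out the candidate node names. It assumes the usual text layout of a graph definition (top-level “node {” blocks with two-space indented name and op fields) and it may also list extra placeholders that Keras adds, so treat the result as a starting point rather than a definitive answer.

```python
import re

# Matches only the fields indented directly under a top-level node block.
NODE_FIELD = re.compile(r'^  (name|op): "([^"]*)"$')


def list_nodes(pbtxt_text):
    """Return [(name, op), ...] for each top-level node block."""
    nodes = []
    current = None
    for line in pbtxt_text.splitlines():
        if line == "node {":
            current = {}
        elif line == "}" and current is not None:
            nodes.append((current.get("name"), current.get("op")))
            current = None
        elif current is not None:
            match = NODE_FIELD.match(line)
            if match:
                current[match.group(1)] = match.group(2)
    return nodes


def find_io_nodes(pbtxt_text, prefix="k2tfout"):
    """Guess inputs (Placeholder ops) and outputs (prefixed Identity ops)."""
    nodes = list_nodes(pbtxt_text)
    inputs = [name for name, op in nodes if op == "Placeholder"]
    outputs = [name for name, op in nodes
               if op == "Identity" and name and name.startswith(prefix)]
    return inputs, outputs
```

Called on the contents of the ASCII file, find_io_nodes returns two lists of names to try as the input and output layers.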

Use Tensorflow’s label_image examples

The remainder of this tutorial will heavily leverage Tensorflow’s image recognition examples. Specifically, the label_image example for Python and the label_image example for C++.

I copied both of those files into the git repo for this tutorial. Now let’s test them out.

Running your Tensorflow model with Python

Running the Python script is fairly straightforward. Remember, we need to supply the following arguments:

  • the output_graph.pb we generated above
  • the labels file – this is supplied with the dataset but you could generate a similar labels.txt from the indices.txt file we produced in our Keras model training
  • input width and height. Remember I trained with 80×80 so I must adjust for that here
  • The input layer name – I find this in the generated ASCII file from the conversion we did above. In this case it is “firstConv2D_input” – remember, we named the first layer “firstConv2D”.
  • The output layer name – We created this with the prefix and can verify it in our ASCII file. We went with the script default, which gave us “k2tfout_0”.
  • Finally, the image we want to process.

Let’s try it:
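
The argument list above can be assembled programmatically, as in the sketch below. The flag names are the ones I recall from Tensorflow’s label_image examples, so confirm them against your copy of the script with --help; the file names in the usage note are hypothetical.

```python
def label_image_args(graph, labels, image, size, input_layer, output_layer):
    """Build the flag list for a label_image run.

    size is (height, width) and must match the shape used for training.
    """
    height, width = size
    return [
        "--graph={}".format(graph),
        "--labels={}".format(labels),
        "--input_height={}".format(height),
        "--input_width={}".format(width),
        "--input_layer={}".format(input_layer),
        "--output_layer={}".format(output_layer),
        "--image={}".format(image),
    ]


if __name__ == "__main__":
    import subprocess
    import sys

    # Hypothetical file names; substitute your own paths.
    args = label_image_args("output_graph.pb", "labels.txt", "rose.jpg",
                            (80, 80), "firstConv2D_input", "k2tfout_0")
    subprocess.call([sys.executable, "label_image.py"] + args)
```

Keeping the flags in one helper means the same values can be reused for the compiled C++ example later on.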

So now Tensorflow is running our model in Python – but how do we get to C++?

Running your Tensorflow model with C++

If you are still reading then I’m assuming you need to figure out how to run Tensorflow in a production environment on C++. This is where I landed, and I had to bounce between fragments of tutorials to get things to work. Hopefully the information here will give you a consolidated view of how to accomplish this.

For my project, I wanted to have a Tensorflow shared library that I could link and deploy. That’s what we’ll build here, and then we’ll build the label_image example against it.

To run our models in C++ we first need to obtain the Tensorflow source tree. The official instructions cover this, but we’ll walk through the steps below.

Now that we have the source code, we need the tools to build it. On Linux or Mac, Tensorflow uses Bazel. Windows uses CMake (I tried using Bazel on Windows but was not able to get it to work). Bazel’s installation instructions cover both Linux and Mac, but we’ll walk through the Linux steps below. Of course there may be some other dependencies, but I’m assuming that if you’re taking on building Tensorflow, this isn’t your first rodeo.

Install Python dependencies (I’m using 3.x, for 2.x omit the ‘3’)

Install JDK 8, you can use either Oracle or OpenJDK. I’m using openjdk-8 on my system (in fact I think I already had it installed). If you don’t, simply type:

NOTE: I have not tested building with CUDA – this is just from the documentation that I’ve read. For deployment I didn’t want to build with CUDA; however, if you do, then you of course need the CUDA SDK and cuDNN from NVIDIA. You’ll also need to grab libcupti-dev.

Next, let’s install Bazel:

At this point you should be able to run bazel help and get feedback:

Now that we have everything installed, we can configure and build. Make sure you’re in the top-level tensorflow directory. I went with all the default configuration options. When you do this, the configuration tool will download a bunch of dependencies – this may take a minute or two.

Alright, we’re configured, now it’s time to build. DISCLAIMER: I AM NOT a Bazel guru. I found these settings via Google-Fu and digging around in the configuration files. I could not find a way to get Bazel to dump the available targets. Most tutorials I saw are about building the pip target – however, I wanted to build a .so. I started looking through the BUILD files to find targets and found these in tensorflow/tensorflow/BUILD

So with that in mind here’s the command for doing this (you may want to alter jobs based on your number of cores and RAM – also you can remove avx, mfpmath and msse4.2 optimizations if you wish):
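
To show how the pieces of such a command fit together, here is a small sketch that assembles a Bazel invocation from the number of jobs and the optional CPU flags. The target is left as a required argument because it depends on what you found in the BUILD file, and the exact flag values (for example -mfpmath=both) are assumptions rather than the original command.

```python
import os


def bazel_build_command(target, jobs=None,
                        cpu_flags=("-mavx", "-mfpmath=both", "-msse4.2")):
    """Return the argv list for an optimized Bazel build of one target.

    jobs defaults to the number of CPU cores; lower it if you run out of RAM.
    Pass cpu_flags=() to drop the CPU-specific optimizations.
    """
    if jobs is None:
        jobs = os.cpu_count() or 1
    command = ["bazel", "build", "--jobs={}".format(jobs), "-c", "opt"]
    command += ["--copt={}".format(flag) for flag in cpu_flags]
    command.append(target)
    return command


if __name__ == "__main__":
    import sys

    # Usage: python bazel_cmd.py <target label from the BUILD file>
    print(" ".join(bazel_build_command(sys.argv[1])))
```

Printing the command instead of running it lets you review the flags before committing to a long build.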

Go get some coffee, breakfast, lunch or watch a show. This will grind for a while, but you should end up with bazel-bin/tensorflow/

Let’s run it!

In the tutorial git repo, I’m including a qmake .pro file that links the .so and all of the required header locations. I’m including it for reference – you DO NOT need qmake to build this. In fact, I’m including the g++ commands to build. You may have to adjust for your environment. Assuming you’re building main.cpp from the root of the repo, and the tensorflow build we just created was cloned and built in the same directory, all paths should be relative and work out of the box.

Here are the commands:

You should now have an executable in your directory, and we can test our application:

So there we have it. I did notice that the percentages aren’t exactly the same as when I ran the Keras models directly. I’m not sure if this is a difference in compiler settings or if Keras is overriding some calculations. But this gets you well on your way to running a Keras model in C++.

I hope this was useful to you.