Training the model for iOS CoreML in Google Colab 60 times faster*

Nick Zamosenchuk
4 min read · Oct 25, 2018


Updated: May 2019

The new iOS project required machine learning to detect certain objects in the scene captured by the camera and trigger a process.

This type of task is solved with convolutional neural networks (CNNs), specifically “You Only Look Once” (YOLO) architectures. Several model architectures perform exceptionally well in this area, and better yet, there are pre-trained models, allowing you to benefit from transfer learning.

When it comes to mobile development, you don’t even need to know what a CNN is or how it works; all you need is the TuriCreate Python package (see references [1] and [2] below) and a set of labeled images to train your model.

Training with Google Colab and an NVIDIA Tesla K80

Let’s assume you are already familiar with TuriCreate and have a decent set of labeled data to start training your model. You could use your iMac, and if you run macOS 10.14, you can try leveraging its GPU. In reality, unfortunately, you cannot (see the GitHub issue). This is what happened to me: I had to train the model on the CPU, and it took forever.

There are several options for leveraging an external GPU. You can either set up a complete development machine on Google Cloud Platform or Amazon with a GPU attached, install Python and Jupyter, and so on, or you can use Google Colaboratory. The latter is an amazing gift to the data-science community from Google: even though it’s just a Jupyter notebook, it’s actually much more. It is a complete, fully managed environment that even includes a Tesla K80 GPU (free at the moment! Though I would not mind paying, as it’s a great service).

There is a catch, however. TuriCreate requires MXNet 1.1.0 (the current 1.3.0 is not compatible), while the Google Colaboratory instance behind the Jupyter notebook runs CUDA 9.2 (NVIDIA’s computation framework). Unfortunately, the older MXNet 1.1.0 build with CUDA 9.x support is slightly broken and causes:

ImportError for mxnet: cannot import name libinfo

TuriCreate and Google Colaboratory: Solution

It really takes a bit of time to set things up, but you only need to copy the snippets below. Let’s begin:

0. Enable the GPU in the Colab environment settings (Runtime → Change runtime type → Hardware accelerator → GPU).
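
It is worth confirming that the runtime actually got a GPU before installing anything. The nvidia-smi tool ships with the NVIDIA driver on the Colab instance and should list a Tesla K80:

# Confirm the notebook actually sees the GPU
!nvidia-smi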

1. Completely uninstall CUDA 9.2 from the Colab backing instance (takes a few minutes).
# Remove CUDA 9 completely
!apt-get --purge remove cuda nvidia* libnvidia-*
!dpkg -l | grep cuda- | awk '{print $2}' | xargs -n1 dpkg --purge
!apt-get remove cuda-*
!apt autoremove
!apt-get update

2. Install CUDA 8.0 on the Colab Ubuntu backing instance. It will ask you to agree to changing the apt sources list; hit Y to continue (updated: Nov ’18).

# Install CUDA 8
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
!dpkg -i --force-overwrite cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
!apt-get update
!apt-get install cuda-8-0
# The install will fail; we need to force dpkg to overwrite the configuration file
!wget http://archive.ubuntu.com/ubuntu/pool/main/m/mesa/libglx-mesa0_18.0.5-0ubuntu0~18.04.1_amd64.deb
!dpkg -i --force-overwrite libglx-mesa0_18.0.5-0ubuntu0~18.04.1_amd64.deb
!wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/nvidia-410_410.48-0ubuntu1_amd64.deb
!dpkg -i --force-overwrite nvidia-410_410.48-0ubuntu1_amd64.deb
!apt --fix-broken install
!apt-get install cuda-8-0
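
Before moving on, you can verify that the downgrade worked. The cuda-8-0 package installs its toolkit under /usr/local/cuda-8.0 (the standard location), so querying the compiler version should report release 8.0:

# Verify that CUDA 8.0 is the installed toolkit
!/usr/local/cuda-8.0/bin/nvcc --version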

3. Install the TuriCreate package, uninstall the stock mxnet, and install mxnet-cu80.

!pip install turicreate
# The wrong version of MXNet will be installed.
!pip uninstall -y mxnet
# Install the CUDA 8-compatible version of MXNet 1.1.0
!pip install mxnet-cu80==1.1.0
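
A quick sanity check catches a broken CUDA/MXNet pairing early. This minimal sketch just allocates a small array on the GPU; it will raise an error right here if mxnet-cu80 cannot talk to the driver:

# Sanity check: import MXNet and allocate a small array on the GPU
import mxnet as mx
print(mx.__version__)   # should print 1.1.0
a = mx.nd.ones((2, 3), ctx=mx.gpu(0))
print(a.asnumpy())      # fails here if the CUDA/MXNet pairing is broken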

4. Install the package that mounts Google Drive as a file-system folder on the backing Ubuntu instance, so that you can access your SFrame. This step will ask you to grant access to Google Drive from Colab and to copy the auth tokens (twice).

# Mount Google Drive as an OS-level filesystem.
# 1. Install the necessary tooling
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
# Create a directory to mount gdrive
!mkdir -p drive
!google-drive-ocamlfuse drive
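
If the mount succeeded, your Drive contents now appear under drive/:

# List the mounted Drive to confirm the mount worked
!ls drive/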

5. Upload your SFrame (TuriCreate dataset) to Google Drive.
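
If you haven’t packaged your data yet: an object-detection SFrame is just a table with an image column and an annotations column of bounding boxes. Below is a minimal sketch of building one locally before uploading; the folder path, label, and coordinates are placeholders you would replace with your own labeling output:

import turicreate as tc

# Load raw images into an SFrame (adds an 'image' column); the path is a placeholder
data = tc.image_analysis.load_images('my_images/', with_path=True)

# Each row needs a list of bounding boxes in TuriCreate's annotation format:
# a dict with a 'label' and center-based 'coordinates'
data['annotations'] = [[{
    'label': 'my_object',  # placeholder label
    'coordinates': {'x': 100, 'y': 120, 'width': 80, 'height': 60},
}] for _ in range(len(data))]  # placeholder: same box for every image

# Save the SFrame, then upload the resulting folder to Google Drive
data.save('labled_data.sframe')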

6. Import TuriCreate, configure the GPUs, load the SFrame, and start training your model. Then export the model, but don’t forget to export it to drive/… so it is stored safely in Google Drive and not deleted along with the backing instance.

# Import turicreate and configure the GPUs
import turicreate as tc
tc.config.set_num_gpus(-1)
# Load SFrame
data = tc.SFrame('drive/Colab Notebooks/labled_data.sframe')
# Make a train-test split
train_data, test_data = data.random_split(0.9)
# Create and train model
model = tc.object_detector.create(train_data, feature='image', ...)  # fill in your remaining parameters
model.evaluate(test_data)
model.export_coreml('drive/Colab Notebooks/MyModel.mlmodel')
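
If you want to look at the numbers rather than discard them, evaluate() returns a dictionary of metrics that you can print or log before exporting:

# Inspect the evaluation metrics instead of discarding them
metrics = model.evaluate(test_data)
print(metrics)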

7. Optionally, reduce the size of the final model by converting the weights to half precision (see [3]).

!pip install -U coremltools
!pip uninstall -y tensorflow
!pip install tensorflow==1.5.0
import coremltools
model_spec = coremltools.utils.load_spec('drive/Colab Notebooks/MyModel.mlmodel')
model_fp16_spec = coremltools.utils.convert_neural_network_spec_weights_to_fp16(model_spec)
coremltools.utils.save_spec(model_fp16_spec, 'drive/Colab Notebooks/MyModel.16bit.mlmodel')
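
A quick way to see what the conversion saved, since both files sit in the mounted Drive folder:

# Compare the sizes of the full- and half-precision models
!ls -lh 'drive/Colab Notebooks/'*.mlmodel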

And the speed, you ask? In my case, training is around 60x faster than on the CPU.

Conclusion

Configuring Colab for TuriCreate is, of course, not everyone’s cup of tea, but the ease of use and the performance pay back every minute spent on setup. Hopefully, Apple will update TuriCreate to the newer MXNet 1.3, and the CUDA dependency issues will be resolved.

Enjoy your ML model!

References

  1. WWDC 2018, TuriCreate presentation: https://developer.apple.com/videos/play/wwdc2018/712/
  2. TuriCreate repository: https://github.com/apple/turicreate
  3. Reducing the Size of Your Core ML App: https://developer.apple.com/documentation/coreml/reducing_the_size_of_your_core_ml_app
