Why you should choose ML Kit over Mobile Vision

In a previous blog post, we explained how developers can use the Barcode Scanning module of Google’s ML Kit to scan QR codes from the camera or from images imported from the gallery. In this post, we compare the functionality and implementation of ML Kit with those of Mobile Vision.

Functional comparison

To the end user, there’s likely no discernable difference between the two technologies. Even the ML Kit documentation states they are essentially the same: “The Barcode scanning, Text recognition and Face detection APIs provide the same functionality and capabilities as their Mobile Vision counter-parts”. However, this doesn’t mean there are no functional benefits to choosing ML Kit. It’s important to read the documentation further, where it goes on to state: “Migrating to ML Kit ensures your application benefits from the latest bug fixes and improvements to the APIs, including updated ML models and hardware acceleration”.

Improvements of ML Kit over Mobile Vision

Let’s go over some of the improvements.

Improved recall. ML Kit correctly detects a larger share of the barcodes that are actually present, across all supported formats. For example, ML Kit 16.0.0 added support for detecting PDF417 barcodes with broken start/stop patterns.
Tolerance for low-quality images. Barcode detection is very sensitive to image quality, so better tolerance directly improves usability for people with lower-quality cameras.
Long-tail latency. The latency experienced in the slowest 1% of scans has improved. This matters because latency pain points often leave a lasting impression on the users who hit them.
Bounding box stability. The boundary drawn around a detected object fluctuates less between frames, reducing the effect commonly known as jitter.
Integration with CameraX and Camera2. These are two Android camera libraries, each with benefits for different use cases. ML Kit works with either one, which gives us more flexibility.
Support for Android Jetpack Lifecycle. Lifecycle support simplifies handling changes such as screen rotations, which are always a pain to deal with otherwise.
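To make the long-tail latency point concrete, here is a minimal sketch of how a 99th-percentile latency could be computed from your own measurements. The function name, the nearest-rank method, and the sample values are illustrative assumptions and are not taken from ML Kit.

```kotlin
import kotlin.math.ceil

/**
 * Returns the latency at the given percentile using the nearest-rank method.
 * `samplesMs` holds hypothetical per-scan latencies in milliseconds.
 */
fun percentile(samplesMs: List<Long>, percentile: Double): Long {
    require(samplesMs.isNotEmpty()) { "need at least one sample" }
    require(percentile > 0.0 && percentile <= 100.0) { "percentile must be in (0, 100]" }
    val sorted = samplesMs.sorted()
    // Nearest rank: the smallest value with at least `percentile` percent
    // of the samples at or below it.
    val rank = ceil(percentile * sorted.size / 100.0).toInt()
    return sorted[rank - 1]
}

fun main() {
    // 98 fast scans and 2 slow ones (made-up numbers).
    val samples = List(98) { 40L } + listOf(900L, 1200L)
    println("p50 = ${percentile(samples, 50.0)} ms")
    println("p99 = ${percentile(samples, 99.0)} ms")
}
```

The median hides the two slow scans entirely, while the 99th percentile surfaces them, which is why tail latency is worth tracking separately.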

Code comparison

The differences between Mobile Vision and ML Kit aren’t limited to functionality. In this section we’ll look at some key differences in the code used to implement them.

Embedding the camera into the fragment.

In ML Kit we do it as follows:

cameraProvider?.unbindAll()
cameraProvider?.bindToLifecycle(
  lifecycleOwner,
  cameraSelector,
  preview,
  imageCapture,
  imageAnalyzer
)
preview?.setSurfaceProvider(
  finderView.surfaceProvider
)

Here preview is the CameraX Preview use case, and finderView is an instance of PreviewView defined in the fragment’s layout. This PreviewView is where the camera feed will be shown.
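For context, the objects used in the binding call have to be created first. The following is a minimal sketch of one way to do that with CameraX. The helper name startCamera and its parameters are hypothetical, the back camera is an assumption, and the imageCapture use case from the snippet above is left out for brevity.

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner
import java.util.concurrent.Executors

// Hypothetical helper: wires a PreviewView and an analyzer to the back camera.
fun startCamera(
    context: Context,
    lifecycleOwner: LifecycleOwner,
    finderView: PreviewView,
    analyzer: ImageAnalysis.Analyzer
) {
    val providerFuture = ProcessCameraProvider.getInstance(context)
    providerFuture.addListener({
        val cameraProvider = providerFuture.get()

        // Preview use case: renders camera frames into the PreviewView.
        val preview = Preview.Builder().build()
        preview.setSurfaceProvider(finderView.surfaceProvider)

        // Analysis use case: hands frames to the analyzer on a background thread
        // and drops stale frames instead of queueing them.
        val imageAnalyzer = ImageAnalysis.Builder()
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .build()
        imageAnalyzer.setAnalyzer(Executors.newSingleThreadExecutor(), analyzer)

        // Binding to the lifecycle owner lets CameraX open and close the camera for us.
        cameraProvider.unbindAll()
        cameraProvider.bindToLifecycle(
            lifecycleOwner,
            CameraSelector.DEFAULT_BACK_CAMERA,
            preview,
            imageAnalyzer
        )
    }, ContextCompat.getMainExecutor(context))
}
```

In a real fragment you would probably keep a reference to the executor and shut it down when the view is destroyed. The sketch skips that to stay short.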

In Mobile Vision the following is needed:

var surfaceViewCallback: SurfaceHolder.Callback? = object : SurfaceHolder.Callback {
  override fun surfaceCreated(holder: SurfaceHolder) {
    initCamera()
  }

  override fun surfaceChanged(holder: SurfaceHolder, format: Int, width: Int, height: Int) {}

  override fun surfaceDestroyed(holder: SurfaceHolder) {
    releaseResources()
  }
}

surfaceView.apply {
  holder.addCallback(surfaceViewCallback)
}

Here surfaceView is an instance of SurfaceView, defined in the layout XML, which is where the camera feed will be shown.

private fun initCamera() {
  surfaceView?.setWillNotDraw(false)
  try {
    cameraSource?.start(surfaceView.holder)
  } catch (e: IOException) {
    e.printStackTrace()
  }
}

Here we can see how much simpler it is to connect our view and camera when using ML Kit instead of Mobile Vision. This should come as no surprise considering the emphasis on lifecycle integration.

Scanning and detection.

In ML Kit we do the following:

fun detectInImage(image: InputImage): Task<List<Barcode>>? {
  return scanner?.process(image)
}

scanner is an instance of the BarcodeScanner interface.
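As a minimal sketch, a scanner restricted to QR codes could be obtained as follows. The function name createQrScanner is hypothetical, and the import path for Barcode assumes a recent version of the barcode-scanning library, since older releases placed the class in a different package.

```kotlin
import com.google.mlkit.vision.barcode.BarcodeScanner
import com.google.mlkit.vision.barcode.BarcodeScannerOptions
import com.google.mlkit.vision.barcode.BarcodeScanning
import com.google.mlkit.vision.barcode.common.Barcode

// Hypothetical factory: builds a scanner that only looks for QR codes,
// so it can skip the other barcode formats.
fun createQrScanner(): BarcodeScanner {
    val options = BarcodeScannerOptions.Builder()
        .setBarcodeFormats(Barcode.FORMAT_QR_CODE)
        .build()
    return BarcodeScanning.getClient(options)
}
```

BarcodeScanner is closeable, so it is reasonable to call close() on it once the scanning screen goes away.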

override fun analyze(imageProxy: ImageProxy) {
  val mediaImage = imageProxy.image
  if (isProcessingCode || mediaImage == null) {
    // Skip this frame, but close it so CameraX can deliver the next one.
    imageProxy.close()
    return
  }
  isProcessingCode = true
  detectInImage(InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees))
    ?.addOnSuccessListener { results ->
      onSuccess(results)
    }
    ?.addOnFailureListener { e ->
      onFailure(e)
    }
    ?.addOnCompleteListener {
      // Closing the ImageProxy also closes the underlying media image.
      imageProxy.close()
    }
}

imageProxy wraps the frame captured by the camera. That frame is passed to the scanner, which looks for a barcode in it (in this specific case, a QR code).

private fun onSuccess(results: List<Barcode>) {
  // Use the first detected barcode; keep scanning if there is none
  // or if it carries no raw value.
  val obtainedString = results.firstOrNull()?.rawValue?.trim()
  if (obtainedString != null) {
    qrReadPublisher.processDetectedQr(obtainedString)
    stop()
  } else {
    isProcessingCode = false
  }
}

QrReadPublisher is an interface implemented by the fragment.
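The interface itself is not shown in this post, so the following is only a sketch of what it might look like. Only the processDetectedQr call appears in the code above. The RecordingPublisher class and the publishFirst function are hypothetical stand-ins that let the contract be exercised without a fragment.

```kotlin
// Assumed shape of the callback interface; only processDetectedQr appears in the post.
interface QrReadPublisher {
    fun processDetectedQr(value: String)
}

// Hypothetical stand-in for the fragment: it simply records what it receives.
class RecordingPublisher : QrReadPublisher {
    val received = mutableListOf<String>()

    override fun processDetectedQr(value: String) {
        received += value
    }
}

// Mirrors the trimming step in onSuccess: publishes the first value if there is one.
// Returns true when something was published.
fun publishFirst(rawValues: List<String?>, publisher: QrReadPublisher): Boolean {
    val value = rawValues.firstOrNull()?.trim() ?: return false
    publisher.processDetectedQr(value)
    return true
}

fun main() {
    val publisher = RecordingPublisher()
    publishFirst(listOf("  https://example.com  "), publisher)
    println(publisher.received)
}
```

Keeping the publisher behind an interface means the analyzer does not need to know about the fragment, and it can be tested with a simple recorder like the one above.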

In Mobile Vision we do the following:

detector?.setProcessor(object: Detector.Processor<Barcode> {
  override fun release() {}

  override fun receiveDetections(detections: Detector.Detections<Barcode>) {
    if (!isProcessingCode) {
      val items = detections.detectedItems
      if (items.size() > 0) {
        isProcessingCode = true
        items.get(items.keyAt(0)).run {
          readQr(displayValue)
        }
        releaseResources()
      }
    }
  }
})

In this case ML Kit needs more code, which may raise some concerns, and rightly so. Generally, we want to reduce the amount of code we have, since that pushes us toward simpler, more elegant, and more idiomatic solutions. However, there’s a good reason for these extra lines and methods. The additional callbacks (addOnSuccessListener, addOnFailureListener, and addOnCompleteListener) let us handle each outcome separately, and moving the success and failure handling into helper methods keeps the analyzer readable.
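One detail worth considering with these callbacks is that the analyzer usually runs on a background executor, while Task listeners registered without an executor run on the main thread, so a flag like isProcessingCode is touched from more than one thread. A minimal sketch of a thread-safe alternative follows. The ScanGate class is a hypothetical helper and is not part of ML Kit or of the code above.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

/** Hypothetical helper that tracks whether a frame is currently being processed. */
class ScanGate {
    private val busy = AtomicBoolean(false)

    /** Returns true for exactly one caller until [release] is called. */
    fun tryAcquire(): Boolean = busy.compareAndSet(false, true)

    /** Reopens the gate, for example after an empty result or a failure. */
    fun release() = busy.set(false)
}

fun main() {
    val gate = ScanGate()
    println(gate.tryAcquire()) // the first frame is accepted
    println(gate.tryAcquire()) // the next frame is skipped while the first is in flight
    gate.release()             // the failure or empty-result path reopens the gate
    println(gate.tryAcquire()) // scanning continues
}
```

With a helper like this, the failure listener has an obvious job: call release() so that a single failed frame does not stop scanning altogether.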

Conclusion

Let’s recap what we just discussed:

ML Kit offers functional improvements because Google actively maintains it, which is no longer the case for Mobile Vision.
ML Kit also improves the implementation: the code is more readable, and lifecycle and other events are easier to handle.

Here at Qubika, we used ML Kit to handle QR code scanning in one of our clients’ Android applications. One issue we encountered was a recurring error about the availability of the scanning model. It is generally solved by retrying a moment later. A link to known issues is included below. In conclusion, ML Kit gives us an improved, constantly updated, and easier-to-implement alternative to Mobile Vision.
