Intro to Liveness Detection with React Native

Dec 28, 2020 • 25 min read

What is Liveness Detection, why would you need it and how to build an implementation with Expo's FaceDetector.

It's been over a year since the pandemic began and social distancing measures are still ongoing for good reason. Naturally, this has been quite the hardship for establishments that depend on the physical presence of customers. What it has also been is the leading factor pushing companies towards the adoption of a more digital operating model.

“Over the last few months, we’ve seen years-long digital transformation roadmaps compressed into days and weeks in order to adapt to the new normal as a result of COVID-19.” - Glenn Weinstein, CCO at Twilio

If a business has yet to digitalize their services, they're likely missing out on some revenue. However, different businesses have different concerns and as such, some may be quick to adapt while others remain wary. Financial institutions in particular are typically nervous about digital identity fraud.

Who led the digital transformation of your company? Covid-19

Are You a Robot?

Most banks still require in-person meetings to open an account. A few allow you to open one remotely, albeit with limitations on what it can do. The reason is risk. Your physical presence with documents up close is more trustworthy than an online application, where everything is easier to fake.

Enter eKYC (electronic Know Your Customer). It's the category of techniques and approaches to digital identity verification, one of which is Liveness Detection.

The Turing Test could be described as a challenge which determines if a machine could be mistaken for a human in a specific mode of interaction. In contrast, Liveness Detection is a test to uncover a machine that is pretending to be human.

In 2019, Facebook terminated 5.4 billion fake accounts. Adding a liveness test to an onboarding process would certainly reduce spambots.

But how would it prevent digital identity fraud? Unfortunately, it does not. Not on its own. When it comes to digital identity verification, there is no silver bullet.

Rather, liveness detection can make it harder for someone to use your identity. It's usually not used alone and can be part of a larger verification process, typically in the onboarding stage. For a high level of confidence, it's very important that this process covers all bases. Liveness is one of them.

What would that online onboarding process look like? Besides the registration form, it could consist of a liveness test, facial verification and document authenticity checks. A user gets a score for each test, and their overall risk level can be calculated. That's the 'typical' eKYC procedure.
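As a rough illustration of that last step, the sketch below combines per-check scores into one risk level. The check names, weights and cut-offs are made up for this example and are not taken from any real eKYC product.

```typescript
// Hypothetical scores in the range 0 (failed) to 1 (passed with confidence).
interface CheckScores {
  liveness: number
  faceMatch: number
  documentAuthenticity: number
}

// Assumed weights, for illustration only. They sum to 1.
const WEIGHTS: CheckScores = {
  liveness: 0.3,
  faceMatch: 0.4,
  documentAuthenticity: 0.3,
}

type RiskLevel = "low" | "medium" | "high"

function overallRisk(scores: CheckScores): RiskLevel {
  // Weighted average of the individual check scores.
  const confidence =
    scores.liveness * WEIGHTS.liveness +
    scores.faceMatch * WEIGHTS.faceMatch +
    scores.documentAuthenticity * WEIGHTS.documentAuthenticity

  // Assumed cut-offs, not tuned values.
  if (confidence >= 0.8) return "low"
  if (confidence >= 0.5) return "medium"
  return "high"
}
```

A weighted average is only one option. A stricter process might instead reject any applicant who fails a single check outright.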

However, there are different methods of liveness detection and not all of them are useful against identity fraud. Each has its own strengths and weaknesses when it comes to security, accessibility and user experience.

A Familiar Challenge

CAPTCHAs are a method of liveness detection capable of deterring simple bots. They're quite common on the web and come in different forms: text-based, image-based or even a simple slider. Google's reCAPTCHA can be a single button click. As a matter of fact, it works by tracking your browsing activity and assigning you a risk score. If you're a privacy-conscious user who clears all cookies, you're likely to be labeled as high risk. The improved user experience comes at the cost of your own privacy.

Meme making fun of a CAPTCHA. Please laugh.

We've all seen how impressive AI can be. It's not only getting more impressive, but also more accessible to learn and use. While that's great, it's not so great for CAPTCHAs, which advanced AIs can easily subvert.

“CAPTCHA tests may persist in this world, too. Amazon received a patent in 2017 for a scheme involving optical illusions and logic puzzles humans have great difficulty in deciphering. Called Turing Test via failure, the only way to pass is to get the answer wrong.” - Why CAPTCHAs have gotten so difficult, The Verge

CAPTCHAs are intended for catching the average spambot. They are not capable of identifying advanced or tailor-made AI. In a process that aims to shield against digital identity fraud, they wouldn't be adequate. So what's the alternative?

Face-based Liveness Detection

Face CAPTCHAs are, in fact, a thing. This liveness detection model boils down to requesting a selfie from the user and applying image processing algorithms to determine whether it's an image of a real person.

To give you an idea of how such a process would work, I built a proof of concept app in React Native, which I will also guide you through creating. The goal here is an introductory level app. In fact, it's not extremely complicated to spoof this implementation as I will demonstrate in the app discussion.

Guide

I value your time so if you want to jump straight to source code, find it here. Feel free to skip ahead to the App Discussion section as well.

Getting Started

Expo makes it easy to prototype and share code for React Native. It's my go-to if I want to build something fast. I also really prefer TypeScript, so the code here is going to have types.

Create a new project using the blank TypeScript template and cd into it:


$ expo init liveness-detection

  ? Choose a template:
  ----- Managed workflow -----
  blank                  a minimal app as clean as an empty canvas
> blank (TypeScript)     same as blank but with TypeScript configuration
  tabs                   several example screens and tabs using react-navigation
  ----- Bare workflow -----
  minimal                bare and minimal, just the essentials to get you started
  minimal (TypeScript)   same as minimal but with TypeScript configuration

We're going to need the following Expo modules for working with the camera, face detection and view masking:


$ expo install expo-camera expo-face-detector @react-native-community/masked-view react-native-svg

Additionally, we will use this view which abstracts away the circular progress animation:


$ npm i react-native-circular-progress

Now we're ready.


$ expo start

Initial screen with a blank expo project

User Interface

First comes the layout. The following pseudocode represents the final view hierarchy we're going to build:

<View>
  <MaskedView>
    <Camera>
      <AnimatedCircularProgress />
    </Camera>
  </MaskedView>
  <Instructions />
</View>

We will make the camera preview and mask cover the entire screen using absolute fill. The cutout shape depends on the styles of the mask element. We will use the following constant as a reference for its dimensions:

import { Dimensions } from "react-native"

const { width: windowWidth } = Dimensions.get("window")

const PREVIEW_SIZE = 325
const PREVIEW_RECT = {
  minX: (windowWidth - PREVIEW_SIZE) / 2,
  minY: 50,
  width: PREVIEW_SIZE,
  height: PREVIEW_SIZE,
}

Basically, a square at the horizontal center of the screen with a small offset from the top.

minX refers to the left margin and minY to the top margin. So maxX would refer to minX plus the width. I borrowed this naming from native iOS development as it made the most sense.
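To make the naming concrete, here is a small sketch of the derived edges. The maxX and maxY helpers are hypothetical and not part of the app's source, and the 375-wide window is an assumed example value.

```typescript
interface Rect {
  minX: number
  minY: number
  width: number
  height: number
}

// Hypothetical helpers for the right and bottom edges.
const maxX = (rect: Rect) => rect.minX + rect.width
const maxY = (rect: Rect) => rect.minY + rect.height

// Example: a 325 square centered in an assumed 375-wide window,
// offset 50 from the top.
const example: Rect = {
  minX: (375 - 325) / 2,
  minY: 50,
  width: 325,
  height: 325,
}

console.log(maxX(example)) // 350
console.log(maxY(example)) // 375
```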

Note that children of the camera component are drawn on top of the preview, and we want the circular progress to form a ring around the preview cutout. We will need to reference PREVIEW_RECT and PREVIEW_SIZE in several style objects.

import MaskedView from "@react-native-community/masked-view"
import { Camera } from "expo-camera"

import * as React from "react"
import { Dimensions, SafeAreaView, StyleSheet, Text, View } from "react-native"
import { AnimatedCircularProgress } from "react-native-circular-progress"

const { width: windowWidth } = Dimensions.get("window")

const PREVIEW_SIZE = 325
const PREVIEW_RECT = {
  minX: (windowWidth - PREVIEW_SIZE) / 2,
  minY: 50,
  width: PREVIEW_SIZE,
  height: PREVIEW_SIZE,
}

export default function App() {
  return (
    <SafeAreaView style={StyleSheet.absoluteFill}>
      <MaskedView
        style={StyleSheet.absoluteFill}
        maskElement={<View style={styles.mask} />}
      >
        <Camera
          style={StyleSheet.absoluteFill}
          type={Camera.Constants.Type.front}
        >
          <AnimatedCircularProgress
            style={styles.circularProgress}
            size={PREVIEW_SIZE}
            width={5}
            backgroundWidth={7}
            fill={0}
            tintColor="#3485FF"
            backgroundColor="#e8e8e8"
          />
        </Camera>
      </MaskedView>
      <View style={styles.instructionsContainer}>
        <Text style={styles.instructions}>Instructions</Text>
        <Text style={styles.action}>Action to perform</Text>
      </View>
    </SafeAreaView>
  )
}

const styles = StyleSheet.create({
  mask: {
    borderRadius: PREVIEW_SIZE / 2,
    height: PREVIEW_SIZE,
    width: PREVIEW_SIZE,
    marginTop: PREVIEW_RECT.minY,
    alignSelf: "center",
    backgroundColor: "white",
  },
  circularProgress: {
    width: PREVIEW_SIZE,
    height: PREVIEW_SIZE,
    marginTop: PREVIEW_RECT.minY,
    marginLeft: PREVIEW_RECT.minX,
  },
  instructions: {
    fontSize: 20,
    textAlign: "center",
    top: 25,
    position: "absolute",
  },
  instructionsContainer: {
    flex: 1,
    justifyContent: "center",
    alignItems: "center",
    marginTop: PREVIEW_RECT.minY + PREVIEW_SIZE,
  },
  action: {
    fontSize: 24,
    textAlign: "center",
    fontWeight: "bold",
  },
})

That's about it for the initial UI. But we forgot one important task — we need to handle camera permissions:

const [hasPermission, setHasPermission] = React.useState(false)

React.useEffect(() => {
  const requestPermissions = async () => {
    const { status } = await Camera.requestPermissionsAsync()
    setHasPermission(status === "granted")
  }
  requestPermissions()
}, [])

if (hasPermission === false) {
  return <Text>No access to camera</Text>
}

Face Detector

Let's get down to business. The face detector module integrates with the camera module using props. It's quite straightforward to configure:

import * as FaceDetector from "expo-face-detector"

// `onFacesDetected` callback should be defined inside `App()`.
// We will implement it later.

  <Camera
    style={StyleSheet.absoluteFill}
    type={Camera.Constants.Type.front}
    onFacesDetected={onFacesDetected}
    faceDetectorSettings={{
      mode: FaceDetector.Constants.Mode.fast, // ignore faces in the background
      detectLandmarks: FaceDetector.Constants.Landmarks.none,
      runClassifications: FaceDetector.Constants.Classifications.all,
      minDetectionInterval: 125,
      tracking: false
    }}
  >

faceDetectorSettings can be configured to provide us with different detections based on need. There are many landmarks we could obtain, such as the position of the mouth, nose and eyes. By analyzing these kinds of points together, we can detect the desired expressions and gestures.

Detection Criteria

Let's talk about the onFacesDetected callback. This is where all the data processing is going to happen. In order to avoid issues with bad data, we will need to create some rules to make sure that the user is holding the device properly:

1. There is only a single face in the detection results.
2. The face is fully contained within the camera preview.
3. The face is not as big as the camera preview (user is too close to the camera).

It would also be good to verify that the user is looking straight at the device as the fourth step, but I'll leave that for you to try.
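One possible way to approach that fourth check is to require the yaw and roll angles to stay near zero. This is only a sketch: isLookingStraight is a hypothetical helper, the 10 degree tolerance is a guess that would need trial and error, and it assumes the angles are reported in degrees centered on zero.

```typescript
// Only the fields this check needs from a face detection result.
interface FaceAngles {
  rollAngle: number
  yawAngle: number
}

// Assumed tolerance in degrees, not a tuned value.
const MAX_STRAIGHT_ANGLE = 10

// Returns true when the head is neither turned nor tilted
// beyond the tolerance.
function isLookingStraight(face: FaceAngles): boolean {
  return (
    Math.abs(face.yawAngle) <= MAX_STRAIGHT_ANGLE &&
    Math.abs(face.rollAngle) <= MAX_STRAIGHT_ANGLE
  )
}
```

A check like this would only make sense before the gestures start, since turning the head is itself one of the actions the user has to perform.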

In the callback, we're going to receive results of different faces. This is the type signature for each face detection:

interface FaceDetection {
  rollAngle: number
  yawAngle: number
  smilingProbability: number
  leftEyeOpenProbability: number
  rightEyeOpenProbability: number
  bounds: {
    origin: {
      x: number
      y: number
    }
    size: {
      width: number
      height: number
    }
  }
}

To check condition #2, we need to create a function to determine if one rectangle is within another. We can do that by checking if all corners of the inside rectangle (face) are within the outside one (preview cutout).

Diagram showing rectangular outlines of views

export interface Rect {
  minX: number
  minY: number
  width: number
  height: number
}

interface Contains {
  outside: Rect
  inside: Rect
}

/**
 * @returns `true` if `outside` rectangle contains the `inside` rectangle.
 * */
export function contains({ outside, inside }: Contains) {
  const outsideMaxX = outside.minX + outside.width
  const insideMaxX = inside.minX + inside.width

  const outsideMaxY = outside.minY + outside.height
  const insideMaxY = inside.minY + inside.height

  if (inside.minX < outside.minX) {
    return false
  }
  if (insideMaxX > outsideMaxX) {
    return false
  }
  if (inside.minY < outside.minY) {
    return false
  }
  if (insideMaxY > outsideMaxY) {
    return false
  }

  return true
}

Fun Fact

The iOS and Android SDKs have similar utilities by default on rectangle structures. Though the iOS version works by calculating the union of rectangles.

The Android version is similar to the one we wrote. Since Android is open source, we can read the implementation here. iOS is closed source, but the documentation page hints at how it works:

Return Value
true if the rectangle specified by rect2 is contained in the rectangle passed in rect1; otherwise, false. The first rectangle contains the second if the union of the two rectangles is equal to the first rectangle.

I was curious, so I decided to learn how that works and found the rule to be:

The area of union is the sum of the areas of both rectangles, minus the area of intersection.

These diagrams should help:

The shaded region is the area of intersection. Now consider their union instead, where the shaded area is the area of union:

In the example on the left, the red rectangle is not fully on the inside. The area of union is larger than the black rectangle. On the right side, the area of union is equal to the area of the black rectangle.
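To put numbers on it, here is a small sketch that computes the area of union for two made-up cases. The unionArea and overlap helpers and the sample coordinates are assumptions for illustration only.

```typescript
interface Rect {
  minX: number
  minY: number
  width: number
  height: number
}

// Overlap length of two 1D segments, or 0 if they don't overlap.
function overlap(minA: number, maxA: number, minB: number, maxB: number) {
  return Math.max(0, Math.min(maxA, maxB) - Math.max(minA, minB))
}

// Area of union = area of a + area of b - area of intersection.
function unionArea(a: Rect, b: Rect): number {
  const xOverlap = overlap(a.minX, a.minX + a.width, b.minX, b.minX + b.width)
  const yOverlap = overlap(a.minY, a.minY + a.height, b.minY, b.minY + b.height)
  return a.width * a.height + b.width * b.height - xOverlap * yOverlap
}

const black: Rect = { minX: 0, minY: 0, width: 10, height: 10 }
const redPartlyOutside: Rect = { minX: 8, minY: 8, width: 5, height: 5 }
const redInside: Rect = { minX: 2, minY: 2, width: 5, height: 5 }

// 100 + 25 - 4 = 121, larger than the black area of 100.
console.log(unionArea(black, redPartlyOutside))
// 100 + 25 - 25 = 100, equal to the black area.
console.log(unionArea(black, redInside))
```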

We could then rewrite contains like so:

function contains({ outside, inside }: Contains) {
  const outsideMaxX = outside.minX + outside.width
  const insideMaxX = inside.minX + inside.width

  const outsideMaxY = outside.minY + outside.height
  const insideMaxY = inside.minY + inside.height

  const xIntersect = Math.max(
    0,
    Math.min(insideMaxX, outsideMaxX) - Math.max(inside.minX, outside.minX),
  )
  const yIntersect = Math.max(
    0,
    Math.min(insideMaxY, outsideMaxY) - Math.max(inside.minY, outside.minY),
  )
  const intersectArea = xIntersect * yIntersect

  const insideArea = inside.width * inside.height
  const outsideArea = outside.width * outside.height

  const unionArea = insideArea + outsideArea - intersectArea

  return unionArea === outsideArea
}

This is more fun, but the first version is simpler and less code so we'll stick with that!

It's time to implement onFacesDetected. It will handle the detection criteria for now:

// Add new imports
import { Camera, FaceDetectionResult } from "expo-camera"
import { contains, Rect } from "./contains"

...

const onFacesDetected = (result: FaceDetectionResult) => {
  // 1. There is only a single face in the detection results.
  if (result.faces.length !== 1) {
    return
  }

  const face = result.faces[0]

  const faceRect: Rect = {
    minX: face.bounds.origin.x,
    minY: face.bounds.origin.y,
    width: face.bounds.size.width,
    height: face.bounds.size.height
  }

  // 2. The face is fully contained within the camera preview.
  const edgeOffset = 50
  const faceRectSmaller: Rect = {
    width: faceRect.width - edgeOffset,
    height: faceRect.height - edgeOffset,
    minY: faceRect.minY + edgeOffset / 2,
    minX: faceRect.minX + edgeOffset / 2
  }
  const previewContainsFace = contains({
    outside: PREVIEW_RECT,
    inside: faceRectSmaller
  })
  if (!previewContainsFace) {
    return
  }

  // 3. The face is not as big as the camera preview.
  const faceMaxSize = PREVIEW_SIZE - 90
  if (faceRect.width >= faceMaxSize && faceRect.height >= faceMaxSize) {
    return
  }

  // TODO: Process results at this point.
}

For checking whether the user's face is in the camera preview, you'll notice that we created an object faceRectSmaller. The reason is that the face detection rectangle we get is actually as big as the entire head:

Face detection rectangle - better face fit
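The shrinking step can also be pulled out into a small helper. This is a sketch with a hypothetical insetRect function. It trims the same total amount from each dimension and keeps the rectangle centered, as the snippet above does. The sample bounds are made up.

```typescript
interface Rect {
  minX: number
  minY: number
  width: number
  height: number
}

// Hypothetical helper: shrink `rect` by `amount` in total along each axis,
// keeping it centered on the same point.
function insetRect(rect: Rect, amount: number): Rect {
  return {
    minX: rect.minX + amount / 2,
    minY: rect.minY + amount / 2,
    width: rect.width - amount,
    height: rect.height - amount,
  }
}

// Example with made-up detection bounds.
const head: Rect = { minX: 40, minY: 60, width: 300, height: 300 }
const face = insetRect(head, 50)
// face is { minX: 65, minY: 85, width: 250, height: 250 }
```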

Modeling State

Before we work on the rest of onFacesDetected, we need to come up with the possible states. Let's note down what the app should do. Describing the process in detail will help us come up with the state model:

User opens the liveness detector screen. They should see a prompt of what to do here.
If there is more than a single face in the preview, we don't proceed.
If the user's face is not in the preview at all, we let them know.
If the user's face is in the preview but it's too close, we let them know.
We want to detect user actions. Those will be:
- Blinking both eyes.
- Turning head to the left.
- Turning head to the right.
- Nodding.
- Smiling.
If the processing conditions are met, we prompt the user to perform an action from the above list.
As the user completes actions, the circular progress fills.
If the user's face leaves the preview after processing starts, we reset the process.
If the user completes all the required actions in sequence, they pass.

Alright. Let's put the above into code. We can start by defining the prompt text to be shown before the processing starts:

const instructionsText = {
  initialPrompt: "Position your face in the circle",
  performActions: "Keep the device still and perform the following actions:",
  tooClose: "You're too close. Hold the device further away.",
}

We're also going to need a list of detections that must be performed. Each action has a threshold or a probability to compare against.

const detections = {
  BLINK: { instruction: "Blink both eyes", minProbability: 0.3 },
  TURN_HEAD_LEFT: { instruction: "Turn head left", maxAngle: -15 },
  TURN_HEAD_RIGHT: { instruction: "Turn head right", minAngle: 15 },
  NOD: { instruction: "Nod", minDiff: 1.5 },
  SMILE: { instruction: "Smile", minProbability: 0.7 },
}

The way to determine the thresholds is through trial and error. If a threshold is met for an action in the onFacesDetected callback, it means the user is performing that action.

We can track the current action to perform by declaring the actions as an array, which is easy to index and iterate through:

type DetectionActions = keyof typeof detections

const detectionsList: DetectionActions[] = [
  "BLINK",
  "TURN_HEAD_LEFT",
  "TURN_HEAD_RIGHT",
  "NOD",
  "SMILE",
]

The final state model would be:

const initialState = {
  faceDetected: "no" as "yes" | "no",
  faceTooBig: "no" as "yes" | "no",
  detectionsList,
  currentDetectionIndex: 0,
  progressFill: 0,
  processComplete: false,
}

Since we have several pieces of state changing together, it's best to use React.useReducer:

const [state, dispatch] = React.useReducer(detectionReducer, initialState)

interface Actions {
  FACE_DETECTED: "yes" | "no"
  FACE_TOO_BIG: "yes" | "no"
  NEXT_DETECTION: null
}

interface Action<T extends keyof Actions> {
  type: T
  payload: Actions[T]
}

type PossibleActions = {
  [K in keyof Actions]: Action<K>
}[keyof Actions]

const detectionReducer = (
  state: typeof initialState,
  action: PossibleActions,
): typeof initialState => {
  switch (action.type) {
    case "FACE_DETECTED":
      if (action.payload === "yes") {
        return {
          ...state,
          faceDetected: action.payload,
          progressFill: 100 / (state.detectionsList.length + 1),
        }
      } else {
        // Reset
        return initialState
      }
    case "FACE_TOO_BIG":
      return { ...state, faceTooBig: action.payload }
    case "NEXT_DETECTION":
      // Next detection index
      const nextDetectionIndex = state.currentDetectionIndex + 1

      // Skip 0 index
      const progressMultiplier = nextDetectionIndex + 1

      const newProgressFill =
        (100 / (state.detectionsList.length + 1)) * progressMultiplier

      if (nextDetectionIndex === state.detectionsList.length) {
        // Passed
        return {
          ...state,
          processComplete: true,
          progressFill: newProgressFill,
        }
      }

      // Next detection
      return {
        ...state,
        currentDetectionIndex: nextDetectionIndex,
        progressFill: newProgressFill,
      }
    default:
      throw new Error("Unexpected action type.")
  }
}

We calculate the progress fill based on the number of successfully completed detections. We also consider the user placing their face in the preview correctly a successful detection, increasing progress fill (good job 🌟).
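To see the numbers, here is the same formula as a standalone sketch. The progressFill function is a hypothetical extraction of the reducer's arithmetic, not code from the app.

```typescript
// Progress after `completed` gestures out of `total`, mirroring the reducer:
// placing the face in the preview counts as one extra completed step.
function progressFill(completed: number, total: number): number {
  return (100 / (total + 1)) * (completed + 1)
}

const total = 5 // BLINK, TURN_HEAD_LEFT, TURN_HEAD_RIGHT, NOD, SMILE

console.log(progressFill(0, total)) // face detected, no gestures yet: about 16.7
console.log(progressFill(3, total)) // three gestures done: about 66.7
console.log(progressFill(5, total)) // all gestures done: 100
```

With five gestures there are six steps in total, so each step fills one sixth of the ring.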

Once the user has gone through all detections in detectionsList, the process will complete.

Our views remain static. The next step is to make them respond to state changes:

<SafeAreaView style={StyleSheet.absoluteFill}>
  <MaskedView
    style={StyleSheet.absoluteFill}
    maskElement={<View style={styles.mask} />}
  >
    <Camera
      style={StyleSheet.absoluteFill}
      type={Camera.Constants.Type.front}
      onFacesDetected={onFacesDetected}
      faceDetectorSettings={{
        mode: FaceDetector.Constants.Mode.fast,
        detectLandmarks: FaceDetector.Constants.Landmarks.none,
        runClassifications: FaceDetector.Constants.Classifications.all,
        minDetectionInterval: 125,
        tracking: false,
      }}
    >
      <AnimatedCircularProgress
        style={styles.circularProgress}
        size={PREVIEW_SIZE}
        width={5}
        backgroundWidth={7}
        fill={state.progressFill}
        tintColor="#3485FF"
        backgroundColor="#e8e8e8"
      />
    </Camera>
  </MaskedView>
  <View style={styles.instructionsContainer}>
    <Text style={styles.instructions}>
      {state.faceDetected === "no" &&
        state.faceTooBig === "no" &&
        instructionsText.initialPrompt}

      {state.faceTooBig === "yes" && instructionsText.tooClose}

      {state.faceDetected === "yes" &&
        state.faceTooBig === "no" &&
        instructionsText.performActions}
    </Text>
    <Text style={styles.action}>
      {state.faceDetected === "yes" &&
        state.faceTooBig === "no" &&
        detections[state.detectionsList[state.currentDetectionIndex]]
          .instruction}
    </Text>
  </View>
</SafeAreaView>

Let's come back to onFacesDetected. We need to update it so it reflects state by dispatching:

const onFacesDetected = (result: FaceDetectionResult) => {
  // 1. There is only a single face in the detection results.
  if (result.faces.length !== 1) {
    dispatch({ type: "FACE_DETECTED", payload: "no" })
    return
  }

  const face = result.faces[0]

  const faceRect: Rect = {
    minX: face.bounds.origin.x,
    minY: face.bounds.origin.y,
    width: face.bounds.size.width,
    height: face.bounds.size.height,
  }

  // 2. The face is fully contained within the camera preview.
  const edgeOffset = 50
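  const faceRectSmaller: Rect = {
    width: faceRect.width - edgeOffset,
    height: faceRect.height - edgeOffset,
    // Shift the origin so the smaller rectangle stays centered on the face
    minY: faceRect.minY + edgeOffset / 2,
    minX: faceRect.minX + edgeOffset / 2,
  }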
  const previewContainsFace = contains({
    outside: PREVIEW_RECT,
    inside: faceRectSmaller,
  })
  if (!previewContainsFace) {
    dispatch({ type: "FACE_DETECTED", payload: "no" })
    return
  }

  if (state.faceDetected === "no") {
    // 3. The face is not as big as the camera preview.
    const faceMaxSize = PREVIEW_SIZE - 90
    if (faceRect.width >= faceMaxSize && faceRect.height >= faceMaxSize) {
      dispatch({ type: "FACE_TOO_BIG", payload: "yes" })
      return
    }

    if (state.faceTooBig === "yes") {
      dispatch({ type: "FACE_TOO_BIG", payload: "no" })
    }
  }

  if (state.faceDetected === "no") {
    dispatch({ type: "FACE_DETECTED", payload: "yes" })
  }

  // TODO: Next section
}

Once the user has their face in the preview they should now see a prompt to perform the first action!

Processing Gestures

With the state model done, we will now look into gesture processing. Since we know all the detection actions, a switch statement would be ideal. We will match the current action and check the corresponding thresholds. If the threshold is met, the user passes and moves on to the next detection.

// onFacesDetected continued.

const detectionAction = state.detectionsList[state.currentDetectionIndex]

switch (detectionAction) {
  case "BLINK":
    // Lower probability is when eyes are closed
    const leftEyeClosed =
      face.leftEyeOpenProbability <= detections.BLINK.minProbability
    const rightEyeClosed =
      face.rightEyeOpenProbability <= detections.BLINK.minProbability
    if (leftEyeClosed && rightEyeClosed) {
      dispatch({ type: "NEXT_DETECTION", payload: null })
    }
    return
  case "NOD":
    // TODO: We will implement this next.
    // Return here so this case doesn't fall through to TURN_HEAD_LEFT.
    return
  case "TURN_HEAD_LEFT":
    // Negative angle is when the face turns left
    if (face.yawAngle <= detections.TURN_HEAD_LEFT.maxAngle) {
      dispatch({ type: "NEXT_DETECTION", payload: null })
    }
    return
  case "TURN_HEAD_RIGHT":
    // Positive angle is when the face turns right
    if (face.yawAngle >= detections.TURN_HEAD_RIGHT.minAngle) {
      dispatch({ type: "NEXT_DETECTION", payload: null })
    }
    return
  case "SMILE":
    // Higher probability is when smiling
    if (face.smilingProbability >= detections.SMILE.minProbability) {
      dispatch({ type: "NEXT_DETECTION", payload: null })
    }
    return
}

One tricky part here is related to the nodding gesture. We need to consider that people don't normally hold their phones perfectly level with their heads like robots. To mitigate this, we will need to normalize roll angle values. We can track the last few values with React.useRef and use their average as the current baseline angle.

const rollAngles = React.useRef<number[]>([])

...

case "NOD":
  // Collect roll angle data in ref
  rollAngles.current.push(face.rollAngle)

  // Don't keep more than 10 roll angles (10 detection frames)
  if (rollAngles.current.length > 10) {
    rollAngles.current.shift()
  }

  // If not enough roll angle data, then don't process
  if (rollAngles.current.length < 10) return

  // Calculate avg from collected data, except current angle data
  const rollAnglesExceptCurrent = rollAngles.current.slice(0, -1)

  // Summation
  const rollAnglesSum = rollAnglesExceptCurrent.reduce((prev, curr) => {
    return prev + Math.abs(curr)
  }, 0)

  // Average
  const avgAngle = rollAnglesSum / rollAnglesExceptCurrent.length

  // If the difference between the current angle and the average is above threshold, pass.
  const diff = Math.abs(avgAngle - Math.abs(face.rollAngle))

  if (diff >= detections.NOD.minDiff) {
    dispatch({ type: "NEXT_DETECTION", payload: null })
  }
  return

...

Note that this will add a delay of minDetectionInterval * 10 (about 1.25 seconds with the 125 ms interval configured earlier) before nodding detection can work. An alternative implementation would be preferred in a real app.
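As one possible alternative, the baseline could be tracked with an exponential moving average, which needs no warm-up buffer. This is a sketch under assumptions, not the approach used in the final source: createNodDetector is a hypothetical helper and the smoothing factor of 0.2 is a guess. It keeps the idea of comparing the current roll angle against a recent baseline.

```typescript
// Hypothetical factory. `minDiff` plays the role of detections.NOD.minDiff.
function createNodDetector(minDiff: number, smoothing = 0.2) {
  let baseline: number | null = null

  // Feed one roll angle per detection frame. Returns true when the angle
  // deviates from the running baseline by at least `minDiff`.
  return function onRollAngle(rollAngle: number): boolean {
    const angle = Math.abs(rollAngle)

    if (baseline === null) {
      // First frame: nothing to compare against yet.
      baseline = angle
      return false
    }

    const passed = Math.abs(angle - baseline) >= minDiff

    // Move the baseline a fraction of the way toward the new reading.
    baseline = baseline + smoothing * (angle - baseline)

    return passed
  }
}

const detectNod = createNodDetector(1.5)
console.log(detectNod(2.0)) // false, the first frame only sets the baseline
console.log(detectNod(2.2)) // false, a difference of 0.2 is below the threshold
console.log(detectNod(4.5)) // true, well above the baseline of about 2.04
```

The trade-off is that the baseline rests on very little data in the first few frames, so a single noisy reading could trigger a pass. Requiring a small minimum number of frames may still be worthwhile.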

To track process completion, we can use an effect (😀). Since the circular progress animation has a default duration of 500ms, we need to consider the last animation before handling completion (e.g. navigating away to another screen).

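React.useEffect(() => {
  if (!state.processComplete) return

  const timeout = setTimeout(() => {
    // It's very important that the user feels fulfilled by
    // witnessing the progress fill up to 100%.
  }, 500)

  // Cancel the pending callback if the screen unmounts first.
  return () => clearTimeout(timeout)
}, [state.processComplete])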

You made it. The final source code is available here.

App Discussion

While there are many improvements that could be made, the main thing I would like to talk about is the robustness of the detector.

While this app is more difficult to spoof than a typical CAPTCHA, its processing pipeline is missing two important components:

Distinguishing 3D and 2D images (regular image vs. an image of another image).
Image manipulation detection. E.g. deepfakes.

Exploiting the first weakness:

With a high enough resolution screen the process can be spoofed. In the video above, I'm recording myself from a different device and playing the recording on a screen in real time.

Being able to complete the process through a screen is a big no-no. This is the case of detecting a face on an image of another image (the image on the recording device and the image on the screen). Since this succeeded, it wouldn't be hard to spoof with a deepfake as well.

The implementation we have here is but a single method of liveness detection: movement. There are detectors out there that do address these issues but in turn suffer from others. To distinguish 2D/3D and manipulated images, some detectors train a model on a dataset of authentic and spoofed images. Others analyze image features to infer properties such as depth.

Each approach to face-based liveness detection has its own pros and cons. For example, a model trained to detect spoofed images may require little user collaboration but the model is only as good as its training set. To reduce edge cases, combining techniques is the way to go.

Think of how it would be if we enhanced this detector with the missing features. They could work in the background without any extra input required from the user, although this would require novel work on our part instead of relying completely on a single library. Maybe for a future tutorial, I could explore that angle as I've never had a better excuse to try TensorFlow.js for React Native.

Conclusion

While the pandemic makes the benefits of going fully digital very apparent, onboarding new users online introduces the risk of digital identity fraud. To mitigate this risk, a digital identity verification procedure must be implemented. Liveness detection is typically part of that procedure. However, while text and image CAPTCHAs are typically used there, face-based liveness detection is a better alternative. Face-based liveness detection makes it harder for someone to use your photo when registering. In spite of that, measures must be taken to ensure that the approach implemented is adequately robust. This can be achieved by combining several liveness detectors into one, as a different approach may work where another fails.

I hope you found this post useful and informative. Have a great day.
