Get Person Names from Passports with Python

Today, I’ve got something pretty cool to share with you. Ever wondered how you can extract names from passports using a bit of code magic? I wrote a small Python function that reads a person’s name from a passport image.

Code: [open on GitHub]

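from PIL import Image
import pytesseract
import re

# First MRZ line (lowercased below): "p", a type character, a 3-letter
# issuing-state code, then surname<<given<names padded with "<".
# The state code is skipped so it does not end up glued to the surname.
NAME_PATTERN = re.compile(r"p[a-z<][a-z<]{3}([a-z]+(?:<[a-z]+)*)<<([a-z]+(?:<[a-z]+)*)")

def extract_name_from_id_card(image_path):
    """Return (given_names, surname) from a passport image, or None on failure."""
    try:
        with Image.open(image_path) as image:
            ocr_text = pytesseract.image_to_string(image)
    except (OSError, pytesseract.TesseractError):
        # Unreadable image, missing Tesseract binary, or an OCR failure.
        return None
    # Drop the spaces OCR tends to insert inside the MRZ; line breaks are kept.
    ocr_text = ocr_text.lower().replace(" ", "")
    matches = NAME_PATTERN.findall(ocr_text)
    if len(matches) == 1:
        surname, given_names = matches[0]
        # "<" separates name parts in the MRZ, so turn it back into a space.
        return given_names.replace("<", " "), surname.replace("<", " ")
    # Fallback guess: the first two alphabetic chunks longer than three characters.
    words = ocr_text.split()
    possible_names = [word for word in words if len(word) > 3 and word.isalpha()]
    return possible_names[:2]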

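This script is like a mini superhero for reading names off passports. It leans on two Python sidekicks: PIL (the Python Imaging Library, installed these days through the Pillow package) and Pytesseract (a wrapper around the Tesseract OCR engine, which has to be installed separately). Think of PIL as your image guru, opening the file, and Pytesseract as the text-whisperer, turning the pixels into a string. A regular expression then looks for the machine-readable line at the bottom of the passport, where the name is written as SURNAME<<GIVEN<NAMES, and the function hands back the given names first, then the surname. If that line can't be found, it falls back to guessing from the first couple of longer alphabetic words in the OCR text.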
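If you want to play with the name-parsing step without running OCR at all, here is a minimal sketch that works on a plain string. The helper name `parse_mrz_name_line` is made up for this example and is not part of the script above. It assumes the usual passport layout for the first machine-readable line: two characters for the document type, three for the issuing state, then the name field. The sample line is the well-known "Eriksson" specimen.

```python
import re

# Hypothetical helper for illustration only; not part of the original script.
# Assumed layout of the first MRZ line of a passport:
#   chars 0-1: document type ("P" plus a filler or subtype character)
#   chars 2-4: three-letter issuing state code
#   chars 5+ : SURNAME<<GIVEN<NAMES, padded with "<"
def parse_mrz_name_line(line):
    """Return (given_names, surname) from an MRZ name line, or None."""
    # Normalise case and drop stray spaces that OCR may have inserted.
    line = re.sub(r"\s+", "", line).upper()
    if len(line) < 6 or not line.startswith("P"):
        return None
    name_field = line[5:]
    # "<<" separates the surname from the given names.
    surname, separator, given_names = name_field.partition("<<")
    if not separator or not surname:
        return None
    # A single "<" stands for a space inside a name; trailing "<" is padding.
    surname = surname.replace("<", " ").strip()
    given_names = given_names.replace("<", " ").strip()
    return given_names, surname


sample = "P<UTOERIKSSON<<ANNA<MARIA" + "<" * 19
print(parse_mrz_name_line(sample))  # ('ANNA MARIA', 'ERIKSSON')
```

Slicing by position instead of using one big regular expression keeps the issuing state code out of the surname, which is an easy thing to get wrong. Real OCR output is noisier than this sample, so treat the result as a best guess.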

