Real-Time Document Image Retrieval with LLAH

What's this?

This page explains a new method of real-time document image retrieval, which takes images captured by a web camera as input and retrieves the corresponding pages from a large-scale document image database (DB). The core of the method is LLAH (Locally Likely Arrangement Hashing), an algorithm invented in our research group.
A short video introducing the system is available here.

What is the task?

The method views a document image as a collection of feature points. The task of retrieval is therefore to find the page with a similar arrangement of feature points. Take a look at the images below. The query image is converted to a set of feature points, which are then matched against the feature points of pages in the DB.
Although this task is not easy for a human, a machine can accomplish it easily with the help of LLAH. Can you find the correct answer?

Figure: a query image and its feature points
Figure: the query and pages in the DB

The method with LLAH tries to find, for each point extracted from the query image, the corresponding point on a page. A query image contains about 400 points, and a page in the DB contains about 600. For a DB of 10,000 pages, the number of point comparisons is therefore 2,400,000,000 (= 400 × 600 × 10,000), far too many for brute-force matching to run in real time. Note also that

These make matching much harder.
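
The count above can be reproduced with a few lines of Python. The sketch below also shows, as a toy contrast, how a hash-table index lets a query vote for pages without comparing every pair of points. Please note the assumptions: the function names, the page format and the descriptor keys are all hypothetical, the keys are treated as opaque hashable values, and the code does not reproduce how LLAH actually computes its hash keys.

```python
from collections import Counter, defaultdict

# Figures quoted in the text above.
POINTS_PER_QUERY = 400
POINTS_PER_PAGE = 600
PAGES_IN_DB = 10_000


def brute_force_comparisons(query_points, page_points, pages):
    """Comparisons needed if every query point is tested against
    every point of every page in the DB."""
    return query_points * page_points * pages


def build_index(pages):
    """Map each descriptor key to the set of pages that contain it.

    `pages` is {page_id: iterable of hashable keys}; this format is an
    assumption made for illustration only.
    """
    index = defaultdict(set)
    for page_id, keys in pages.items():
        for key in keys:
            index[key].add(page_id)
    return index


def retrieve(index, query_keys):
    """Let each query key vote for the pages sharing it.

    Returns the page with the most votes, or None if nothing matched.
    """
    votes = Counter()
    for key in query_keys:
        for page_id in index.get(key, ()):
            votes[page_id] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


total = brute_force_comparisons(POINTS_PER_QUERY, POINTS_PER_PAGE, PAGES_IN_DB)
print(f"{total:,}")  # prints 2,400,000,000
```

With such an index, the work per query grows with the number of query keys and the pages that share them, rather than with the product of all point counts.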

What has been achieved?

The method is characterized by:

Examples of query images correctly recognized by the method are listed below. The original images used for the retrieval can be obtained by clicking these images.
See the videos as well.

Figures: perspective distortion, partial capture, occlusion, non-linear deformation

The system allows us not only to find the corresponding page in the DB, but also, for example, to display information on the retrieved page, as shown in the augmented reality video. The following figure illustrates this functionality. With the system, pages can be regarded as media for displaying various kinds of information, such as diagrams, text, still images and movies (like a newspaper in the Harry Potter movies). You can also establish a link from a real page to the Internet.



Can I try it?

Yes! If you have a web camera (a 1.3-megapixel camera is preferable) and a Windows computer (either a dual-CPU machine or two computers are preferable), you can use the system available from the following link.

Software download

Functions provided by this software are:

The current distribution does not support the augmented reality function.
The resolution of query and DB images is limited to 1.3 megapixels or less. Images with a higher resolution are reduced automatically by the software.
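
As an illustration of what this limit means for an image of a given size, here is a minimal sketch that computes reduced dimensions within the pixel budget while keeping the aspect ratio. It is an assumption-based example: "1.3 M pixels" is read as 1,300,000 pixels, the function name is hypothetical, and the way the software itself reduces images is not described on this page.

```python
import math

# Assumption: "1.3 M pixels" is read as 1.3 million pixels.
MAX_PIXELS = 1_300_000


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS):
    """Return (new_width, new_height) with roughly the same aspect ratio
    and at most max_pixels pixels.

    Images already within the budget are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height
    # Scaling both sides by sqrt(budget / area) scales the area to the budget.
    scale = math.sqrt(max_pixels / (width * height))
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height


print(fit_to_pixel_budget(1280, 960))   # within the budget, unchanged
print(fit_to_pixel_budget(1600, 1200))  # reduced to fit the budget
```

Rounding down on both sides keeps the result under the budget at the cost of a slightly smaller image.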
If you are interested in a version of the software without this limitation, please send us an email at the following address.

Please note that this software is provided ONLY for research purposes. You CANNOT incorporate it into commercial products.
(patent pending; PCT/JP2006/302669, WO2006/092957)
Source code by Tomohiro Nakai (added on May 23, 2022)
Source code by Kazutaka Takeda (added on May 23, 2022)

For further info.

Who invented this technology?

Tomohiro Nakai
Ph.D. Candidate
Intelligent Media Processing Lab.



Prof. Koichi Kise
Dr. Masakazu Iwamura

Copyright (c) 2006,2007, Intelligent Media Processing Lab., Osaka Prefecture University. All rights reserved.