Towards improving accessibility for visually impaired: non-textual components in digital document and INR currency recognition

Singh, M.

dc.contributor.author	Singh, M.
dc.date.accessioned	2024-10-09T04:55:49Z
dc.date.available	2024-10-09T04:55:49Z
dc.date.issued	2022-11-01
dc.identifier.uri	http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4712
dc.description.abstract	There are an estimated 39 million blind persons globally, and around one-third are in India. The accessibility of education is a critical concern for Blind and Visually Impaired People (BVIP). Blind and visually impaired students rarely opt for STEM subjects in mid-level or higher education because of the lack of compatible educational material. The main reason is the inaccessibility of Chart Visualizations (CharVis) or Non-Textual Components (NTCs) such as charts/plots, tables, and diagrams, which are frequently used in STEM subjects. The inaccessibility of such visualization components exemplifies one of the widespread challenges of information access for BVIP. BVIP, especially in developing countries such as India, generally rely on braille, tactile materials, BVIP-related assistive technologies (ATs), and/or a helper for reading. However, the existing tools and technologies do little to address the accessibility concerns related to NTCs for BVIP. Detecting and identifying NTCs, extracting the underlying values and other data from them, and summarizing that data remain very challenging. Besides the NTC accessibility issue, currency identification has always been a troublesome task for BVIP, especially in developing countries such as India, and it has become more challenging with the new INR currency notes. BVIP primarily rely on size variations and patterns such as intaglio printing to recognize currency denominations. Most of the current Indian legal tender notes are similar in size; the engraved designs are not as distinctive as BVIP require, and they fade over time. Developing automated paper currency recognition systems is also challenging due to issues such as folded or partial views, uneven illumination, and background clutter. This thesis presents a detailed survey of existing systems for NTC accessibility and currency recognition tasks.
To improve accessibility, automated pipelines for NTC understanding and currency recognition are also presented. The first stage of the NTC understanding pipeline is NTC classification, for which a multi-dilated and context-aggregated dense network (MDCADNet) model is proposed. MDCADNet efficiently meets the NTC classification task's need for a larger receptive field and performs consistently better than the existing approaches on multiple benchmark datasets. Further, DeepDoT, an effective deep-learning-based framework for detecting tables in digital documents, is proposed. The quantitative and qualitative results demonstrate the superiority of the DeepDoT framework. Following the NTC classification and detection-related tasks, a methodology for summarized content extraction from line and bar charts is presented. Besides document accessibility, a robust framework for assisting BVIP in currency recognition is proposed. The framework includes a lightweight Indian paper currency recognition network (IPCRNet), useful in resource-constrained environments, a large and diverse INR paper currency dataset (IPCD), and a BVIP-compatible interface. The proposed method shows a notable performance gain over the existing currency recognition approaches.	en_US
dc.language.iso	en_US	en_US
dc.title	Towards improving accessibility for visually impaired: non-textual components in digital document and INR currency recognition	en_US
dc.type	Thesis	en_US