Literature Review & Research Gap Analysis

Academic Foundations and Motivation for PhishAID

PhishAID is grounded in extensive academic research on phishing detection, spanning heuristic-based approaches, machine learning techniques, and user-behavior analysis. This section reviews key contributions from prior literature and identifies critical gaps that motivated the design of PhishAID.

The analysis aims to justify the need for a transparent, explainable, and deployable phishing detection system that meets academic and national cybersecurity requirements.

Table 1: Research Gaps Identified from Literature

Ref No. | Research Work | Focus of the Study | Identified Gap | How PhishAID Addresses the Gap
[1] | Mohammad et al. (2014) | Heuristic-based URL rules | Limited to basic heuristics. No explainable output. | Provides rule-wise evidence and transparent scoring.
[2] | Ma et al. (2009) | URL feature analysis | No real-time score visibility or reporting. | Displays per-rule scores and category-wise verdict.
[3] | Khonji et al. (2013) | Survey of phishing attacks | Theoretical focus without implementation. | Implements survey insights into a working system.
[4] | Afroz & Greenstadt | Visual / DOM similarity | High computational cost. | Uses lightweight structural heuristics.
[5] | NCIIPC | National cyber awareness | No technical detection framework. | Provides an academic technical implementation.
[6] | Gabrilovich & Gontmakher (2002) | Homograph attacks | Not integrated into phishing engines. | Integrated as Unicode detection (Rule 21).
[7] | Likhitha Lanke et al. (2025) | ML-based detection | Black-box and non-explainable. | Phase 1 uses explainable rule logic.
[8] | Learning to Detect Phishing URLs (2014) | ML lexical features | Hard to justify decisions. | Simulates logic using fixed heuristics.
[9] | Zhang et al. (2023) | URL + HTML features | Requires page rendering. | Phase 1 avoids HTML loading.
[10] | Alsharnouby et al. (2015) | User behavior study | No detection mechanism. | Implements warning-based verdicts.
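
Table 1 refers to rule-wise evidence, per-rule scores, and an overall verdict. The following Python sketch illustrates what such transparent scoring could look like. It is not PhishAID's implementation: the rule names, weights, and thresholds are hypothetical values chosen for illustration, and the three checks (raw IP address as host, '@' in the authority part, long URL) are simply common URL heuristics of the kind discussed in the rule-based literature.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit
import ipaddress


@dataclass
class Finding:
    rule: str      # hypothetical rule name, not a PhishAID rule number
    score: int     # illustrative weight, assumed for this sketch
    evidence: str  # human-readable reason that can be shown to the user


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def evaluate(url: str) -> list[Finding]:
    """Run each rule on the URL string only (no page is fetched)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    findings = []
    if host_is_ip(host):
        findings.append(Finding("ip_host", 30, f"host {host!r} is a raw IP address"))
    if "@" in parts.netloc:
        findings.append(Finding("at_symbol", 25, "'@' in the authority part can hide the real host"))
    if len(url) > 75:  # assumed threshold
        findings.append(Finding("long_url", 10, f"URL length {len(url)} exceeds 75"))
    return findings


def verdict(findings: list[Finding]) -> tuple[int, str]:
    """Sum the per-rule scores and map the total to a label (assumed cut-offs)."""
    total = sum(f.score for f in findings)
    if total >= 50:
        label = "phishing"
    elif total >= 20:
        label = "suspicious"
    else:
        label = "safe"
    return total, label


if __name__ == "__main__":
    results = evaluate("http://paypal.com@10.0.0.5/verify")
    for f in results:
        print(f"{f.rule:10s} +{f.score:2d}  {f.evidence}")
    print(verdict(results))
```

Because every rule returns its own evidence string, the final score can always be traced back to the individual checks that produced it, which is the property the ML-based works in Table 1 are noted as lacking.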

Table 2: Rule-Based / Feature-Based Phishing Research Papers

Ref | Authors | Paper Title | Publication | Year
[1] | Mohammad et al. | Heuristic-based phishing detection | IEEE ICC | 2014
[2] | Ma et al. | Phishing website detection using URL features | ACM CCS | 2009
[3] | Khonji et al. | A survey on phishing attacks | Computers & Security | 2013
[4] | Afroz & Greenstadt | PhishZoo | IEEE | 2010
[5] | NCIIPC | National Cyber Infrastructure Protection | Govt. of India |

Table 3: Research Contributions Mapped to PhishAID

Ref | Research Paper | Contribution | Related Rules / Category
[1] | Mohammad et al. (2014) | Validated heuristic URL rules | Rules 2, 3, 4 – Category A
[2] | Ma et al. (2009) | Lexical URL feature logic | Rules 3, 5 – Category A
[4] | Afroz & Greenstadt | Clone phishing inspiration | Rule 18 – Structural
[6] | Gabrilovich & Gontmakher | Unicode normalization | Rule 21 – Identity
[8] | Learning to Detect Phishing URLs | Semantic pattern ideas | Rule 30 – Semantic
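
The homograph attack described in [6] relies on characters from different scripts that look alike, such as Cyrillic "а" and Latin "a". The sketch below shows one simple way a Unicode check could flag this, by reporting hostname labels that mix scripts. It is an assumption-laden illustration, not the logic of Rule 21: the function names are hypothetical, and the script of a character is approximated by the first word of its Unicode name, because the Python standard library does not expose the Unicode Script property directly.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Approximate the scripts used by the letters of a label.

    Heuristic: the first word of the Unicode character name
    (e.g. 'LATIN', 'CYRILLIC', 'GREEK') stands in for the script.
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def decode_label(label: str) -> str:
    """Decode a punycode ('xn--') label to Unicode; return it unchanged on failure."""
    if label.lower().startswith("xn--"):
        try:
            return label[4:].encode("ascii").decode("punycode")
        except UnicodeError:
            return label
    return label


def homograph_findings(hostname: str) -> list[str]:
    """Return one evidence string per hostname label that mixes scripts."""
    findings = []
    for raw in hostname.split("."):
        label = decode_label(raw)
        scripts = scripts_in(label)
        if len(scripts) > 1:
            findings.append(f"label {label!r} mixes scripts: {sorted(scripts)}")
    return findings


if __name__ == "__main__":
    # First letter is Cyrillic U+0430, the rest is Latin.
    print(homograph_findings("\u0430pple.com"))
    print(homograph_findings("example.com"))
```

A mixed-script check of this kind does not catch a label written entirely in one non-Latin script that imitates a Latin brand name; handling that case would need a confusables table, which is outside this sketch.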

Summary

The reviewed literature indicates a trade-off between accuracy and explainability in phishing detection systems. Machine learning approaches dominate recent research, but they often behave as black boxes and do not expose the reasoning behind a verdict (see [7] and [8] in Table 1).

PhishAID bridges this gap by implementing academically validated phishing indicators as an explainable, rule-based system suitable for education, evaluation, and future AI integration.