Literature Review & Research Gap Analysis

Academic Foundations and Motivation for PhishAID

PhishAID builds on prior academic research on phishing detection, spanning heuristic-based approaches, machine learning techniques, and user-behavior analysis. This section reviews key contributions from that literature and identifies the gaps that motivated PhishAID's design.

The analysis aims to justify the need for a transparent, explainable, and deployable phishing detection system that meets academic and national cybersecurity requirements.

Table 1: Research Gaps Identified from Literature

Ref No.	Research Work	Focus of the Study	Identified Gap	How PhishAID Addresses the Gap
[1]	Mohammad et al. (2014)	Heuristic-based URL rules	Limited to basic heuristics. No explainable output.	Provides rule-wise evidence and transparent scoring.
[2]	Ma et al. (2009)	URL feature analysis	No real-time score visibility or reporting.	Displays per-rule scores and category-wise verdict.
[3]	Khonji et al. (2013)	Survey of phishing attacks	Theoretical focus without implementation.	Turns survey insights into a working system.
[4]	Afroz & Greenstadt (2010)	Visual / DOM similarity	High computational cost.	Uses lightweight structural heuristics.
[5]	NCIIPC	National cyber awareness	No technical detection framework.	Supplies a technical detection implementation in an academic setting.
[6]	Gabrilovich & Gontmakher (2002)	Homograph attacks	Not integrated into phishing engines.	Integrated as Unicode detection (Rule 21).
[7]	Likhitha Lanke et al. (2025)	ML-based detection	Black-box and non-explainable.	Phase 1 uses explainable rule logic.
[8]	Learning to Detect Phishing URLs (2014)	ML lexical features	Hard to justify decisions.	Approximates the lexical-feature logic with fixed, inspectable heuristics.
[9]	Zhang et al. (2023)	URL + HTML features	Requires page rendering.	Phase 1 avoids HTML loading.
[10]	Alsharnouby et al. (2015)	User behavior study	No detection mechanism.	Implements warning-based verdicts.
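
Several rows of Table 1 refer to rule-wise evidence and per-rule scores. The sketch below illustrates one way such explainable output could be structured. It is only a minimal illustration: the rule labels, weights, length and subdomain limits, and the verdict threshold are hypothetical and are not taken from PhishAID's actual rule set.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit
import ipaddress


@dataclass
class RuleResult:
    rule: str        # hypothetical rule label, not a PhishAID rule number
    triggered: bool  # whether the heuristic fired for this URL
    score: int       # points contributed (0 when the rule did not fire)
    evidence: str    # human-readable reason shown alongside the score


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def evaluate(url: str) -> dict:
    """Apply a few illustrative URL heuristics and keep per-rule evidence."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # (label, fired?, weight, evidence) -- all weights and limits are assumptions
    checks = [
        ("ip-address-host", host_is_ip(host), 3, f"host '{host}' is a raw IP address"),
        ("at-symbol", "@" in parts.netloc, 2, "'@' appears in the authority part"),
        ("long-url", len(url) > 75, 1, f"URL length is {len(url)} characters"),
        ("many-subdomains", host.count(".") >= 4, 1, f"host contains {host.count('.')} dots"),
    ]
    results = [
        RuleResult(name, fired, weight if fired else 0, text if fired else "")
        for name, fired, weight, text in checks
    ]
    total = sum(r.score for r in results)
    verdict = "suspicious" if total >= 3 else "no rule-based concern"
    return {"url": url, "total": total, "verdict": verdict, "results": results}


if __name__ == "__main__":
    report = evaluate("http://192.168.0.1/login")
    print(report["verdict"], report["total"])
    for r in report["results"]:
        if r.triggered:
            print(f"  {r.rule}: +{r.score} ({r.evidence})")
```

Because each rule returns its own score and evidence string, the verdict can always be traced back to the specific heuristics that fired, which is the property the black-box approaches in Table 1 lack.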

Table 2: Rule-Based / Feature-Based Phishing Research Papers

Ref	Authors	Paper Title	Publication	Year
[1]	Mohammad et al.	Heuristic-based phishing detection	IEEE ICC	2014
[2]	Ma et al.	Phishing website detection using URL features	ACM CCS	2009
[3]	Khonji et al.	A survey on phishing attacks	Computers & Security	2013
[4]	Afroz & Greenstadt	PhishZoo	IEEE	2010
[5]	NCIIPC	National Cyber Infrastructure Protection	Govt. of India	—

Table 3: Research Contributions Mapped to PhishAID

Ref	Research Paper	Contribution	Related Rules / Category
[1]	Mohammad et al. (2014)	Validated heuristic URL rules	Rules 2, 3, 4 – Category A
[2]	Ma et al. (2009)	Lexical URL feature logic	Rules 3, 5 – Category A
[4]	Afroz & Greenstadt (2010)	Clone phishing inspiration	Rule 18 – Structural
[6]	Gabrilovich & Gontmakher (2002)	Unicode normalization	Rule 21 – Identity
[8]	Learning to Detect Phishing URLs (2014)	Semantic pattern ideas	Rule 30 – Semantic
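
To make the homograph entry in Table 3 more concrete, the sketch below shows one simple way a Unicode check on hostnames might work: it flags any host label whose letters come from more than one script (for example Latin mixed with Cyrillic). This is an assumption-laden illustration, not PhishAID's Rule 21. The function names are hypothetical, and the script of a character is approximated from the first word of its Unicode name because the Python standard library does not expose the script property directly.

```python
import unicodedata
from urllib.parse import urlsplit


def scripts_in(label: str) -> set:
    """Approximate the set of scripts used by the letters of one host label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # First word of the character name (LATIN, CYRILLIC, GREEK, ...)
            # serves as a rough stand-in for the Unicode script property.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def homograph_evidence(url: str) -> list:
    """Return human-readable findings for host labels that mix scripts."""
    host = urlsplit(url).hostname or ""
    findings = []
    for label in host.split("."):
        text = label
        if label.startswith("xn--"):
            # Decode the punycode form so the underlying characters are inspected.
            try:
                text = label[4:].encode("ascii").decode("punycode")
            except UnicodeError:
                findings.append(f"label '{label}' is not valid punycode")
                continue
        scripts = scripts_in(text)
        if len(scripts) > 1:
            findings.append(f"label '{text}' mixes scripts: {sorted(scripts)}")
    return findings


if __name__ == "__main__":
    # "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin 'a'.
    for finding in homograph_evidence("http://p\u0430ypal.com/login"):
        print(finding)
```

A mixed-script check of this kind does not catch hostnames written entirely in one look-alike script, so a fuller treatment would also need a confusable-character mapping.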

Summary

The reviewed literature points to a trade-off between accuracy and explainability in phishing detection systems. Machine learning approaches dominate recent research, but they often operate as black boxes and do not expose the reasoning behind a verdict.

PhishAID bridges this gap by implementing academically validated phishing indicators as an explainable, rule-based system suitable for education, evaluation, and future AI integration.