PhishAID builds on prior academic research on phishing detection, spanning heuristic-based approaches, machine learning techniques, and user-behavior studies. This section reviews key contributions from that literature and identifies the gaps that motivated PhishAID's design.
The objective of this analysis is to justify the need for a transparent, explainable, and deployable phishing detection system that suits academic evaluation and is consistent with national cybersecurity awareness efforts such as those of NCIIPC [5].
| Ref No. | Research Work | Focus of the Study | Identified Gap | How PhishAID Addresses the Gap |
|---|---|---|---|---|
| [1] | Mohammad et al. (2014) | Heuristic-based URL rules | Limited to basic heuristics. No explainable output. | Provides rule-wise evidence and transparent scoring. |
| [2] | Ma et al. (2009) | URL feature analysis | No real-time score visibility or reporting. | Displays per-rule scores and category-wise verdict. |
| [3] | Khonji et al. (2013) | Survey of phishing attacks | Theoretical focus without implementation. | Implements survey insights into a working system. |
| [4] | Afroz & Greenstadt (2010) | Visual / DOM similarity | High computational cost. | Uses lightweight structural heuristics. |
| [5] | NCIIPC | National cyber awareness | No technical detection framework. | Provides an academic technical implementation. |
| [6] | Gabrilovich & Gontmakher (2002) | Homograph attacks | Not integrated into phishing engines. | Integrated as Unicode detection (Rule 21). |
| [7] | Likhitha Lanke et al. (2025) | ML-based detection | Black-box and non-explainable. | Phase 1 uses explainable rule logic. |
| [8] | Learning to Detect Phishing URLs (2014) | ML lexical features | Hard to justify decisions. | Approximates the learned lexical patterns with fixed, inspectable heuristics. |
| [9] | Zhang et al. (2023) | URL + HTML features | Requires page rendering. | Phase 1 avoids HTML loading. |
| [10] | Alsharnouby et al. (2015) | User behavior study | No detection mechanism. | Implements warning-based verdicts. |
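
To illustrate what "rule-wise evidence and transparent scoring" could look like in practice, here is a minimal Python sketch. The rule names, weights, the 75-character length limit, and the threshold of 40 are all hypothetical and are not taken from PhishAID's actual rule set. The only point is that each rule that fires returns its own score and a human-readable reason, so the final verdict can be traced back to individual checks.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional
from urllib.parse import urlsplit


@dataclass
class Finding:
    rule_id: str   # hypothetical identifier, not a PhishAID rule number
    score: int     # hypothetical weight
    evidence: str  # human-readable reason shown to the user


def rule_ip_host(url: str) -> Optional[Finding]:
    """Flag URLs whose host is a raw IPv4 address."""
    host = urlsplit(url).hostname or ""
    if re.fullmatch(r"\d{1,3}(?:\.\d{1,3}){3}", host):
        return Finding("ip-host", 30, f"host is a raw IPv4 address: {host}")
    return None


def rule_at_symbol(url: str) -> Optional[Finding]:
    """Flag an '@' in the authority part, which can hide the real host."""
    if "@" in urlsplit(url).netloc:
        return Finding("at-symbol", 25, "'@' appears before the real host name")
    return None


def rule_long_url(url: str) -> Optional[Finding]:
    """Flag unusually long URLs (the limit of 75 is an assumption)."""
    if len(url) > 75:
        return Finding("long-url", 10, f"URL length {len(url)} exceeds 75")
    return None


RULES: List[Callable[[str], Optional[Finding]]] = [
    rule_ip_host,
    rule_at_symbol,
    rule_long_url,
]


def evaluate(url: str, threshold: int = 40) -> dict:
    """Run every rule and return the total score, verdict, and per-rule evidence."""
    findings = [f for rule in RULES if (f := rule(url)) is not None]
    total = sum(f.score for f in findings)
    verdict = "suspicious" if total >= threshold else "no strong indicators"
    return {"url": url, "score": total, "verdict": verdict, "findings": findings}


if __name__ == "__main__":
    report = evaluate("http://paypal.com@192.168.0.1/login")
    print(report["verdict"], report["score"])
    for f in report["findings"]:
        print(f"  [{f.rule_id}] +{f.score}: {f.evidence}")
```

Because every contribution to the score is a separate `Finding`, a report can list exactly which rules fired and why. A black-box classifier does not offer this directly.
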

Publication details for references [1] to [5]:

| Ref | Authors | Paper Title | Publication | Year |
|---|---|---|---|---|
| [1] | Mohammad et al. | Heuristic-based phishing detection | IEEE ICC | 2014 |
| [2] | Ma et al. | Phishing website detection using URL features | ACM CCS | 2009 |
| [3] | Khonji et al. | A survey on phishing attacks | Computers & Security | 2013 |
| [4] | Afroz & Greenstadt | PhishZoo | IEEE | 2010 |
| [5] | NCIIPC | National Cyber Infrastructure Protection | Govt. of India | — |

The following table maps the reviewed works to the PhishAID rules and categories they informed:

| Ref | Research Paper | Contribution | Related Rules / Category |
|---|---|---|---|
| [1] | Mohammad et al. (2014) | Validated heuristic URL rules | Rules 2, 3, 4 – Category A |
| [2] | Ma et al. (2009) | Lexical URL feature logic | Rules 3, 5 – Category A |
| [4] | Afroz & Greenstadt (2010) | Clone phishing inspiration | Rule 18 – Structural |
| [6] | Gabrilovich & Gontmakher (2002) | Unicode normalization | Rule 21 – Identity |
| [8] | Learning to Detect Phishing URLs (2014) | Semantic pattern ideas | Rule 30 – Semantic |
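
The homograph idea behind the Unicode check attributed to [6] can be sketched as follows. This is not PhishAID's Rule 21 implementation, which the section does not describe. It is a simplified stand-in that flags host labels mixing letters from more than one script, using the first word of each character's Unicode name as a rough script indicator. The function names are hypothetical, and whole-script look-alikes (a label written entirely in Cyrillic, for example) are deliberately not handled.

```python
import unicodedata
from typing import List, Set
from urllib.parse import urlsplit


def decode_label(label: str) -> str:
    """Decode a punycode ('xn--') label to Unicode; return it unchanged on failure."""
    if label.lower().startswith("xn--"):
        try:
            return label[4:].encode("ascii").decode("punycode")
        except (UnicodeError, ValueError):
            return label
    return label


def scripts_in(label: str) -> Set[str]:
    """Approximate the scripts used in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


def check_homograph(url: str) -> List[str]:
    """Return one evidence string per host label that mixes scripts."""
    host = urlsplit(url).hostname or ""
    evidence = []
    for raw in host.split("."):
        label = decode_label(raw)
        scripts = scripts_in(label)
        if len(scripts) > 1:
            evidence.append(
                f"label {label!r} mixes scripts: {', '.join(sorted(scripts))}"
            )
    return evidence


if __name__ == "__main__":
    # The second character below is CYRILLIC SMALL LETTER A (U+0430), not Latin 'a'.
    for line in check_homograph("http://p\u0430ypal.com/login"):
        print(line)
```

A production check would more likely rely on the confusables data and restriction levels defined in Unicode Technical Standard #39 than on character names, but the sketch shows how such a rule can return evidence rather than a bare yes/no.
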
The reviewed literature points to a trade-off between accuracy and explainability in phishing detection systems. Machine learning approaches dominate recent research [7], [8], but their decisions are often hard to justify to users and evaluators.
PhishAID addresses this gap by implementing phishing indicators drawn from the literature as an explainable, rule-based system suitable for education, evaluation, and future AI integration.