Adoption Guide: Investigating Suspicious Phishing Look-Alike and Copycat Domains - Part 1

October 22, 2025

Digital-Customer-Excellence
Staff

Summary

This Adoption Guide outlines procedures for investigating suspicious phishing, look-alike, and copycat domains once they are discovered using Domain Protection tools such as Digital Threat Monitoring (DTM) within Google Threat Intelligence. Its purpose is to help customers establish a process for analyzing and responding to these domains, and for requesting takedowns (or partnering with specialized takedown services) once tools such as DTM have identified them. The guide also highlights additional tools available within Google Threat Intelligence.

This adoption guide and the related analysis cover four phases:

Initial Triage and Information Gathering: Collecting initial data from alerts and user reports, and preliminary assessment of phishing indicators.
Deep Dive Domain and Infrastructure Analysis: Analyzing domain names, typosquatting techniques, WHOIS information, DNS records, and SSL/TLS certificates.
Website Content and Maliciousness Assessment: Safe content review using online tools, IP address reputation and hosting provider identification, and leveraging threat intelligence feeds.
Takedown Determination and Action: Consolidating evidence, assessing risk, collecting evidence for takedown, and the reporting and takedown process.

Introduction: Understanding the Phishing Threat Landscape

Phishing remains a pervasive and evolving threat in the digital landscape, with adversaries constantly refining their tactics to circumvent traditional security measures. A particularly insidious form of this threat involves the creation and deployment of look-alike or copycat domains. These domains are meticulously crafted to closely mimic legitimate brand websites, aiming to deceive users into divulging sensitive information, such as login credentials or financial details, or to facilitate the download of malicious software. The deceptive nature of these domains often relies on subtle alterations, including various forms of typosquatting substitutions such as go0qle.com, g00g1e.com, googlle.com, goorgle.com, or even the strategic use of subdomains, combo-squatting, or apex domain manipulation including domains like google-careers.com, googlesignin.com, googlepurchase.com, googlekr.com to enhance their perceived authenticity. The primary objective behind such sophisticated impersonations is to establish a false sense of familiarity and trust, thereby inducing victims to interact with the fraudulent online entity.

The sophistication of phishing lures has advanced significantly. While traditional indicators like grammatical errors, awkward phrasing, or obvious misspellings were once reliable red flags, the advent of AI has enabled threat actors to generate phishing content with nearly perfect grammar and spelling. This development diminishes the reliability of superficial visual and content checks, necessitating a fundamental shift in how security operations centers and intelligence analysts approach investigations. The focus must now pivot towards technical indicators, behavioral analysis, and infrastructure-based detection, as AI-generated phishing content will increasingly be indistinguishable from legitimate communications on graphic and linguistic quality alone. Relying solely on visual inspection can lead to critical oversights, as malicious actors are adept at creating convincing but deceptive appearances. Accurate detection and proactive defense against these evolving domain and brand protection threats therefore require a multilayered investigation approach that combines technical analysis, threat intelligence, and behavioral pattern recognition.

The importance of rapid investigation and response cannot be overstated, primarily due to the inherently short operational lifespan of malicious domains. This "flash in the pan" operational model is strategically employed by threat actors to maximize their success rate during the initial phase of domain registration, exploiting the brief window before detection and remediation efforts can fully materialize. Timely domain discovery is therefore critical to thwarting phishing and fraud campaigns before they can inflict substantial harm. This extremely short operational window directly impacts the efficacy of takedown efforts, rendering slow, manual incident response processes largely ineffective. A delay of even a few hours can mean the malicious domain is already abandoned by the attacker. This unfortunate reality mandates a strong emphasis on automation, proactive monitoring, and predictive intelligence within the investigation procedure, moving beyond purely reactive measures to anticipate and neutralize threats before they mature.

A "Domain Protection" alert in the context of look-alike phishing often refers to an alert from an external brand protection or domain monitoring service, such as Digital Threat Monitoring (DTM), included in Google Threat Intelligence. This service proactively scans and ingests domain and certificate registration information from the internet, looking for registrations that closely resemble an organization's brand, and so provides the earliest possible detection of spoof domains upon their creation and registration.

Phase One: Initial Triage and Information Gathering

A. Alert Reception and Initial Data Collection

Upon receiving an alert regarding a suspicious domain, the immediate priority is to gather all available initial data. The nature of this data will vary depending on the source of the alert.

From Domain Protection Alerts: In this context, we are primarily looking at investigating look-alike phishing domains in which a Digital Threat Monitoring (DTM) service has identified a newly registered domain impersonating the organization. Google Threat Intelligence’s DTM is crucial for early detection of spoof domains, providing a critical initial intelligence source for look-alike phishing investigations.

Alert from DTM indicating a similar and suspicious domain was found. How would you respond?

B. Safe URL Inspection Techniques

A critical safety measure dictates that suspicious URLs should never be visited directly in a standard web browser on a production machine. Direct interaction carries an immediate risk of malware infection, drive-by downloads, or system compromise. Instead, specialized online URL scanners and sandboxing services must be utilized for safe analysis. These tools execute the URL in an isolated environment, providing screenshots, phishing/scam classifications, and technical details (hosting provider, IP address, domain age) without exposing the analyst's workstation to risk. Google Threat Intelligence has two features to help with this. The first is “Check with VirusTotal”, which searches the publicly available repository of information about websites that have been seen and scanned by the VirusTotal engine.

Submitting a domain to ‘Check with VirusTotal’ within Google Threat Intelligence

The other option for Google Threat Intelligence subscribers is the “Private Scanning” feature, which performs a scan similar to “Check with VirusTotal” but does not share the results with the public. Private scanning is well suited to visiting websites safely in a sandbox and getting a report on the website's behavior.

Submitting a suspicious URL to Google Threat Intelligence Private Scanning Sandbox

The inherent risk of drive-by downloads, malware execution, or exploit kit delivery means that direct interaction with a suspicious domain poses an immediate threat to the investigator’s machine and, by extension, your network. This is why incorporating sandboxing and remote analysis tools is not an optional enhancement but a foundational requirement for a secure and effective phishing investigation procedure.
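For teams that want to script the reputation lookup rather than use the console, the sketch below shows one possible shape using the public VirusTotal API v3 domain report endpoint. It only reads an existing report and never contacts the suspicious site itself. The API key, the sample report values, and the helper names are illustrative assumptions, and whether a given Google Threat Intelligence subscription uses the same endpoint and quota should be confirmed before relying on it.

```python
import json
import urllib.request

VT_DOMAIN_URL = "https://www.virustotal.com/api/v3/domains/{domain}"


def build_vt_request(domain, api_key):
    """Build (but do not send) a VirusTotal v3 domain report request."""
    return urllib.request.Request(
        VT_DOMAIN_URL.format(domain=domain),
        headers={"x-apikey": api_key, "accept": "application/json"},
    )


def fetch_report(domain, api_key):
    """Send the request and decode the JSON body (requires network and a valid key)."""
    with urllib.request.urlopen(build_vt_request(domain, api_key), timeout=30) as resp:
        return json.load(resp)


def summarize_verdicts(report):
    """Count engines that flagged the domain in the last analysis."""
    stats = report["data"]["attributes"]["last_analysis_stats"]
    flagged = stats.get("malicious", 0) + stats.get("suspicious", 0)
    return {
        "flagged": flagged,
        "total": sum(stats.values()),
        "needs_review": flagged > 0,
    }


# Hypothetical report fragment, trimmed to the fields used above.
SAMPLE_REPORT = {
    "data": {
        "attributes": {
            "last_analysis_stats": {
                "malicious": 5,
                "suspicious": 2,
                "harmless": 60,
                "undetected": 23,
                "timeout": 0,
            }
        }
    }
}

if __name__ == "__main__":
    print(summarize_verdicts(SAMPLE_REPORT))
```

Treating any single malicious or suspicious verdict as "needs review" is a deliberately cautious choice for triage; a production threshold would be tuned to the organization's tolerance for false positives.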

C. Preliminary Visual Assessment of Phishing Indicators

Following initial data collection, a preliminary assessment of the suspicious domain and associated communication is conducted to identify common phishing indicators. This step helps in rapidly prioritizing and categorizing the threat.

The table below summarizes common phishing indicators, providing a rapid reference guide for analysts during the initial triage phase. While we initially cautioned against relying solely on visual indicators, they are where manual analysis and investigation start.

Common Phishing Indicators (Visual & Content Based)

Indicator Category	Specific Clue	Description/Example	Reliability
Content/Tone	Urgent/Threatening Language	"Act now," "Account will be closed"	High
	Too Good to Be True Offer	"Free gift," "Win a trip"	High
	Grammar/Spelling Errors	Poor English, typos (less reliable with AI)	Medium/Low
Links/Call to Action	Mismatched Link URL	Visible text differs from hover over URL	High
Links/Call to Action	Vague Call to Action	Generic buttons such as 'Click here'	Medium
Website Visuals	Low Quality Graphics	Pixelated logos, blurry images	Medium
	Awkward Design/Layout	Hard to navigate, missing sections	Medium
	Missing Contact Info	No "About Us" page, generic contact form	High
Security/Payment	Request for Personal Info	Asking for passwords/PII	High
	Bypass Protocols	Asking to circumvent company security procedures	High
	Non Reversible Payments	Only bank transfer, gift cards, crypto	High
SSL/TLS	HTTPS Present	Padlock symbol (use with caution, a majority of phishing sites now use SSL)	Low
SSL/TLS	Suspicious Certificate Details	Generic issuer, short validity, mismatch with brand	Medium

Phase Two: Deep Dive Domain and Infrastructure Analysis

Once preliminary indicators suggest a phishing attempt, a deeper technical analysis of the domain and its underlying infrastructure is essential. This phase aims to uncover definitive evidence of malicious intent and map the adversary's operational footprint.

A. Domain Name Analysis

Identifying Typosquatting and Deceptive Domain Names: A thorough examination of the domain name is paramount. This involves scrutinizing it for subtle misspellings of legitimate brand names (goorgle.com, googlle.com), the addition of extra words (googlesupport.com), or the use of different Top Level Domains (TLDs) (.net instead of .com). These seemingly minor alterations are designed to trick users who may not scrutinize URLs closely. Particular attention should be paid to the strategic and manipulative use of subdomains. Phishing sites frequently place a trusted brand name within a subdomain (google.fake-site.com) to create an illusion of legitimacy, while the actual malicious domain is the primary domain (fake-site.com) controlled by the threat actors.

Reviewing Top Level Domains (TLDs) and Subdomains: It is important to recognize that certain TLDs have a higher statistical association with malicious activities. Threat actors strategically select TLDs and leverage subdomains to enhance the credibility of their phishing lures and evade detection by basic URL filters. Certain TLDs also have lower registration fees than others, which can be one reason a threat actor chooses a .xyz domain instead of one ending in .io or .ai. A comprehensive investigation must therefore extend beyond merely checking the apex (main) domain. Analysts need to scrutinize the entire domain structure, including the TLD and any subdomains, and potentially cross-reference them against known lists of malicious TLDs or common phishing patterns. Organizations are strongly encouraged to maintain an updated inventory of all legitimate subdomains and continuously monitor for unauthorized or newly created subdomains, as this is crucial to detect potential subdomain takeovers or malicious usage.
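The subdomain trick and the inventory check described above can be sketched in a few lines. This is a minimal illustration, not a complete parser: the multi-label suffix set is a tiny hand-picked subset (a real implementation would consult the full Public Suffix List), and the function names and sample hostnames are hypothetical.

```python
# Illustrative subset only; real code should use the full Public Suffix List.
MULTI_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.au", "co.jp"}


def split_hostname(hostname):
    """Return (subdomain_part, registrable_domain) for a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 1
    keep = suffix_len + 1  # public suffix plus the one label registered under it
    return ".".join(labels[:-keep]), ".".join(labels[-keep:])


def brand_in_subdomain(hostname, brand, legit_domains):
    """True when the brand appears only in the subdomain of a non-owned domain."""
    subdomain, registrable = split_hostname(hostname)
    if registrable in legit_domains:
        return False
    return brand.lower() in subdomain


def unexpected_subdomains(observed, inventory):
    """Hostnames seen in monitoring that are missing from the approved inventory."""
    return sorted(set(observed) - set(inventory))


if __name__ == "__main__":
    print(split_hostname("google.fake-site.com"))
    print(brand_in_subdomain("google.fake-site.com", "google", {"google.com"}))
```

The key point is that the registrable domain (fake-site.com), not the leftmost label, determines who controls the host.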

B. Understanding Typosquatting Techniques

Threat actors employ various typosquatting techniques to create deceptive, look-alike domains mimicking legitimate ones. Familiarity with these methods empowers analysts to better recognize, analyze and triage Domain Protection alerts from DTM.

Observed Typosquatting Techniques & Examples

Typosquatting Technique	Definition	Example (Base: example.com unless specified)
Omission	A character is removed from the domain.	example.com becomes exmple.com (missing 'a')
Addition	An additional character is inserted into the domain.	example.com becomes exxample.com (extra 'x')
Substitution	A character is replaced by another, often visually similar or adjacent on a keyboard.	example.com becomes exampl3.com (3 for e) or exqmple.com (q for a on QWERTY keyboard)
Transposition (Character Swap)	Two adjacent characters in the domain name are swapped.	example.com becomes exmaple.com (m and a swapped)
Hyphenation	A hyphen is added or removed within the domain name.	example.com becomes ex-ample.com or exam-ple.com
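Homoglyph	Characters are substituted with visually similar characters, potentially from different character sets.	google.com becomes goog1e.com (digit '1' for lowercase 'l') or gooqle.com ('q' for the visually similar 'g'); internationalized variants may swap in characters from other scripts, such as Cyrillic 'о' for Latin 'o'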
Missing Dot	A dot separating parts of the domain name is removed (in subdomains).	www.example.com becomes wwwexample.com; sub.example.com becomes subexample.com
Missing Dashes/Strip Dashes	All or some hyphens within the domain are removed.	my-brand.com becomes mybrand.com
Character Omission (general)	Each character in the domain is iteratively omitted.	example.com produces xample.com, eample.com, exmple.com, etc.
Adjacent Character Insertion	A character adjacent to an existing one on a keyboard layout (QWERTY) is inserted.	example.com might become erxample.com (r is adjacent to e on QWERTY) or exqample.com (q is adjacent to a).
Singular/Pluralise	Adding or removing an 's' to make the domain singular or plural.	product.com becomes products.com; brands.com becomes brand.com
Character Repeat	A character within the domain is duplicated.	example.com becomes exammple.com (double 'm')
Bitsquatting	A single bit error is simulated in the ASCII representation of a character, resulting in a different character.	example.com might become exampme.com (flipping the lowest bit of 'l', 0x6C, gives 'm', 0x6D)
Wrong Top Level Domain (TLD)	The domain's Top Level Domain is replaced with another common TLD.	example.com becomes example.org or example.net
Wrong Second Level Domain	For multipart TLDs (ccTLDs), the second level domain is changed.	example.co.uk becomes example.org.uk
Wrong Third Level Domain/Subdomain	A dot is inserted into the domain name to create a subdomain, or an existing subdomain is altered.	example.com becomes ex.ample.com or www.example.com becomes ww.example.com
Ordinal Number Swap	Numbers in the domain are converted to their word equivalent, or vice versa.	top10.com becomes topten.com; firstchoice.com becomes 1stchoice.com
Combosquatting (Keywords)	Common keywords related to security, support, login, etc., are appended to the brand name.	paypal.com becomes paypal-security.com or paypal-login.com
Addition (general)	Any character is added to the domain name.	example.com could become example-login.com, exampleweb.com, etc.
Add Dash	A hyphen is inserted at various positions within the domain name.	example.com becomes e-xample.com, ex-ample.com, exampl-e.com
ChangeDotDash	A dot in the domain name (often in subdomains) is replaced with a hyphen.	sub.example.com becomes sub-example.com
Replacement (keyboard layout)	Each letter is replaced with letters to the immediate left and right on the keyboard (QWERTY).	On a QWERTY keyboard, example.com might produce exsmple.com ('s' is immediately right of 'a'), wxample.com ('w' is immediately left of 'e'), etc.
Add TLD	An additional Top Level Domain is added alongside the legitimate TLD.	example.com becomes example.com.it or example.com.ru
Common Misspellings	Words in the domain are replaced with their common misspellings.	calendar.com becomes calender.com
Homophones	Words in the domain are replaced with words that sound phonetically similar but have different spellings.	write.com becomes rite.com or right.com
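To make a few of the techniques in the table concrete, the following sketch generates candidate look-alikes for omission, transposition, character repeat, hyphenation, and bitsquatting. It is an illustrative generator written for this guide, not the algorithm used by DTM or any particular tool, and it handles only a single-label name plus TLD.

```python
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def omission(name):
    """Drop each character in turn."""
    return {name[:i] + name[i + 1:] for i in range(len(name))}


def transposition(name):
    """Swap each pair of adjacent, non-identical characters."""
    return {
        name[:i] + name[i + 1] + name[i] + name[i + 2:]
        for i in range(len(name) - 1)
        if name[i] != name[i + 1]
    }


def repetition(name):
    """Duplicate each character in turn."""
    return {name[:i] + name[i] + name[i:] for i in range(len(name))}


def hyphenation(name):
    """Insert a hyphen at every interior position."""
    return {name[:i] + "-" + name[i:] for i in range(1, len(name))}


def bitsquatting(name):
    """Flip one bit of one character, keeping only valid hostname characters."""
    out = set()
    for i, ch in enumerate(name):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in ALLOWED:
                out.add(name[:i] + flipped + name[i + 1:])
    return out


def generate_permutations(domain):
    """Map technique name to a sorted list of candidate domains."""
    name, _, tld = domain.partition(".")
    techniques = {
        "omission": omission,
        "transposition": transposition,
        "repetition": repetition,
        "hyphenation": hyphenation,
        "bitsquatting": bitsquatting,
    }
    return {
        label: sorted(f"{variant}.{tld}" for variant in func(name))
        for label, func in techniques.items()
    }


if __name__ == "__main__":
    for label, variants in generate_permutations("example.com").items():
        print(label, variants[:5])
```

A list like this can be compared against newly registered domains or certificate names during triage to see which technique an alert most likely corresponds to.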

C. WHOIS & RDAP Information Review

WHOIS records are publicly available databases containing critical information about registered domain names and their owners. Accessing both current and historical WHOIS data is a fundamental step in investigating suspicious domains. The internet is undergoing a significant shift in how domain name registration data is accessed, moving away from the long-standing WHOIS protocol to the more modern and secure Registration Data Access Protocol (RDAP). This transition addresses the inherent limitations of WHOIS and offers a more robust and feature-rich solution, for example by returning structured JSON instead of unstructured plain text. The full details of how WHOIS compares to RDAP are out of scope for this guide, but please note that both provide similar information about who registered a domain.

Accessing Current and Historical WHOIS Records: Google Threat Intelligence can help you with retrieving comprehensive & current WHOIS data. This functionality can provide not only the current registration details but also a historical timeline of changes in ownership, contact information, and registration specifics over time. Historical records are particularly useful for tracking the evolution of a domain's registration and identifying patterns of malicious reuse.

Viewing a domain’s RDAP and WHOIS information in Google Threat Intelligence

Suspicious Indicators in WHOIS: Several indicators within WHOIS records can signal malicious intent:

Newly Registered Domain (NRD): A domain that has been very recently registered is a strong indicator of potential maliciousness. Many phishing attacks are launched in the early days of a domain's existence to maximize impact before detection and takedown efforts can be initiated.
Unusual Registrar: While not a definitive indicator on its own, some registrars are statistically associated with a higher volume of malicious domain registrations. This can be a contributing factor when combined with other suspicious elements. Deviation from a standardized registrar for your organization’s domains can be a red flag and a suspicious indicator to follow up on.
Mismatched WHOIS Details: Significant discrepancies between the WHOIS records of the suspicious domain and those of the legitimate entity it is impersonating are clear red flags. This includes differences in registrant names, organizations, contact information, or geographical locations.
Privacy Protection Services: Domain privacy services replace the domain owner's contact information with substitute details from a privacy partner. While a legitimate service for privacy conscious users, the presence of domain privacy on a suspicious look-alike domain, particularly one impersonating a major brand, should raise a high degree of suspicion. This is a common tactic employed by threat actors to obscure their true identities and evade accountability. This dual nature of domain privacy means that its presence on a suspicious domain necessitates a pivot to alternative investigative avenues, such as IP analysis, passive DNS, or certificate transparency, to circumvent the obfuscation provided by privacy services.

Key Information from WHOIS Records and Suspicious Indicators

WHOIS Field	Legitimate Expectation	Suspicious Indicator
Registrant Name	Matches legitimate organization/individual	Generic name, individual for corporate brand, or mismatch
Registrant Organization	Matches legitimate corporate entity	Generic organization, or mismatch
Registrant Email	Professional email associated with legitimate entity	Public domain email (Gmail or ProtonMail), or mismatch
Creation Date	Established, older date, consistent with brand history	Very recent (Newly Registered Domain)
Expiration Date	Multi-year registration or a history of renewals	Minimum registration period only (typically 1 year) on a newly created domain
Registrar	Reputable, well known registrar	Unusual or less common registrar (potentially associated with abuse)
WHOIS Privacy Status	Optional for individuals, less common for large corporations	Privacy enabled for a look-alike corporate brand domain

D. DNS Records and Associated Infrastructure (Passive DNS)

Analyzing Domain Name System (DNS) records provides fundamental information about a domain's network configuration and can reveal crucial links to malicious infrastructure.

Retrieving Current and Historical DNS Records: DNS records contain essential data about a domain's associated IP addresses and mail servers. Below is an outline of common DNS records, how they are used, and what relevance they may have within an investigation.

DNS Record Types and How to Use as Part of Suspicious Domain Investigation

DNS Record Type	Description / Use	Importance for Investigations
A (IPv4 Address)	Maps a domain name (like google.com) to an IP address. The 'A' record is for IPv4 addresses. Example: 142.250.191.78	The IP address tells you where the suspicious website is physically hosted. This points you to the hosting provider.
AAAA (IPv6 Address)	The 'AAAA' record is for the newer IPv6 addresses. Example: 2001:db8:85a3::8a2e:370:7334	Similar to IPv4 addresses. You can perform a reverse IP lookup on the discovered IP address. This can reveal other domains hosted on the same server, often uncovering a larger network of suspicious sites operated by the same attacker.
MX (Mail Exchange)	The MX record specifies the mail servers responsible for accepting emails on behalf of a domain. It tells the internet where to send an email addressed to @example.com.	It reveals what service the phishers are using to receive emails. A phishing domain with a valid MX record may indicate the attacker intends to receive replies from victims.
NS (Name Server)	The NS record delegates a domain to a set of authoritative name servers. These are the servers that hold all the official DNS records for that specific domain.	This record tells you who is managing the domain's DNS settings (Cloudflare, GoDaddy, Namecheap, etc). This is a crucial piece of information for reporting the domain for takedown.
CNAME (Canonical Name)	A CNAME record acts as an alias, mapping one domain name to another. For example, www.test.com could be a CNAME that points to test.com.	Attackers use CNAMEs to obfuscate the final destination. A seemingly benign subdomain like login.customer-portal.com might be a CNAME pointing to a known malicious domain like malicious-hosting-provider.net. Following the CNAME trail reveals the true source of the threat.
TXT (Text)	A general-purpose record that holds free-form text. It's commonly used for domain verification and, most importantly, for email authentication standards.	Legitimate domains have TXT records for SPF, DKIM, and DMARC to prevent email spoofing.
SOA (Start of Authority)	This record contains important administrative information about the DNS zone, including the primary name server, the email of the domain administrator, and timers for how often the zone is updated.	The administrator's email, while often fake on phishing domains, can sometimes provide a lead. The "serial number" in the SOA record is often formatted with the date (YYYYMMDD##). This can give you a clue as to when the phishing domain's DNS records were last changed, helping to build a timeline of the attack.
PTR (Pointer Record)	A PTR record does the reverse of an A record. It maps an IP address back to its designated domain name (a process called reverse DNS).	A PTR record can help confirm if an IP address belongs to a legitimate service or is part of a dynamically assigned block of addresses often abused by attackers.
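Three of the record types above lend themselves to small helper functions once the records have been retrieved: decoding a date-style SOA serial, labelling TXT records by email authentication type, and building the reverse DNS name used for a PTR lookup. The sketch assumes the record values are already in hand (no DNS queries are made), and the function names are hypothetical.

```python
import ipaddress
from datetime import datetime


def soa_serial_date(serial):
    """Interpret a YYYYMMDD## SOA serial as a date, or return None if it does not fit."""
    text = str(serial)
    if len(text) != 10:
        return None
    try:
        return datetime.strptime(text[:8], "%Y%m%d").date()
    except ValueError:
        return None


def classify_txt(name, value):
    """Label a TXT record as SPF, DMARC, DKIM, or other."""
    lowered_name = name.lower()
    lowered_value = value.strip().lower()
    if lowered_value.startswith("v=spf1"):
        return "SPF"
    if lowered_name.startswith("_dmarc.") and lowered_value.startswith("v=dmarc1"):
        return "DMARC"
    if "._domainkey." in lowered_name:
        return "DKIM"
    return "other"


def ptr_name(ip):
    """Return the reverse DNS name that would hold the PTR record for an IP."""
    return ipaddress.ip_address(ip).reverse_pointer


if __name__ == "__main__":
    print(soa_serial_date(2025102201))
    print(classify_txt("_dmarc.example.com", "v=DMARC1; p=reject"))
    print(ptr_name("142.250.191.78"))
```

Many DNS providers use serials that are not date-based, so a `None` result (or a date that looks implausible) simply means this particular clue is unavailable for that zone.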

Passive DNS: Passive DNS (pDNS) data is particularly valuable as it provides historical resolutions, enabling investigators to map connections between domain names, IP addresses, and nameservers over extended periods. The relations tab is where you go to find passive DNS information for domains in Google Threat Intelligence.

Viewing Passive DNS (pDNS) records for a domain within Google Threat Intelligence

Identifying Associated IP Addresses, Nameservers, and Mail Servers: Once IP addresses associated with the suspicious domain are identified, it is critical to perform IP lookups to determine their geolocation, Autonomous System Number (ASN), and Internet Service Provider (ISP). This information helps in understanding the hosting environment and potential jurisdiction. The IP's reputation should also be checked to determine whether it appears on any blocklists. The Mail Exchange (MX) records and nameservers of the suspicious domain should also be compared with those of the legitimate target domain; discrepancies are a strong indicator of malicious intent.

Mapping Adversary Infrastructure and Shared Hosting: Passive DNS is a powerful tool for identifying shared infrastructure. By performing reverse IP lookups, analysts can uncover other domains hosted on the same IP address or utilizing the same nameserver, which can reveal broader threat actor networks and campaigns. Passive DNS functions as a network cartography tool, enabling investigators to uncover entire malicious infrastructures, identify shared hosting environments, and track the evolution and migration of an adversary's online presence. This capability is crucial for moving beyond isolated single domain takedowns to disrupting broader phishing campaigns and proactively anticipating future attacks by identifying common patterns in infrastructure setup and reuse.
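The pivot described above amounts to building a reverse index from IP address to domains. The sketch below does this over a list of passive DNS rows; the row layout and the sample data are hypothetical and would need to be adapted to whatever export format your pDNS source provides.

```python
from collections import defaultdict

# Hypothetical passive DNS rows (documentation-range IPs, .example domains).
PDNS_ROWS = [
    {"domain": "go0qle-login.example", "ip": "203.0.113.10"},
    {"domain": "googlesignin.example", "ip": "203.0.113.10"},
    {"domain": "g00g1e-support.example", "ip": "198.51.100.7"},
    {"domain": "go0qle-login.example", "ip": "198.51.100.7"},
    {"domain": "unrelated.example", "ip": "192.0.2.55"},
]


def domains_by_ip(rows):
    """Build a reverse index: IP address -> set of domains seen resolving to it."""
    index = defaultdict(set)
    for row in rows:
        index[row["ip"]].add(row["domain"])
    return index


def co_hosted(rows, suspect):
    """Return other domains that shared at least one IP with the suspect domain."""
    index = domains_by_ip(rows)
    related = set()
    for row in rows:
        if row["domain"] == suspect:
            related |= index[row["ip"]]
    related.discard(suspect)
    return sorted(related)


if __name__ == "__main__":
    print(co_hosted(PDNS_ROWS, "go0qle-login.example"))
```

Results from shared hosting or CDN addresses can include many unrelated sites, so co-hosted domains are leads to review rather than confirmed parts of the same campaign.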

Analyzing Domain Lifespan and Activity Patterns: Passive DNS data can provide timestamps indicating when an IP address was first observed in association with a domain. This offers valuable insights into the operational lifespan of malicious domains, often confirming the very short lifespan for domains observed in many phishing campaigns.
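As a small worked example of the lifespan calculation, the sketch below takes first-seen and last-seen dates from passive DNS resolutions and reports the observed active window in days. The field names (including `last_seen`, which not every source provides) and the sample values are assumptions for illustration.

```python
from datetime import date

# Hypothetical resolutions for one suspicious domain across two IPs.
RESOLUTIONS = [
    {"domain": "go0qle-login.example", "ip": "203.0.113.10",
     "first_seen": "2025-10-20", "last_seen": "2025-10-22"},
    {"domain": "go0qle-login.example", "ip": "198.51.100.7",
     "first_seen": "2025-10-21", "last_seen": "2025-10-23"},
]


def observed_lifespan_days(resolutions, domain):
    """Days between the earliest and latest observation of the domain, or None."""
    dates = [
        date.fromisoformat(row[field])
        for row in resolutions
        if row["domain"] == domain
        for field in ("first_seen", "last_seen")
    ]
    if not dates:
        return None
    return (max(dates) - min(dates)).days


if __name__ == "__main__":
    print(observed_lifespan_days(RESOLUTIONS, "go0qle-login.example"))
```

The figure reflects only what sensors observed, so the true lifespan may be longer than the computed window.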

E. SSL/TLS Certificate Analysis

The ‘Relations’ tab within Google Threat Intelligence (as previously shown) also includes information regarding the certificates for the suspicious domains. This tab also shows subdomains, URLs associated with the domain, files downloaded from that domain, and files communicating with the domain that have been reported to Google Threat Intelligence.

SSL/TLS certificates, while designed for security, can also provide critical clues in phishing investigations, particularly through Certificate Transparency (CT) logs. CT is an internet security standard that publicly records all digital certificates issued by publicly trusted Certificate Authorities (CAs). This system provides a mechanism for monitoring and auditing certificate issuance. Tools like crt.sh (a Certificate Transparency Log search engine) allow analysts to search for all certificates issued for a given domain. This enables the identification of any certificates that do not align with legitimate requests from the organization, which should be flagged for further investigation. CT logs provide a powerful proactive mechanism for brand protection that extends beyond traditional reactive domain monitoring. By continuously scanning these public logs for unauthorized certificate issuances for legitimate or look-alike domains, organizations can detect phishing infrastructure before it becomes fully operational or widely distributed.

Looking for Company Name Reuse and Typosquatted Variations in Certificates: When examining certificate logs, investigators should specifically look for unexpected certificate issuances, the reuse of company names, or common typosquatted variations within the certificate's Subject Alternative Names (SANs). Additionally, malicious certificates may sometimes exhibit unusually short validity periods (for example, only three months), which can serve as another red flag. As an example, the free certificate service Let's Encrypt issues certificates that last for only 90 days. While legitimate organizations may use Let's Encrypt, it is a Certificate Authority that is often abused by threat actors because it is free.
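The certificate review described above can be sketched as a filter over Certificate Transparency search results. The entry layout below is modelled on the JSON that CT search tools such as crt.sh return (newline-separated names in `name_value`, plus `not_before` and `not_after`), but the exact field names, the sample entries, and the 90-day threshold are assumptions to verify against your own data source.

```python
from datetime import datetime

# Hypothetical CT log entries, trimmed to the fields used below.
CT_ENTRIES = [
    {
        "issuer_name": "C=US, O=Let's Encrypt, CN=R11",
        "name_value": "googlesignin.example\nwww.googlesignin.example",
        "not_before": "2025-10-20T00:00:00",
        "not_after": "2026-01-18T00:00:00",
    },
    {
        "issuer_name": "C=US, O=Example Trust, CN=Example CA",
        "name_value": "accounts.google.com",
        "not_before": "2025-01-01T00:00:00",
        "not_after": "2026-01-01T00:00:00",
    },
]


def _is_legitimate(name, legit_domains):
    """True when the name is an owned domain or a subdomain of one."""
    return any(name == d or name.endswith("." + d) for d in legit_domains)


def review_certificate(entry, brand, legit_domains, short_days=90):
    """Flag brand-bearing names outside owned domains and short validity periods."""
    names = []
    for raw in entry["name_value"].lower().split("\n"):
        # Drop a leading wildcard label so "*.example.com" compares as "example.com".
        names.append(raw[2:] if raw.startswith("*.") else raw)
    lookalikes = [n for n in names if brand in n and not _is_legitimate(n, legit_domains)]
    validity = (
        datetime.fromisoformat(entry["not_after"])
        - datetime.fromisoformat(entry["not_before"])
    ).days
    return {
        "lookalike_names": lookalikes,
        "validity_days": validity,
        "short_lived": validity <= short_days,
    }


if __name__ == "__main__":
    for entry in CT_ENTRIES:
        print(review_certificate(entry, "google", {"google.com"}))
```

Plain keyword matching like this catches combosquatting names that contain the brand string; typosquatted variants such as go0qle would need to be matched against a generated permutation list instead.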

Click here for part 2 of this adoption guide.