Having thoroughly analyzed the initial attack vectors and the layered infrastructure of this massive Booking.com phishing campaign in Part 1, we now transition from discovery to deeper investigation. In Part 2, we will consolidate the intelligence gathered from both infrastructure tiers, further unveil the threat actors' operational tactics, and demonstrate how this actionable intelligence can be directly applied to fortify your defenses and proactively hunt for similar threats in your own environment.
Analyzing the whole campaign
The different queries we've made can help us create YARA rules to monitor for new activity from this campaign; in fact, there is a dedicated YARA rules section at the end of this blog.
But to better understand the campaign as a whole, it's also useful to gather other related information: when this activity started, how many security vendors in Google Threat Intelligence flag the URLs as malicious, which keywords are worth monitoring, and so on. We will go step by step to get this information.
Obtaining Tier 1 and Tier 2 URLs
First, we need to find all possible URLs related to this campaign. The following search helps us do that.
entity:url fs:2022-01-01+
(title:"One moment" or title:"AD not found (captcha2)" or title:"Booking.com | Official")
and (meta:"Booking -" or
meta:"https://ltdfoto.ru/images/2025/06/04/photo_2025-06-02_11-23-22.md.jpg" or
meta:"https://cf.bstatic.com/xdata/images/hotel/")
and not (hostname:"booking.com" or redirects_to:booking.com or redirects_to:placetobe.homes)
and not tag:trackers
In this search, the and not tag:trackers clause is important: it filters out false positives from legitimate ad campaigns related to Booking. Another important filter excludes URLs that redirect to the legitimate Booking website. To avoid outdated results, we also incorporated a time filter (fs:2022-01-01+).
# Before running the code, make sure you have enough quota to do so. This query can consume a lot of quota.
import getpass
import json
import vt
cli = vt.Client(getpass.getpass('Enter your VirusTotal API key: '))
query = "entity:url fs:2022-01-01+ (title:\"One moment\" or title:\"AD not found (captcha2)\" or title:\"Booking.com | Official\") and (meta:\"Booking -\" or meta:\"https://ltdfoto.ru/images/2025/06/04/photo_2025-06-02_11-23-22.md.jpg\" or meta:\"https://cf.bstatic.com/xdata/images/hotel/\") and not (hostname:\"booking.com\" or redirects_to:booking.com or redirects_to:placetobe.homes) and not tag:trackers" # @param {type: "string"}
# Look for the samples
query_results = []
async for itemobj in cli.iterator('/intelligence/search', params={'query': query}, limit=0): # Set the limit you want
    query_results.append(itemobj.to_dict())
all_results = list(json.loads(json.dumps(query_results, default=lambda o: getattr(o, '__dict__', str(o)))))
The previous code snippet stores all the URLs in the all_results variable. Let's now get the details we want.
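As a quick sanity check before the analysis steps below, you can confirm how many URLs were collected; a minimal sketch (the exact count will depend on when you run the query):

# Quick sanity check on the collected data
print(f"Collected {len(all_results)} URLs related to the campaign")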
Timeline
Understanding how big the campaign is also helps us estimate roughly when the actors might have started. We can't know the exact date, but the first-submission times of the URLs returned by the initial query of this section give us a very good idea. There were periods when many URLs were uploaded at once, which suggests the actors were especially busy then. The next code snippet produces a daily submission timeline.
from datetime import datetime
import pandas as pd
import plotly.express as px
# Data processing for daily submissions
daily_submissions = []
for item in all_results:
    attributes = item.get('attributes', {})
    timestamp = attributes.get('first_submission_date')
    if timestamp:
        dt_object = datetime.fromtimestamp(timestamp)
        # Extract only the date
        daily_submissions.append(dt_object.date())
# Count submissions per day
daily_counts = pd.Series(daily_submissions).value_counts().reset_index()
daily_counts.columns = ['date', 'submission_count']
# Sort by date
daily_counts = daily_counts.sort_values(by='date').reset_index(drop=True)
# Data visualization of daily submissions
fig = px.area(daily_counts, x='date', y='submission_count',
              labels={'date': 'Date', 'submission_count': 'Number of Submissions'},
              title='Daily Submission Timeline')
# Update layout for better date formatting on x-axis
fig.update_layout(xaxis_title="Date", yaxis_title="Number of Submissions")
# Display the plot
fig.show()
Figure 8: Daily submission timeline suggests heavy activity since January 2025
Looking at all the URLs from both tiers, the patterns we found have been around for a few years, but in January 2025 larger spikes of activity began. Specifically, May and June were the busiest months, as the monthly view in Figure 9 shows.
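The monthly view can be rebuilt from the same data; a minimal sketch, assuming the daily_submissions list from the previous snippet (the 2025 filter is our assumption to match the figure):

# Aggregate the same submission dates by month, keeping only 2025 to match Figure 9
monthly = pd.Series(daily_submissions)
monthly = monthly[monthly.map(lambda d: d.year == 2025)]
monthly_counts = monthly.map(lambda d: d.strftime('%Y-%m')).value_counts().sort_index().reset_index()
monthly_counts.columns = ['month', 'submission_count']
fig = px.bar(monthly_counts, x='month', y='submission_count',
             title='Monthly Submissions During 2025')
fig.show()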
Figure 9: Monthly submissions during 2025
Redirections
We also wanted to know how many URLs redirected to a different URL. Our earlier results already suggested that most URLs redirect (since there was more Tier 1 infrastructure than Tier 2), but we wanted an exact percentage to understand it better.
import plotly.express as px
import pandas as pd
# Count URLs with and without redirection
redirected_urls = 0
not_redirected_urls = 0
# Iterate through each item in the results
for item in all_results:
    attributes = item.get('attributes', {})
    original_url = attributes.get('url')
    last_final_url = attributes.get('last_final_url')
    # Check if both original_url and last_final_url exist and are different
    if original_url and last_final_url and original_url != last_final_url:
        redirected_urls += 1
    else:
        not_redirected_urls += 1
# Calculate the total number of URLs
total_urls = redirected_urls + not_redirected_urls
# Calculate percentages, handling the case where total_urls is 0 to avoid division by zero
percentage_redirected = (redirected_urls / total_urls) * 100 if total_urls > 0 else 0
percentage_not_redirected = (not_redirected_urls / total_urls) * 100 if total_urls > 0 else 0
# Print the results
print(f"URLs with redirection: {redirected_urls} ({percentage_redirected:.2f}%)")
print(f"URLs without redirection: {not_redirected_urls} ({percentage_not_redirected:.2f}%)")
# Create a DataFrame for the pie chart
data = {'Category': ['Redirected URLs', 'Not Redirected URLs'],
        'Count': [redirected_urls, not_redirected_urls]}
df_redirection_status = pd.DataFrame(data)
# Create the pie chart
fig = px.pie(df_redirection_status, values='Count', names='Category',
             title='Percentage of URLs with and without Redirection')
# Update traces to show text inside the pie chart
fig.update_traces(textinfo='percent+label', insidetextorientation='radial')
# Display the plot
fig.show()
Figure 10: Percentage of URLs with and without Redirection
52.6% of the URLs redirect, which is interesting: the remaining 47.4% appear to be the phishing pages themselves, where victims most likely have to enter their credit card details.
When we look at which domains receive the most redirects, it's clear that domains matching the initial pattern booking.confirmation-id[5_numbers].com are seen the most. However, the Top 10 also contains other domains that don't match this pattern. These are also useful for possible YARA rules to help us watch this activity.
from urllib.parse import urlparse
last_final_urls = []
for item in all_results:
    attributes = item.get('attributes', {})
    original_url = attributes.get('url')
    last_final_url = attributes.get('last_final_url')
    # Only consider last_final_url if it exists and is different from the original_url
    if last_final_url and last_final_url != original_url:
        last_final_urls.append(last_final_url)
# Extract the domain from each last_final_url
domains = []
for url in last_final_urls:
    try:
        parsed_url = urlparse(url)
        domains.append(parsed_url.netloc)
    except Exception as e:
        print(f"Error parsing URL {url}: {e}")
        domains.append("Error parsing URL")
# Create a pandas Series from the list of domains and get the value counts
domain_counts = pd.Series(domains).value_counts().reset_index()
domain_counts.columns = ['Domain', 'Count']
# Display the domain counts
display(domain_counts)
Top 10 domains with the most redirects
Domain | Count |
booking.id5225211246[.]world | 62 |
booking.confirmation-id9918[.]com | 25 |
booking.confirmation-id901823[.]com | 25 |
booking.confirmation-id542[.]com | 15 |
booking.confirmation-id089172[.]com | 15 |
booking.id455512201[.]world | 15 |
booking.confirmation-id190238[.]com | 14 |
booking.confirmation-id987933[.]com | 14 |
booking.confirmation-id4321[.]com | 14 |
booking.confirmation-id89712[.]com | 13 |
These are the final Top 10 domains receiving the most redirects from Tier 1 to Tier 2. The first domain, booking.id5225211246[.]world, appears because many URLs on that same domain redirect, but to different parts of the site (different paths), as you can see in the next example.
URL | Final URL |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/YZJMYDLNV |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/UT4DOPJPB |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/ZPOCL8FBK |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/6WJCSCMOX |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/Y72WFFHD7 |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/58H3YTAOO |
https://booking.id5225211246[.]world/EC669QWO2 | https://booking.id5225211246[.]world/QHKXP8VB2 |
For other domains, redirects from different Tier 1 URLs led to different paths on the same Tier 2 domains, indicating that a single domain hosted multiple phishing attempts (a sketch for rebuilding these URL-to-final-URL pairs follows the table).
URL | Final URL |
https://rsvnmwww.stayiceland[.]com/ | https://booking.confirmation-id9918[.]com/4106029014 |
http://rsvnenom.stayiceland[.]com/ | https://booking.confirmation-id9918[.]com/4831933247 |
http://rsvnuitr.icestayland[.]com/ | https://booking.confirmation-id9918[.]com/4718128210 |
http://rsvnokwc.icestayland[.]com/ | https://booking.confirmation-id9918[.]com/4009187168 |
http://rsvnfbsz.icestayland[.]com/ | https://booking.confirmation-id9918[.]com/4747003708 |
https://rsvnxgnz.icestayland[.]com/ | https://booking.confirmation-id9918[.]com/4548282193 |
https://rsvnjzjp.icestayland[.]com/ | https://booking.confirmation-id9918[.]com/4555634971 |
https://rsvndobm.stayiceland[.]com/ | https://booking.confirmation-id9918[.]com/4562368486 |
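These mappings can be rebuilt directly from the url and last_final_url attributes; a minimal sketch, assuming the all_results list from earlier (the confirmation-id filter is just an illustrative example):

# Build a table of Tier 1 URL -> Tier 2 final URL pairs
redirect_pairs = []
for item in all_results:
    attributes = item.get('attributes', {})
    original_url = attributes.get('url')
    last_final_url = attributes.get('last_final_url')
    if original_url and last_final_url and original_url != last_final_url:
        redirect_pairs.append({'URL': original_url, 'Final URL': last_final_url})
df_pairs = pd.DataFrame(redirect_pairs)
# Inspect the pairs landing on Tier 2 domains matching a pattern of interest
display(df_pairs[df_pairs['Final URL'].str.contains('confirmation-id', na=False)])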
Interesting keywords
We used Gemini to look at all the domain names from the URLs we had access to. Our goal was to find keywords that repeat in these domain names. This helps us find patterns to make new detections.
The main keywords identified by Gemini were the following:
keywords = t"booking", "reservation", "reserv", "id", "guest", "hotel",
"confirm", "confrim", "confirmation"]
The fun part is that Gemini offered different variants for some words. For example, the keyword reservation was widely used, but Gemini also suggested reserv as an alternative, because some domains included the misspelling reservetion. Similarly, some domains contained misspellings of confirmed, which explains the inclusion of confirm, confirmation, and confrim.
from urllib.parse import urlparse
import pandas as pd
import plotly.express as px
keywords = ["booking", "reservation", "reserv", "id", "guest", "hotel", "confirm", "confrim", "confirmation"]
keyword_counts_original_url = {keyword: 0 for keyword in keywords}
for item in all_results:
    attributes = item.get('attributes', {})
    url = attributes.get('url')
    if url:
        try:
            parsed_url = urlparse(url)
            netloc = parsed_url.netloc.lower()
            for keyword in keywords:
                if keyword in netloc:
                    keyword_counts_original_url[keyword] += 1
                    #break # Count each URL only once even if it contains multiple keywords
        except Exception as e:
            print(f"Error parsing URL {url}: {e}")
# Create a DataFrame for plotting
df_keyword_counts_original_url = pd.DataFrame(list(keyword_counts_original_url.items()), columns=['Keyword', 'Count'])
# Create a bar chart using Plotly Express
fig = px.bar(df_keyword_counts_original_url, x='Keyword', y='Count',
             title='Count of Original URLs Containing Specific Keywords in Domain Names')
# Display the plot
fig.show()
Figure 11: Keywords identified in the initial URLs (image above) and in the final URLs, i.e. redirections (image below).
For the final URL redirections (figure below), there are four main keywords: booking, id, confirmation, and confirm (whose count overlaps with confirmation, as it is a substring). These words are used intentionally, as they appear on the final domains where victims will enter their information.
On the other hand, the keywords in the initial URL (redirector) domains are more varied. For example, hotel, guest, and reserv are also widely used, along with booking.
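The bottom chart in Figure 11 can be reproduced with the same counting loop applied to the final URLs instead of the original ones; a minimal sketch, reusing the keywords and last_final_urls lists collected earlier:

# Count keyword occurrences in the domains of the final (redirected-to) URLs
keyword_counts_final_url = {keyword: 0 for keyword in keywords}
for url in last_final_urls:
    try:
        netloc = urlparse(url).netloc.lower()
        for keyword in keywords:
            if keyword in netloc:
                keyword_counts_final_url[keyword] += 1
    except Exception as e:
        print(f"Error parsing URL {url}: {e}")
df_keyword_counts_final_url = pd.DataFrame(list(keyword_counts_final_url.items()), columns=['Keyword', 'Count'])
fig = px.bar(df_keyword_counts_final_url, x='Keyword', y='Count',
             title='Count of Final URLs Containing Specific Keywords in Domain Names')
fig.show()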
Detections
When analyzing the overall campaign, it's also important to consider how the identified URLs are rated by security vendors. Across all the URLs gathered from both Tier 1 and Tier 2 infrastructure, a significant portion has been flagged by only zero or one vendor. This underscores a potential gap in current URL detections and highlights the need to improve them.
import pandas as pd
import plotly.express as px
from collections import Counter
malicious_counts = []
for item in all_results:
    attributes = item.get('attributes', {})
    last_analysis_stats = attributes.get('last_analysis_stats', {})
    malicious = last_analysis_stats.get('malicious', 0) # Default to 0 if not present
    malicious_counts.append(malicious)
# Count the occurrences of each malicious count
malicious_count_distribution = Counter(malicious_counts)
# Convert to a DataFrame for plotting
df_malicious_counts = pd.DataFrame(list(malicious_count_distribution.items()), columns=['Malicious Count', 'Number of URLs'])
# Sort by the malicious count for better visualization
df_malicious_counts = df_malicious_counts.sort_values(by='Malicious Count')
# Create a bar chart
fig = px.bar(df_malicious_counts, x='Malicious Count', y='Number of URLs',
             title='Distribution of Malicious Detections per URL')
# Display the plot
fig.show()
Figure 12: Detections per URL identified.
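To quantify the gap described above, a short follow-up can compute the share of URLs flagged by at most one vendor; a minimal sketch, reusing malicious_counts from the previous snippet:

# Share of URLs flagged by zero or one security vendor
low_detection = sum(1 for c in malicious_counts if c <= 1)
if malicious_counts:
    print(f"URLs with 0-1 detections: {low_detection} "
          f"({low_detection / len(malicious_counts) * 100:.2f}%)")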
We've mapped the campaign's vast infrastructure and uncovered its hidden patterns, but what if we could peer directly into the threat actors' operations? Part 3 takes you beyond the infrastructure, revealing a rare glimpse into the files and communications that power this sophisticated phishing scheme.