Abstract
Many DNS-based services utilize composite queries, which are queries formed by embedding a referenced domain as a subdomain. For example, example-com.translate.goog refers to example.com as the source domain. The use of composite queries allows services to route users effectively on the Internet; however, these services can also present a security challenge by enabling actors to bypass traditional DNS security controls.
Such services include content proxies (e.g., Google Translate), email security gateways (e.g., Microsoft Outlook Safe Links), content delivery networks, and DNS protocol–based services (such as antivirus lookups and security reputation systems). Malicious actors exploit these mechanisms by passing malicious content through well-known legitimate services, so that the embedded domain is not visited directly from the user’s device, reducing visibility for conventional threat detection systems.
We conducted a comprehensive analysis of services utilizing composite queries in our cloud customer traffic and summarized the purposes and risks they represent. We identified over 100 high-volume services in regular use. Their composite query activity accounts for about 7% of distinct domains, with many embedded domains not visible in direct traffic. Our analysis shows that over 100 known malicious or suspicious domains are embedded in Google Translate queries alone every day.
This blog explains how composite queries are constructed for different purposes and describes how we detect them and accurately extract potentially malicious domains.
Introduction
The Domain Name System (DNS) serves as the Internet’s address book, translating human-readable domain names like www.google.com into IP addresses that computers use to communicate. This fundamental protocol is trusted by network security controls, which attackers may attempt to exploit or bypass.
Domain-embedding services represent a legitimate and widely used class of Internet infrastructure that encodes target domains within DNS query structures. These services create composite DNS queries where the target domain is embedded within the service domain structure, resulting in queries like:
example-com.translate.goog
subdomain.example.com.cdn-service.net
ZXhhbXBsZS5jb20.proxy-service.example
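The three example queries above can be decoded with a few lines of Python. This is a minimal sketch rather than the production extractor: it assumes the service suffix is already known, that the hyphen scheme maps dots to single hyphens and literal hyphens to double hyphens, and that the third example uses unpadded base64. The function names (`extract_embedded`, `decode_hyphen`, `decode_base64`) are illustrative only.

```python
import base64
import re
from typing import Optional

# Loose syntactic check for a dotted hostname; not a full RFC validation.
HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def looks_like_domain(candidate: str) -> bool:
    return HOSTNAME_RE.match(candidate.lower()) is not None


def decode_hyphen(label: str) -> str:
    """Assumed scheme: '.' was written as '-', a literal '-' as '--'."""
    placeholder = "\x00"
    return (
        label.replace("--", placeholder)
        .replace("-", ".")
        .replace(placeholder, "-")
    )


def decode_base64(label: str) -> Optional[str]:
    """Try URL-safe base64, restoring any stripped '=' padding."""
    padded = label + "=" * (-len(label) % 4)
    try:
        return base64.urlsafe_b64decode(padded).decode("ascii")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def extract_embedded(qname: str, service_suffix: str) -> Optional[str]:
    """Return the embedded domain, or None if nothing plausible decodes."""
    qname = qname.rstrip(".")
    suffix = "." + service_suffix
    if not qname.lower().endswith(suffix.lower()):
        return None
    prefix = qname[: -len(suffix)]

    candidates = [prefix.lower()]  # direct embedding, e.g. a.b.c.<service>
    if "." not in prefix:
        candidates.append(decode_hyphen(prefix.lower()))
        decoded = decode_base64(prefix)  # base64 is case-sensitive
        if decoded:
            candidates.append(decoded.lower())

    for candidate in candidates:
        if looks_like_domain(candidate):
            return candidate
    return None
```

Trying the cheap decoders in a fixed order and keeping the first result that looks like a hostname keeps the sketch short; a real system would need stronger validation, since short random labels can decode to hostname-shaped strings by chance.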
While domain-embedding services serve legitimate purposes—such as translation, email security, content delivery, and privacy protection—they can present security challenges when misused by malicious actors.
The Security Blind Spot
Consider a malicious actor attempting to run a phishing campaign from malicious-domain.com. Under normal circumstances, this domain could be:
- Blocked by protective DNS systems based on threat intelligence
- Flagged by IP reputation systems when connections are established
- Identified through SSL certificate analysis
- Detected by network security monitoring tools
However, when the same malicious domain is accessed through a domain-embedding service as malicious--domain-com.translate.goog, the security landscape changes:
- DNS firewalls observe queries to translate.goog (Google’s trusted domain) rather than malicious-domain.com
- IP reputation systems see connections to Google’s infrastructure (highly trusted) rather than malicious hosting providers
- SSL inspection encounters Google’s valid certificates rather than suspicious or self-signed certificates
- Traditional threat detection fails because all observable indicators point to legitimate, trusted services

Figure 1. An example of a real-world phishing email containing a Google Translate link with an embedded malicious website.
This creates a security gap: the embedded domain can bypass traditional DNS-based security controls because the observable DNS traffic, IP addresses, and SSL certificates all belong to the legitimate service domain, not the actual embedded destination. These embedded domains reach client systems through various channels—shared links in emails, messaging applications, web pages, and other content—yet remain hidden from security controls focused on the observable service domain.
Addressing the Gap: A Paradigm Shift in DNS Security
The detection system shifts the security paradigm from “trust the observable DNS domain” to “extract and analyze the embedded destination.” Rather than accepting the service domain at face value, the system:
- Identifies domain-embedding services through analysis of DNS traffic patterns
- Extracts embedded domains using multiple decoding techniques
- Validates extracted domains through statistical conformity analysis against legitimate domain baselines
- Applies independent security analysis to embedded domains using threat intelligence, reputation feeds, and behavioral analysis
- Enables granular policy enforcement where security decisions are based on embedded domain reputation and detected service specialization rather than service domain reputation
This paradigm transforms the security response from binary service-level decisions (“block all translate.goog” or “allow all translate.goog”) to context-aware, content-based policies (“allow translate.goog when embedding legitimate domains; block or alert when embedding known malicious domains”). Security response decisions—whether to block all queries to a service, block individual queries associated with known malicious embedded content, or allow traffic—should be made on a case-by-case basis to avoid disrupting critical security and functional services.
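As an illustration of such a context-aware policy, the sketch below decides per query rather than per service. The category labels, the `Action` values, and the rule that metadata lookups are never blocked are assumptions chosen to mirror the discussion in this post; they are not the actual policy engine.

```python
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    BLOCK = "block"


# Hypothetical category labels loosely mirroring the service categories.
BLOCKABLE = {"content_proxy", "dns_proxy"}
NEVER_BLOCK = {"metadata"}


def decide(embedded: Optional[str], category: str, threat_feed: Set[str]) -> Action:
    """Pick an action from the embedded domain, not the service domain."""
    if embedded is None or embedded not in threat_feed:
        return Action.ALLOW  # nothing embedded, or no known threat
    if category in NEVER_BLOCK:
        return Action.ALLOW  # e.g. a reputation lookup about a bad domain
    if category in BLOCKABLE:
        return Action.BLOCK  # known-bad content served through a proxy
    return Action.ALERT      # unclear category: surface for investigation
```

For example, `decide("malicious-domain.com", "content_proxy", {"malicious-domain.com"})` returns `Action.BLOCK`, while the same embedded domain in a metadata lookup is allowed.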
Service Categories and Detection Characteristics
Service Categories
Observed generic domain-embedding services fall into several categories organized by abuse potential and security risk. Each category has distinct security implications while sharing common DNS traffic patterns that enable automated detection:
| Category | Abuse Risk | Description |
| Content Proxy | High | Services that deliver the content of the embedded domain, either directly or in transformed form. This includes translation services, anonymizers, link wrappers, and web archives that fetch and serve content through their infrastructure. Enables malicious actors to hide domains within trusted infrastructure for phishing and malware delivery. Examples: Google Translate (translate.goog), Microsoft Outlook SafeLinks (protection.outlook.com), web archive services. |
| DNS Proxy | High | Services that provide DNS resolution access to embedded domains, returning DNS records for those domains via CNAME records or recursive forwarding. Enables DNS-based hiding of malicious domains. Examples: Public DNS resolvers with embedded query patterns, CNAME-based redirection services for click tracking and campaign management. |
| Metadata Services | Low | Services that return information about embedded domains (e.g., reputation scores, threat classifications, or status indicators), rather than delivering the domain’s content itself. These services provide critical security and operational intelligence and should not be blocked. Examples: SURBL (surbl.org), URIBL (uribl.com), antivirus lookup services, reputation systems, domain availability checkers. |
| Static Content | Medium | Domains with wildcard DNS records that return identical or non-specific content regardless of subdomain structure. While queries may appear to contain embedded domains, no actual domain-specific embedding occurs. Requires investigation to distinguish legitimate wildcard services from selective abuse where attackers use CNAME records for targeted domains. Examples: CDN wildcards, domain parking pages, anti-caching mechanisms with random prefixes. |
Table 1: Categories of domain-embedding services organized by abuse risk and security implications
Common DNS Traffic Characteristics
Despite their diverse purposes and risk profiles, services across all categories share observable DNS traffic patterns that enable automated detection:
Subdomain Structure Patterns:
- High subdomain diversity: Services generate hundreds to thousands of unique subdomains as they handle different embedded destinations
- Structural consistency: Repeated patterns (prefixes, suffixes) appear across all queries to the same service
- Encoding consistency: Each service uses a consistent encoding approach
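Subdomain diversity is straightforward to measure from a query log. The sketch below counts, for every parent suffix of at least two labels, how many distinct prefixes were queried beneath it, and flags suffixes above a threshold. The threshold value and function names are hypothetical, and a real system would also need public-suffix handling that is omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def subdomain_diversity(qnames: Iterable[str], min_labels: int = 2) -> Dict[str, int]:
    """Map each parent suffix to the number of distinct prefixes seen under it."""
    prefixes = defaultdict(set)
    for qname in qnames:
        labels = qname.lower().rstrip(".").split(".")
        # Split the name at every position that leaves >= min_labels in the suffix.
        for i in range(1, len(labels) - min_labels + 1):
            suffix = ".".join(labels[i:])
            prefixes[suffix].add(".".join(labels[:i]))
    return {suffix: len(seen) for suffix, seen in prefixes.items()}


def service_candidates(qnames: Iterable[str], threshold: int) -> List[str]:
    """Suffixes with at least `threshold` distinct prefixes, sorted by name."""
    diversity = subdomain_diversity(qnames)
    return sorted(s for s, n in diversity.items() if n >= threshold)
```

High diversity alone also matches ordinary large sites with many hostnames, which is why it only nominates candidates for the decoding and validation steps.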
DNS Query Characteristics:
DNS query characteristics vary between detected services. These characteristics are useful for attribution of services to categories and understanding associated risks:
- Query type distribution: The distribution of query types (A, AAAA, CNAME, TXT, etc.) provides insights into service function
- Infrastructure patterns: Resolution targets and their stability over time
- ASN and hosting analysis: Concentration or distribution of infrastructure across autonomous systems
- Provider reputation: Assessment of hosting infrastructure providers
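The first of these signals, query type distribution, can be computed as a simple per-service histogram. The sketch assumes the log has already been reduced to (service domain, query type) pairs; that input shape is an assumption for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def qtype_distribution(records: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Return, per service domain, the share of queries for each record type."""
    counts = defaultdict(Counter)
    for service, qtype in records:
        counts[service][qtype.upper()] += 1
    return {
        service: {qtype: n / sum(c.values()) for qtype, n in c.items()}
        for service, c in counts.items()
    }
```

A service answering mostly A and AAAA queries is more likely to deliver content, whereas a heavy TXT share would hint at a lookup-style service; this reading is a heuristic, not a rule.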
Volume and Temporal Patterns:
- Sustained activity: Legitimate services show consistent query volumes over time
- Multiple embedded destinations: True domain-embedding services handle diverse target domains, not just single-organization subdomains
- Temporal stability: Service infrastructure and patterns remain stable across observation periods
Table 2 shows representative examples of direct DNS traffic compared to domain-embedding service traffic patterns across different risk categories:
| Category | Example Service | Query Pattern | Pattern Description |
| Direct Traffic | N/A | google.com | Standard domain query |
| Direct Traffic | N/A | mail.example.com | Typical subdomain/hostname structure |
| Content Proxy (High Risk) | Google Translate | example-com.translate.goog | Embedded encoded domain |
| DNS Proxy (High Risk) | CDN Service | example.com.cdn-provider.net -> example.com | Direct subdomain embedding with CNAME |
| Metadata Services (Low Risk) | Antivirus Lookup | mfrggzdfmztwq2lk.av-service.net | Encoded domain for threat intelligence lookup |
Table 2: DNS query patterns showing direct traffic vs. domain-embedding service traffic
Methodology: Multi-Stage Detection Pipeline
The detection system operates as a multi-stage pipeline, with each stage building upon the previous one, progressively refining and enriching detections of domain-embedding services.
Stage 1: Service Domain Detection
The first stage identifies service domain candidates by analyzing domains with abnormally high subdomain diversity. For each candidate, the system attempts to decode embedded domains using multiple decoding techniques and detects consistent structural patterns.
To validate decoded domains, the system builds statistical baselines from direct DNS traffic, capturing domain name properties. Decoded domains are scored against this baseline, allowing the system to filter out random strings and other composite components.
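The post does not specify which domain name properties the baseline captures. One possible stand-in is a character bigram model trained on directly queried domains, which scores how "domain-like" a decoded string is. The class below is a sketch of that substitute technique with add-one smoothing; the class name, alphabet, and any cutoff applied to the score are assumptions.

```python
import math
from collections import Counter
from typing import Iterable

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."


class BigramBaseline:
    """Character bigram model over domain names, with add-one smoothing."""

    def __init__(self, domains: Iterable[str]) -> None:
        self.pairs = Counter()   # counts of (char, next_char)
        self.firsts = Counter()  # counts of char appearing as the first of a pair
        for domain in domains:
            s = "^" + domain.lower() + "$"  # start and end markers
            for a, b in zip(s, s[1:]):
                self.pairs[(a, b)] += 1
                self.firsts[a] += 1
        self.vocab = len(ALPHABET) + 2  # alphabet plus the two markers

    def score(self, candidate: str) -> float:
        """Average log-probability per bigram; higher means more domain-like."""
        s = "^" + candidate.lower() + "$"
        total = 0.0
        for a, b in zip(s, s[1:]):
            p = (self.pairs[(a, b)] + 1) / (self.firsts[a] + self.vocab)
            total += math.log(p)
        return total / (len(s) - 1)
```

Averaging per bigram keeps scores comparable across names of different lengths, so a single cutoff learned from direct traffic can be applied to every decoded candidate.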
Candidates are classified by confidence level based on the number of validated embedded domains, destination diversity, encoding consistency, and statistical conformity scores.
Stage 2: Enrichment and Validation
Detected service domains and embedded domains are enriched with external intelligence: domain registration data, historical DNS observations, reputation feeds, SSL certificates, and other sources. External sources such as WHOIS and historical DNS records confirm that embedded domains are legitimate registered domains rather than random artifacts. Validation filters ensure temporal consistency, encoding consistency, destination diversity, and registration data coverage to reduce false positives.
Stage 3: Aggregation and Trend Analysis
The final stage aggregates validated detections across time periods to track service domain lifecycles, identify growing or declining services, and maintain historical context on domain-embedding service evolution.
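A simple way to picture this aggregation is a per-service summary over daily detections. The sketch assumes each detection is a (day, service domain, unique embedded domain count) tuple with the day as an ISO date string, and it labels the trend by comparing the average of the later half of the observations with the earlier half. Both the input shape and the trend rule are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize(detections: Iterable[Tuple[str, str, int]]) -> Dict[str, dict]:
    """Summarize the lifecycle and volume trend of each service domain."""
    by_service = defaultdict(dict)
    for day, service, n_embedded in detections:
        by_service[service][day] = n_embedded

    summary = {}
    for service, daily in by_service.items():
        days = sorted(daily)  # ISO date strings sort chronologically
        counts = [daily[d] for d in days]
        half = len(counts) // 2
        if half == 0:
            trend = "new"  # a single observation carries no trend
        else:
            early = sum(counts[:half]) / half
            late = sum(counts[-half:]) / half
            if late > early:
                trend = "growing"
            elif late < early:
                trend = "declining"
            else:
                trend = "stable"
        summary[service] = {
            "first_seen": days[0],
            "last_seen": days[-1],
            "days_active": len(days),
            "trend": trend,
        }
    return summary
```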
Results: Discovery and Analysis of DNS Traffic Blind Spots
The detection system reveals significant gaps in traditional DNS security monitoring and provides insights into domain-embedding service usage patterns.
Discovery of Previously Unobserved Traffic
Analysis of DNS traffic demonstrates the substantial blind spot created by composite queries:
Composite Query Volume: Queries to domain-embedding services account for a measurable fraction of total DNS query volume. This traffic typically bypasses traditional security analysis focused on direct domain queries.
Unique Domain Discovery: Activity associated with embedded domains represents approximately 7% of distinct domain activity in daily traffic. This includes over 10,000 domains per day that are not queried directly at our cloud DNS resolvers. A notable portion of these embedded domains are not observed in direct traffic within extended observation periods.
Service Domain Identification: The system identified approximately 100 domain-embedding services daily, categorized across multiple confidence levels based on embedding patterns, destination diversity, and encoding consistency.

Figure 2. Daily activity of domains using composite queries
Service Distribution: Analysis of domain-embedding traffic over a two-week period reveals significant concentration among a small number of services. Microsoft Outlook SafeLinks (outlook.com) dominates the landscape at 71.7% of observed traffic, followed by Google Translate (5.3%), Cloudflare (2.7%), URIBL (0.74%), and SURBL (0.45%). The remaining 19.1% is distributed across about a hundred smaller services. This distribution pattern remains consistent across all sampling periods, validating the stability of our observations. Figure 3 shows this distribution.

Figure 3. Distribution of embedded traffic between services
Embedded Domain Characteristics
Domain Age Distribution: Analysis reveals a notable presence of newly registered domains among embedded domains. On average, approximately 100 unique domains per day were registered within the past 7 days, and approximately 350 unique domains per day were registered within the past 30 days.

Figure 4. Embedded domains registered within the past 7 days and the past 30 days
Service-Specific Analysis
Detailed analysis of three well-known domain-embedding services demonstrates diverse usage patterns and risk profiles, illustrating different security considerations from open proxies to critical security infrastructure:
Google Translate (translate.goog) – Content Proxy
Google Translate is among the highest-volume services detected. Approximately 10,000 unique embedded domains are observed daily, with 6.6% not present in direct DNS traffic on the same day. Queries resolve to Google’s infrastructure, which fetches and serves the embedded domain’s content. About 0.3% of embedded domains matched threat intelligence feeds (phishing sites, malware distribution, suspicious recently registered domains), representing tens to hundreds of malicious domains daily that bypass traditional security controls. While legitimate use cases represent the majority, even a small number of successful attacks can compromise organizations.
SURBL (multi.surbl.org) – DNS Protocol-Based Service
Email security gateways and spam filters generate moderate query volumes to this threat intelligence service. Analysis observed approximately 1,000 unique embedded domains, with 6.5% not appearing in direct DNS traffic. Rather than resolving embedded domains, SURBL returns encoded threat classifications as pseudo-IP addresses in the 127.0.0.x range. As essential security infrastructure, SURBL enables real-time threat assessment—60% of queried embedded domains represent known malicious or suspicious threats. This service should not be blocked, as it is critical to security operations.
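The lookup pattern described here can be sketched without touching the network: build the composite query name, then interpret an answer in the 127.0.0.x range as a listing code rather than a real address. The helper names are hypothetical, and the sketch deliberately does not map code values to specific threat categories, since that mapping is defined by the list operator.

```python
import ipaddress
from typing import Optional

LISTING_RANGE = ipaddress.ip_network("127.0.0.0/24")


def surbl_query_name(domain: str, zone: str = "multi.surbl.org") -> str:
    """Compose the lookup name by prepending the checked domain to the zone."""
    return f"{domain.rstrip('.').lower()}.{zone}"


def listing_code(answer_ip: str) -> Optional[int]:
    """Return the last octet of a 127.0.0.x answer, or None for anything else."""
    address = ipaddress.ip_address(answer_ip)
    if address in LISTING_RANGE:
        return int(address.packed[-1])
    return None
```

Because the answer encodes a classification and never the embedded domain's real address, the embedded domain in such a query reflects what a security tool was checking, not what a user visited.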
Microsoft Outlook SafeLinks (protection.outlook.com) – Email Security Gateway
Very high volume in enterprise environments reflects widespread Microsoft 365 adoption. Approximately 250,000 unique embedded domains were observed, with 0.4% not present in direct DNS traffic on the same day. Queries resolve to Microsoft’s protection infrastructure, which scans embedded URLs before redirecting users. 0.2% of embedded domains match threat intelligence feeds, amounting to over a thousand malicious or suspicious domains daily. While this represents a lower malicious rate compared to open proxies like Google Translate, the absolute volume demonstrates that significant malicious activity reaches users through this trusted channel. This service is an example of legitimate infrastructure that should not be interrupted, but its embedded domains are a valuable input to threat analysis.
Conclusion
Modern DNS security requires looking beyond observable domains. Our research reveals that thousands of domains daily remain hidden within trusted infrastructure, invisible to traditional security tools focused on direct DNS queries.
Infoblox’s detection system provides visibility into this previously hidden activity. By extracting embedded domains from composite queries and applying independent threat analysis, security teams can now identify and respond to threats regardless of how attackers attempt to obscure them—while preserving the functionality of legitimate services that organizations depend on.
This capability represents a significant advancement in DNS security. Organizations gain comprehensive visibility across both direct and embedded domain activity, enabling context-aware policies that protect users without disrupting business operations. As domain-embedding services continue to grow in usage, this visibility becomes increasingly essential for maintaining effective security postures.

