How do I know if my firewall is actually working?

The only way to know if your firewall works is to test it with real attack traffic. Vendor dashboards report on what the firewall sees, not what it misses. Independent validation testing sends actual exploit payloads, lateral movement traffic and command-and-control beacons through your firewall to verify whether rules enforce as intended. Configuration drift, rule sprawl and expired subscriptions all create gaps that only testing reveals.

Can Darktrace detect all threats?

No security product, including Darktrace, detects all threats. Darktrace uses unsupervised machine learning to model normal network behavior and alert on deviations. In our testing, Darktrace performs well against noisy threats and anomalous traffic patterns but consistently struggles with low-and-slow lateral movement, encrypted command-and-control channels and DNS-based data exfiltration that stays within learned behavioral norms.

Should I test my EDR with a penetration test?

A standard penetration test is not designed to validate EDR effectiveness. Penetration tests focus on finding vulnerabilities and achieving objectives, not on measuring detection coverage. Vendor validation testing is purpose-built to evaluate whether your EDR detects specific attack techniques across the MITRE ATT&CK framework. We run controlled tests against your EDR and document exactly which techniques trigger alerts, which get blocked and which pass undetected.

What is vendor security validation?

Vendor security validation is the process of independently testing whether a deployed security product performs as the vendor claims. It involves running real attack techniques against the product in its actual production configuration and measuring detection rates, prevention rates and alert fidelity. This differs from vendor-run testing because it uses adversary-realistic methods rather than the vendor's own test signatures.

How often should I validate my security tools?

Security tools should be validated at least annually and after any major configuration change, firmware update or staff turnover. Environments with regulatory requirements such as PCI DSS or PIPEDA should validate quarterly. Configuration drift is continuous. Rules get added, exceptions get created, subscriptions lapse and staff changes mean institutional knowledge walks out the door. Annual validation is the minimum responsible cadence.

How does vendor security validation differ from a standard penetration test?

Vendor security validation specifically tests whether your existing security tools (firewalls, EDR, SIEM) detect and respond to real attack techniques. A standard pentest finds vulnerabilities in your applications and infrastructure. Vendor validation puts your defensive stack under adversarial pressure using MITRE ATT&CK-mapped techniques to measure detection rates, response times and alert fidelity. Most organizations discover that 30-60% of attacks bypass their existing tools.

Vendor Validation

Is Your Security Stack Actually Working? We Test It With Real Attacks.

We deploy ShadowTap inside your network and run the same techniques attackers use. If your $200K security stack cannot see us, you have a problem.

Vendor validation testing is the independent assessment of deployed security products using real attack techniques. Sherlock Forensics tests firewalls, EDR, NIDS and SIEM platforms from Darktrace, CrowdStrike, Palo Alto and seven other major vendors. Our ShadowTap platform operates inside the network to measure detection and prevention rates against adversary-realistic methods. Standard validation starts at $5,000 CAD.

Purchase Validation · Book a Scoping Call

The Problem

You Spend Six Figures on Security and Have No Proof It Works

Most organizations spend between $50,000 and $500,000 per year on security products. Firewalls, endpoint detection, network monitoring, cloud security and SIEM platforms all run simultaneously. Vendor dashboards show green. SOC analysts see clean logs. Leadership assumes the investment is paying off. But no one has tested whether these tools actually detect a real intrusion path executed by a skilled adversary.

The gap between what vendors promise and what their products deliver in production is substantial. Products ship with default configurations optimized for ease of deployment. Exclusions accumulate as helpdesk tickets roll in. Subscriptions lapse without anyone noticing. Staff turnover means the engineer who tuned the ruleset two years ago is gone and their successor inherited a configuration no one fully understands. The product your vendor sold you is not the product running on your network today.

We have tested every major security vendor across hundreds of production environments. We know where each product excels and where each one breaks. The results are consistent enough to document. The table below reflects what we find in the field, not in a lab.

Comparison Matrix

Security Vendor Validation: What We Actually Find

Vendor	What It Claims	What We Actually Find	Common Blind Spots	Validation Test Type
Darktrace	AI detects unknown threats	Misses low-and-slow lateral movement	Encrypted C2, DNS tunneling	ShadowTap + manual
CrowdStrike	EDR stops breaches	Strong endpoint, weak network visibility	East-west traffic, BYOD	Endpoint + network
Palo Alto	Next-gen firewall blocks everything	Rule sprawl creates gaps over time	SSL inspection bypass, cloud egress	Firewall audit + ShadowTap
Fortinet	UTM covers everything	Feature interaction causes performance blind spots	Deep packet inspection under load	Config audit + ShadowTap
Cisco	Platform approach covers all vectors	Product silos create visibility gaps	Inter-platform policy inconsistency	Multi-product audit
SonicWall	Affordable enterprise protection	Firmware lag creates CVE windows	SSLVPN exposure, DPI-SSL gaps	External + internal validation
Sophos	Synchronized Security stops threats	Sync breaks when components are misconfigured	Endpoint-firewall sync failure	Integration validation
SentinelOne	Autonomous AI protection	Rollback misses encrypted payloads	Policy exclusion sprawl	Ransomware simulation
Check Point	Unified security management	Blade licensing gaps leave features disabled	Threat prevention profile gaps	License + config audit
Zscaler	Zero trust replaces VPN	Split tunneling creates bypass paths	PAC file misconfiguration	ZTNA bypass testing

Based on Sherlock Forensics field testing across production environments. Results reflect common findings, not universal guarantees. Individual deployments vary based on configuration maturity and operational discipline.

Internal Testing Platform

ShadowTap: The Device That Tests Your Entire Security Stack

ShadowTap is a physical device we deploy inside your network. It connects to a standard network port and operates as a rogue device would during an actual intrusion. It performs reconnaissance, attempts lateral movement, establishes command-and-control channels and stages data for exfiltration. Every action maps to documented MITRE ATT&CK techniques. Every action is logged and timestamped. The question is simple: does your security stack see it?

Most external penetration tests never reach the internal network. They test perimeter defenses and stop. ShadowTap starts where penetration tests end. It operates from the position an attacker achieves after initial compromise and tests whether your internal detection and response capabilities function as designed. This is the gap that matters most because perimeter breaches are inevitable and internal detection is what separates a contained incident from a catastrophic one.

When ShadowTap runs against a fully deployed security stack and generates zero alerts, that is a finding. It means an attacker using the same techniques would operate undetected in your environment for as long as they choose. When ShadowTap triggers partial alerts, that is also a finding. It tells you exactly which stages of an attack your tools detect and where coverage drops off. Both outcomes produce actionable intelligence for tuning and remediation.
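The stage-by-stage view described above can be sketched in a few lines of Python. This is a minimal illustration, not ShadowTap's methodology, which is not published. The `RESULTS` records, the outcome labels (`alerted`, `blocked`, `missed`) and the `summarize` helper are all assumptions made for the example, and the outcomes are invented. The technique IDs are real MITRE ATT&CK identifiers used only as labels.

```python
from collections import Counter, defaultdict

# Hypothetical test log: (ATT&CK technique ID, tactic, outcome).
# Outcomes are illustrative, not real findings.
RESULTS = [
    ("T1046", "discovery", "alerted"),
    ("T1018", "discovery", "missed"),
    ("T1021", "lateral-movement", "blocked"),
    ("T1570", "lateral-movement", "missed"),
    ("T1071", "command-and-control", "missed"),
    ("T1074", "collection", "alerted"),
]


def summarize(results):
    """Return overall detection and prevention rates plus per-tactic detection rates."""
    total = len(results)
    if total == 0:
        raise ValueError("no test results to summarize")
    outcomes = Counter(outcome for _, _, outcome in results)
    detected = outcomes["alerted"] + outcomes["blocked"]

    by_tactic = defaultdict(lambda: [0, 0])  # tactic -> [detected, total]
    for _, tactic, outcome in results:
        by_tactic[tactic][1] += 1
        if outcome in ("alerted", "blocked"):
            by_tactic[tactic][0] += 1

    return {
        "detection_rate": detected / total,
        "prevention_rate": outcomes["blocked"] / total,
        "per_tactic": {t: d / n for t, (d, n) in by_tactic.items()},
    }


summary = summarize(RESULTS)
for tactic, rate in summary["per_tactic"].items():
    print(f"{tactic}: {rate:.0%} detected")
```

A per-tactic rate of zero marks the stage where coverage drops off. In this invented data set that stage is command-and-control.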

We do not publish ShadowTap's full technical methodology. The value of the platform depends on testing against defenses that have not been pre-tuned to detect it. For a technical overview of the platform and deployment process, see the ShadowTap deep dive.

Vendor Deep Dives

What We Find When We Test Each Vendor

Palo Alto Networks

What it does well: Palo Alto Networks produces arguably the most capable next-generation firewall on the market. App-ID provides genuine application-layer visibility that most competitors cannot match. Threat Prevention signatures are updated frequently and detection rates against known threats are consistently high. The management interface through Panorama is mature and policy hierarchy is well-structured for large deployments. WildFire sandboxing catches a meaningful percentage of novel malware.

Where it consistently fails in testing: Rule sprawl is the dominant finding. Organizations accumulate thousands of rules over years of operation and no one audits them. We routinely find rules created for temporary projects that were never removed, overly permissive "any-any" rules buried deep in the policy set and SSL decryption exceptions that effectively blind the firewall to encrypted threats. GlobalProtect VPN configurations frequently allow split tunneling that bypasses inspection entirely. Cloud egress rules in environments with hybrid infrastructure are almost always too permissive.

Recommended validation approach: Full firewall rule audit combined with ShadowTap internal testing. We test App-ID accuracy against evasive application traffic, verify SSL decryption coverage, validate zone protection configurations and confirm Threat Prevention profiles are applied consistently across all policy rules. External testing validates GlobalProtect and any published services.
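As a rough illustration of the rule-audit step, the sketch below flags allow rules that are "any-any" or that have no Threat Prevention profile attached. The rule records are a simplified, hypothetical structure invented for this example. They are not a PAN-OS export format, and the field names (`source`, `destination`, `application`, `profile`) are assumptions.

```python
# Hypothetical, simplified rule records; not a real PAN-OS export format.
RULES = [
    {"name": "allow-web", "source": ["10.0.0.0/8"], "destination": ["any"],
     "application": ["web-browsing", "ssl"], "action": "allow", "profile": "strict"},
    {"name": "temp-migration", "source": ["any"], "destination": ["any"],
     "application": ["any"], "action": "allow", "profile": None},
    {"name": "deny-all", "source": ["any"], "destination": ["any"],
     "application": ["any"], "action": "deny", "profile": None},
]


def audit_rules(rules):
    """Return (rule name, finding) pairs for risky allow rules."""
    findings = []
    for rule in rules:
        if rule["action"] != "allow":
            continue  # deny rules are not a permissiveness risk
        if all("any" in rule[field] for field in ("source", "destination", "application")):
            findings.append((rule["name"], "any-any allow"))
        if rule["profile"] is None:
            findings.append((rule["name"], "no threat prevention profile"))
    return findings


rule_findings = audit_rules(RULES)
for name, finding in rule_findings:
    print(f"{name}: {finding}")
```

A check like this only finds the obvious cases. Rules that are permissive in practice but narrow on paper still need traffic-based testing.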

Read the full Palo Alto validation page · Get a Palo Alto validation assessment

SonicWall

What it does well: SonicWall provides genuine enterprise-grade protection at price points accessible to mid-market organizations. The RFDPI (Reassembly-Free Deep Packet Inspection) engine handles high throughput efficiently. SonicOS 7 is a significant improvement over prior generations and the management interface has matured considerably. For organizations that need solid perimeter security without a six-figure firewall budget, SonicWall delivers real value.

Where it consistently fails in testing: Firmware update cadence is the critical issue. SonicWall appliances in the field frequently run firmware versions 12 to 18 months behind current releases. This creates CVE exposure windows that are well-documented and actively exploited. SSLVPN has been the target of multiple critical vulnerabilities in recent years and many deployments still expose it to the public internet with outdated firmware. DPI-SSL (SSL inspection) is frequently disabled because of certificate deployment complexity, which means encrypted traffic passes uninspected. GMS (Global Management System) in multi-site deployments often has default or weak credentials.

Recommended validation approach: External validation first, focusing on SSLVPN exposure and firmware version. Internal validation with ShadowTap to test DPI effectiveness and rule enforcement. Configuration audit to identify default credentials, unused rules and subscription status for threat intelligence feeds.

Read the full SonicWall validation page · Get a SonicWall validation assessment

Fortinet FortiGate

What it does well: Fortinet offers one of the broadest unified threat management (UTM) feature sets in the industry. FortiGate appliances provide firewall, IPS, antivirus, web filtering, application control and sandboxing in a single platform. Hardware-accelerated performance through custom ASICs means features can be enabled without the throughput penalties common on competing platforms. FortiGuard threat intelligence is updated frequently and the Security Fabric concept provides genuine cross-product visibility when properly configured.

Where it consistently fails in testing: The breadth of features creates a specific problem: enabling all UTM features simultaneously causes performance degradation that administrators resolve by disabling features rather than right-sizing the appliance. We regularly find FortiGate deployments where IPS is disabled on high-traffic interfaces, where antivirus scanning is set to flow-based rather than proxy-based (significantly reducing detection) and where SSL inspection is limited to specific categories rather than applied broadly. VDOM (virtual domain) segmentation in multi-tenant environments frequently has inter-VDOM routing that undermines the intended isolation. FortiOS has also had its share of critical CVEs in SSL VPN and management interfaces.

Recommended validation approach: Configuration audit to verify which UTM features are actually enabled and in what mode. ShadowTap internal testing to measure real detection rates with the current configuration under production load. Performance baseline testing to determine whether the appliance is appropriately sized for full UTM feature enablement.

Read the full Fortinet validation page · Get a Fortinet validation assessment

Cisco ASA, Firepower, Meraki and ISE

What it does well: Cisco's security portfolio benefits from deep integration with Cisco networking infrastructure. Firepower provides legitimate next-generation firewall and IPS capabilities. Meraki simplifies management for distributed environments. ISE delivers network access control that integrates with Active Directory and provides posture assessment. For organizations already running Cisco networking, the platform approach reduces integration complexity and provides a single management plane across multiple security functions.

Where it consistently fails in testing: The platform approach creates product silos. ASA, Firepower, Meraki and ISE each have their own policy engines and their own management interfaces. Policies defined in one product do not automatically translate to another. We consistently find organizations where the ASA allows traffic that Firepower should inspect but does not because the policy integration was never completed. Meraki deployments in branch offices frequently operate with more permissive defaults than the centralized Firepower deployment at headquarters, creating inconsistent security posture across sites. ISE posture policies often allow non-compliant devices onto the network with limited restrictions that are insufficient to prevent lateral movement. Firepower in passive mode is a particularly common finding. The IPS is deployed but not enforcing.

Recommended validation approach: Multi-product audit that tests policy consistency across all deployed Cisco security products. ShadowTap testing from multiple network segments to identify gaps between products. ISE posture bypass testing. Firepower enforcement mode verification.

Read the full Cisco validation page · Get a Cisco validation assessment

CrowdStrike Falcon

What it does well: CrowdStrike Falcon is one of the strongest endpoint detection and response (EDR) platforms available. The cloud-native architecture means signature updates and behavioral models are current without requiring on-premises infrastructure. Detection of known malware, fileless attacks on managed endpoints and credential theft techniques is consistently strong. The Falcon OverWatch managed threat hunting service adds a human layer that catches threats the automated engine misses. For organizations that deploy it across all endpoints, CrowdStrike provides genuine protection.

Where it consistently fails in testing: CrowdStrike is an endpoint product. It sees what happens on endpoints where the agent is installed. It does not see network traffic. East-west lateral movement between managed endpoints may be detected through endpoint telemetry, but lateral movement from unmanaged devices (BYOD, IoT, OT systems, legacy servers that cannot run the agent) is invisible. We consistently find environments where CrowdStrike coverage is 70-85% of endpoints, with the remaining 15-30% being devices that cannot run the agent or were missed during deployment. Those unmanaged devices become the attacker's operating base. Policy exclusions created for business applications also accumulate over time and create detection gaps that are difficult to audit through the console alone.

Recommended validation approach: Endpoint validation to test detection against MITRE ATT&CK techniques on managed endpoints. Network validation with ShadowTap to test visibility gaps from unmanaged devices. Coverage audit to identify endpoints without the agent and policy exclusion review to find unnecessary exceptions.
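The coverage audit amounts to comparing two lists: every device known to be on the network and every device reporting an agent. A minimal sketch follows, assuming both lists are already available as hostname sets. The hostnames and the `agent_coverage` helper are hypothetical, and real inventories would need normalization before comparison.

```python
# Hypothetical hostnames. In practice these would come from an asset
# inventory and an export of hosts with an active EDR agent.
inventory = {"ws-001", "ws-002", "srv-db-01", "plc-07", "printer-3f", "laptop-byod-12"}
agents = {"ws-001", "ws-002", "srv-db-01"}


def agent_coverage(inventory, agents):
    """Return (coverage rate, hosts without an agent, agents not in inventory)."""
    if not inventory:
        raise ValueError("inventory is empty")
    unmanaged = sorted(inventory - agents)
    unknown = sorted(agents - inventory)
    rate = len(inventory & agents) / len(inventory)
    return rate, unmanaged, unknown


rate, unmanaged, unknown = agent_coverage(inventory, agents)
print(f"coverage: {rate:.0%}")
print("no agent:", ", ".join(unmanaged))
```

The `unknown` list is worth reviewing too: an agent reporting from a host that is missing from the inventory suggests the inventory itself is incomplete.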

Read the full CrowdStrike validation page · Get a CrowdStrike validation assessment

SentinelOne

What it does well: SentinelOne's autonomous detection and response model is genuinely differentiated. The agent makes local decisions about threat remediation without requiring cloud connectivity, which means response times are measured in milliseconds rather than minutes. The ransomware rollback feature is legitimately useful when it works as intended. Storyline technology provides clear attack visualization that helps analysts understand the full scope of an incident. For organizations without a 24/7 SOC, the autonomous model provides a meaningful layer of protection that does not depend on human response time.

Where it consistently fails in testing: Rollback is the headline feature but it has documented limitations. Ransomware payloads that destroy the Volume Shadow Copy service before encrypting files cannot be rolled back. Some ransomware families specifically target the SentinelOne rollback mechanism. Policy exclusion sprawl is a significant issue because application compatibility exclusions are frequently created during deployment and rarely audited afterward. We find organizations with dozens of exclusions that effectively blind the agent to activity in critical directories. The autonomous model also creates a false sense of security. Organizations assume "autonomous" means "complete" and reduce their investment in detection engineering and threat hunting.

Recommended validation approach: Ransomware simulation testing against the deployed configuration including rollback verification with multiple payload types. Policy exclusion audit. Living-off-the-land binary (LOLBin) testing to validate behavioral detection beyond signature-based identification. Coverage audit for agent deployment gaps.

Read the full SentinelOne validation page · Get a SentinelOne validation assessment

Sophos XG, XGS and Intercept X

What it does well: Sophos Synchronized Security is a genuinely innovative concept. When the firewall (XG/XGS) and endpoint protection (Intercept X) communicate through the Security Heartbeat, compromised endpoints can be automatically isolated at the network level. This provides a response speed that manual SOC processes cannot match. Intercept X provides strong endpoint protection with legitimate deep learning models for malware detection. Central Cloud Management simplifies administration for organizations with limited security staff. For mid-market organizations that deploy both the firewall and endpoint product, the synchronized approach provides real value.

Where it consistently fails in testing: Synchronized Security depends on correct configuration of both the firewall and endpoint components. When the Security Heartbeat breaks, and it breaks more often than Sophos documentation suggests, the synchronized response fails silently. The firewall continues operating and the endpoint continues operating, but the automatic isolation that justified the architecture no longer functions. We find heartbeat failures caused by network segmentation changes, firewall firmware updates that break the heartbeat protocol and endpoint agent updates that require heartbeat re-registration. SSL/TLS inspection on XG/XGS is frequently disabled due to certificate trust issues, leaving encrypted traffic uninspected. Web filtering categories are often too broadly permitted because administrators add exceptions for user complaints without reviewing aggregate policy impact.

Recommended validation approach: Integration validation that tests Security Heartbeat functionality end-to-end under realistic conditions. ShadowTap testing to verify that synchronized isolation actually triggers when a rogue device appears on the network. SSL inspection coverage audit. Web filtering bypass testing.

Read the full Sophos validation page · Get a Sophos validation assessment

Check Point NGFW, Harmony and CloudGuard

What it does well: Check Point pioneered the stateful firewall and their current next-generation firewall remains one of the most capable platforms available. Threat Prevention blades (IPS, Anti-Bot, Anti-Virus, Threat Emulation, Threat Extraction) provide thorough protection when fully licensed and properly configured. SmartConsole provides a mature management interface that experienced administrators find efficient. R81.x introduced unified policy management that simplifies what was previously a complex multi-blade policy structure. For organizations with dedicated security teams, Check Point provides granular control that other platforms sacrifice for simplicity.

Where it consistently fails in testing: Blade licensing is the most consistent finding. Check Point's blade architecture means each security function requires a separate license. We regularly find deployments where Threat Prevention blades are not licensed, where Threat Emulation (sandboxing) was purchased but never activated or where blade licenses have expired and the gateway continues operating with reduced protection without generating management alerts. Policy ordering in large rule bases creates evaluation issues where overly permissive rules near the top of the policy short-circuit more restrictive rules below. SmartConsole database synchronization between management servers and gateways occasionally fails, creating a gap between the policy the administrator sees and the policy the gateway enforces. CloudGuard posture management for cloud infrastructure frequently drifts from intended baselines without generating alerts.

Recommended validation approach: License audit to verify all purchased blades are activated and current. Configuration audit to verify Threat Prevention profiles are applied to all relevant rules. Policy optimization review. ShadowTap internal testing to validate actual detection rates. CloudGuard posture drift assessment for cloud deployments.
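The license audit reduces to two checks per blade: is the license still current, and is the blade actually switched on? The sketch below shows that logic under stated assumptions. The `BLADES` records, their field names and the dates are invented for illustration and do not reflect any Check Point export format or API.

```python
from datetime import date

# Hypothetical blade records; field names and dates are illustrative only.
BLADES = [
    {"blade": "IPS", "licensed_until": date(2025, 3, 31), "enabled": True},
    {"blade": "Threat Emulation", "licensed_until": date(2026, 1, 31), "enabled": False},
    {"blade": "Anti-Bot", "licensed_until": date(2024, 12, 31), "enabled": True},
]


def audit_blades(blades, as_of):
    """Return (blade, finding) pairs for expired or inactive licenses."""
    findings = []
    for record in blades:
        if record["licensed_until"] < as_of:
            findings.append((record["blade"], "license expired"))
        elif not record["enabled"]:
            findings.append((record["blade"], "licensed but not enabled"))
    return findings


# A fixed audit date keeps the example reproducible.
blade_findings = audit_blades(BLADES, as_of=date(2025, 1, 15))
for blade, finding in blade_findings:
    print(f"{blade}: {finding}")
```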

Read the full Check Point validation page · Get a Check Point validation assessment

Zscaler ZIA and ZPA

What it does well: Zscaler's cloud-native architecture genuinely eliminates the need for traditional VPN in many use cases. ZIA (Zscaler Internet Access) provides consistent web security regardless of user location, which is a real advantage for distributed workforces. ZPA (Zscaler Private Access) provides application-level access without exposing the network, which is a meaningful security improvement over traditional VPN. The zero trust model, when properly implemented, reduces the attack surface substantially. For organizations moving to a cloud-first architecture with a distributed workforce, Zscaler addresses real architectural challenges.

Where it consistently fails in testing: Split tunneling is the dominant finding. Organizations deploy Zscaler to inspect web traffic but exclude specific applications, cloud services or IP ranges from inspection for performance or compatibility reasons. Each exclusion is a bypass path. PAC file configurations on endpoint devices frequently contain errors or legacy entries that route traffic around Zscaler entirely. ZPA application segmentation policies are often too broad, granting access to entire subnets rather than specific applications, which undermines the zero trust model. Client Connector (the endpoint agent) tamper protection is sometimes disabled for troubleshooting and never re-enabled. Organizations that deploy Zscaler alongside legacy VPN infrastructure frequently have both active simultaneously, creating parallel access paths that bypass Zscaler's inspection entirely.

Recommended validation approach: ZTNA bypass testing to identify all paths that circumvent Zscaler inspection. PAC file and Client Connector configuration audit. ZPA application segmentation review. Split tunnel policy analysis. ShadowTap testing from inside the network to validate that east-west traffic is still monitored by complementary network detection tools, since Zscaler does not inspect internal network traffic.
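One way to approach the application segmentation review is to count how many addresses each segment actually exposes. The sketch below uses Python's standard `ipaddress` module. The segment names, CIDR ranges and the `max_hosts` threshold of 8 are assumptions chosen for illustration, not Zscaler defaults or an export of a real ZPA policy.

```python
import ipaddress

# Hypothetical application segments mapped to the CIDR ranges they expose.
SEGMENTS = {
    "hr-portal": ["10.20.5.14/32"],
    "legacy-erp": ["10.30.0.0/16"],
    "build-servers": ["10.40.8.0/29"],
}


def broad_segments(segments, max_hosts=8):
    """Return segments whose ranges cover more than max_hosts addresses."""
    flagged = {}
    for name, cidrs in segments.items():
        size = sum(ipaddress.ip_network(cidr).num_addresses for cidr in cidrs)
        if size > max_hosts:
            flagged[name] = size
    return flagged


flagged = broad_segments(SEGMENTS)
for name, size in flagged.items():
    print(f"{name}: {size} addresses reachable")
```

A segment that resolves to an entire /16 grants network-level reach under an application-level label, which is the pattern described above.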

Read the full Zscaler validation page · Get a Zscaler validation assessment

Darktrace

What it does well: Darktrace's unsupervised machine learning approach to network detection is genuinely different from signature-based alternatives. The Enterprise Immune System builds a behavioral model of every device and user on the network and alerts on deviations from that model. This means Darktrace can detect novel threats that have no signature, which addresses a legitimate capability gap in traditional NIDS platforms. Antigena (autonomous response) can contain threats at network speed without human intervention. For organizations that need network-level visibility without maintaining signature databases, Darktrace provides a distinct approach.

Where it consistently fails in testing: The behavioral model is both Darktrace's strength and its weakness. Low-and-slow lateral movement that stays within the learned behavioral norms of the network does not trigger alerts. If an attacker operates during business hours, uses protocols that are normal for the environment and keeps data transfer volumes within expected ranges, Darktrace's model accepts the activity as normal. Encrypted command-and-control traffic that uses standard HTTPS to legitimate cloud services (a technique we test extensively) is consistently missed because it looks like normal cloud application traffic. DNS tunneling at low bit rates falls below detection thresholds. The training period is also a vulnerability. Attackers who are already present during the initial learning period have their activity incorporated into the baseline as normal behavior. Alert fatigue is a practical issue. Darktrace generates a high volume of informational alerts and organizations without dedicated analysts to tune the system experience significant noise.

Recommended validation approach: ShadowTap deployment combined with manual testing specifically designed to operate within behavioral norms. Encrypted C2 testing. DNS tunneling at various bit rates. Low-and-slow lateral movement over extended time periods. Antigena response validation to confirm autonomous containment functions as configured.
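The arithmetic behind "low bit rates" and "extended time periods" is worth making explicit. The sketch below estimates how long a fixed amount of data takes to leave over DNS at a steady query rate. All three inputs (10 MiB of data, 30 usable bytes per query, 6 queries per minute) are hypothetical figures chosen for illustration, not measured thresholds for any product.

```python
import math


def exfil_days(payload_bytes, bytes_per_query, queries_per_minute):
    """Days needed to move payload_bytes through DNS queries at a fixed rate."""
    queries = math.ceil(payload_bytes / bytes_per_query)
    minutes = queries / queries_per_minute
    return minutes / (60 * 24)


# Hypothetical figures: 10 MiB staged, 30 usable bytes per query, 6 queries per minute.
days = exfil_days(10 * 1024 * 1024, bytes_per_query=30, queries_per_minute=6)
print(f"{days:.1f} days")
```

With these assumed figures the transfer takes about 40 days, which is why validating this class of technique requires tests that run for weeks rather than hours.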

Read the full Darktrace validation page · Get a Darktrace validation assessment

FAQ

Frequently Asked Questions About Vendor Validation

How do I know if my firewall is actually working?

You test it with real attack traffic from an independent party. Vendor dashboards report on what your firewall processes, not what it misses. The only way to measure actual effectiveness is to send exploit payloads, lateral movement traffic and C2 beacons through the firewall and observe what gets blocked, what gets logged and what passes silently. Configuration drift, rule accumulation and expired threat intelligence subscriptions all degrade firewall effectiveness over time. Annual validation is the minimum responsible testing cadence for any production firewall.

Can Darktrace detect all threats?

No single security product detects all threats. Darktrace uses unsupervised machine learning to model normal behavior and alert on deviations. It excels at detecting noisy, anomalous activity that deviates sharply from network norms. It struggles with threats that operate within those norms. Low-and-slow lateral movement, encrypted C2 over standard HTTPS to legitimate cloud services and DNS tunneling at low bit rates are consistently difficult for behavioral models to detect. Darktrace is a valuable layer in a defense-in-depth strategy but it is not a complete detection solution on its own. See our Darktrace testing page for detailed findings.

Should I test my EDR with a penetration test?

A penetration test is not the right tool for EDR validation. Penetration tests measure whether an attacker can achieve specific objectives. EDR validation measures whether the EDR detects specific attack techniques. The goals are different and the methodologies are different. A penetration tester who bypasses your EDR will report the bypass as a finding but will not systematically test 50 MITRE ATT&CK techniques to measure coverage. Vendor validation testing is purpose-built for this. We run controlled technique-by-technique tests and document detection rates across the full kill chain. If you need both, start with a penetration test and follow with vendor validation.

What is vendor security validation?

Vendor security validation is independent testing of whether a deployed security product performs as claimed. It involves running real attack techniques against the product in its actual production configuration and measuring detection rates, prevention rates and alert fidelity. This differs from vendor-provided testing because it uses adversary-realistic methods rather than the vendor's own test signatures. It differs from penetration testing because it focuses on measuring product coverage rather than achieving specific compromise objectives. The output is a detailed report documenting what the product detects, what it misses and what can be improved through configuration changes.

How often should I validate my security tools?

At minimum, once per year. Security tool configurations degrade continuously. Rules get added, exceptions get created, subscriptions lapse and staff turnover means institutional knowledge disappears. Any major change event should also trigger validation: firmware upgrades, infrastructure migrations, acquisitions, major policy changes or staff turnover in the security team. Organizations subject to PCI DSS, PIPEDA or other regulatory frameworks should validate quarterly. The cost of annual validation is a fraction of the cost of the tools being validated and a smaller fraction of the cost of a breach those tools should have prevented.

What is ShadowTap?

ShadowTap is Sherlock Forensics' proprietary internal network testing platform. It is a physical device deployed inside your network that emulates real attacker behavior including reconnaissance, lateral movement, credential harvesting, C2 communication and data staging. Every action maps to documented MITRE ATT&CK techniques. If your NIDS, EDR or SIEM does not alert on ShadowTap activity, a real attacker using the same techniques would operate undetected. ShadowTap starts where external penetration tests end and is the core of our vendor validation methodology. Learn more on the ShadowTap platform page.

Any Vendor

Not listed? We test any security product.

If your vendor is not listed above, contact us. We validate any security product: firewalls, EDR, NDR, SIEM, SOAR, ZTNA, CASB, DLP, email security and cloud security platforms. Same methodology, same pricing, same independence.

Since 2006 · Independent testing · No vendor affiliations

Get Started

Find Out If Your Security Tools Actually Work

We will deploy ShadowTap on your network and give you an honest answer. No vendor partnerships, no upsell, no bias. Just data. Standard validation starts at $5,000 CAD. Comprehensive multi-vendor validation at $12,000 CAD.

Call 888.883.4550

Ready to Validate Your Security Investment?

Tell us what vendor you run and we will scope a validation assessment. Free scoping call, fixed-price quote, testing typically completed within 5-10 business days.