On March 8th, The Wall Street Journal reported on discrepancies in Gannett’s digital advertising ecosystem that misrepresented where ads were placed over the course of nine months. Based on the limited information we have at this point, this appears to be a misconfiguration that led to a misrepresentation of domains from a single authorized publisher, with limited monetary impact. It does not resemble the malicious domain spoofing we typically see from bad actors.
While the time to discovery is concerning for all of us and this is an important learning moment for the industry, we can also reflect on how far we have come. Five years ago we were dealing with domain spoofing operations that spanned millions of domains across tens of thousands of suppliers, contributing to multibillion-dollar losses. Ads.txt did important work. Now we are talking about domain misrepresentation on a limited set of domains from an otherwise reputable publisher. Part of the game in anti-fraud is shrinking the attack surface to minimize the opportunity for fraud. Although there is still work to do, we have moved positively in this direction, and that should not be overlooked.
So what does that mean for the programmatic ecosystem? Further collaboration on standards will be important to address this. We’ve been successful in this area as an industry. Ads.txt is a perfect example: it had a meaningful impact. Ads.cert will be important to cement that. We will need to continue to work to define common standards on how domains are represented and how they become transparent.
Domain Spoofing vs Domain Misrepresentation
There are a lot of different ways fraudsters can attempt to defraud advertisers. Domain spoofing is a prevalent fraud model that can come in a couple different flavors (all of which taste bad):
- Bad actors attempt to monetize traffic on domains they don’t own. Ads.txt addresses this, since a Publisher ID has to be authorized to issue a bid request for a given domain. Reputable publishers are usually the victims here, because their premium inventory is what fraudsters want to make money from.
- Bots use fake browsers that declare arbitrary domains as their destination. We saw that a lot with Methbot (spoofed 6,000 premium domains) and 3ve (spoofed 10,000 domains). In those operations, the pages were empty and only contained ads (commonly known as cashout sites), while the declared domain belonged to a legitimate publisher. Publishers can either be the victims or the bad actors trying to inflate their inventory. In the cases of Methbot and 3ve, publishers were the victims.
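To make the ads.txt check in the first case concrete, here is a minimal sketch of how a buyer might test whether a seller is authorized for a declared domain. It assumes the publisher's ads.txt file has already been fetched as text; the names `AdsTxtRecord`, `parse_ads_txt`, and `is_authorized`, along with the sample domains and IDs, are hypothetical and used only for illustration. This is not how any particular vendor implements the check.

```python
from typing import NamedTuple, Set


class AdsTxtRecord(NamedTuple):
    ad_system: str     # domain of the exchange or SSP
    seller_id: str     # the publisher's account ID within that ad system
    relationship: str  # "DIRECT" or "RESELLER"


def parse_ads_txt(text: str) -> Set[AdsTxtRecord]:
    """Parse the data records of an ads.txt file, skipping comments and variables."""
    records = set()
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        # Variable lines such as "contact=..." are not seller records.
        if "=" in fields[0] or len(fields) < 3:
            continue
        records.add(AdsTxtRecord(fields[0].lower(), fields[1], fields[2].upper()))
    return records


def is_authorized(records: Set[AdsTxtRecord], ad_system: str, seller_id: str) -> bool:
    """True if the (ad system, seller ID) pair appears in the publisher's file."""
    return any(
        record.ad_system == ad_system.lower() and record.seller_id == seller_id
        for record in records
    )


# Hypothetical ads.txt content for an illustrative publisher domain.
SAMPLE_ADS_TXT = """# ads.txt for premium-news.example
exchange.example, pub-1234, DIRECT, abc123
reseller.example, 9876, RESELLER
contact=adops@premium-news.example
"""

sample_records = parse_ads_txt(SAMPLE_ADS_TXT)
print(is_authorized(sample_records, "exchange.example", "pub-1234"))
print(is_authorized(sample_records, "exchange.example", "pub-9999"))
```

A check of this kind only answers whether the seller is authorized for the domain declared in the bid request. It says nothing about where the ad actually rendered.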
What happened in this particular case is publisher misconfiguration resulting in domain misrepresentation - a subset of domain spoofing, but characteristically not fraud. If this were fraudulent domain spoofing, the ads would have landed on lower value sites even though the advertiser paid for a higher value site, and the fraudster would have generated more revenue as a result. Instead, the pattern wasn’t consistent. It actually worked both ways. In some cases, ads that were supposed to run on lower value sites were actually landing on USA Today. Not a bad trade, but it was not what either the advertiser or the publisher wanted. The reality is that Gannett could have actually lost revenue from the pattern of misrepresented domains. If this were supposed to be some kind of fraud operation, it was a really bad one. At HUMAN we are constantly saying this: Fraud follows the money. If the fraudster isn’t making money, it’s probably not fraud. It’s someone who made a mistake. After all, we are all human.
The most important question here is whether this is still considered invalid traffic even though it’s not fraudulent or malicious. The short and conservative answer is yes, this is still invalid by definition.
Given the information we have available, this is a case of domain misrepresentation caused by non-malicious misconfiguration. While our technology tracks mismatched domains, we don’t typically flag them as fraudulent without other indicators. There are plenty of legitimate reasons for mismatched domains to happen. Content syndication is a common example. Security protocols in an iFrame are another typical factor, since they can limit visibility into such mismatches. That said, the obvious gap and, more importantly, the time to discovery here are concerning for all of us. It’s time to take a deeper look at what it means to be a reputable publisher and how our standards need to evolve to meet the industry where it is. To this end, HUMAN is exploring a set of transparency metrics that make visible all scenarios where the domain in the bid stream does not match the domain in the served ad, so that the ecosystem can validate legitimate vs illegitimate use cases.
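As a rough illustration of what a metric like this could look like, the sketch below compares the domain declared in the bid request with the domain observed where the ad rendered, and reports how often they disagree and in which direction. It is an assumption-laden example, not HUMAN's implementation: the `Impression` record, its field names, the `normalize` rule, and the sample domains are all hypothetical. Impressions where the rendered domain is not visible (for example, inside a restricted iFrame) are counted separately rather than treated as mismatches.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class Impression(NamedTuple):
    declared_domain: str            # domain carried in the bid request
    observed_domain: Optional[str]  # domain where the ad rendered; None if not visible


def normalize(domain: str) -> str:
    """Lowercase the domain and drop a leading 'www.' so trivial variants compare equal."""
    cleaned = domain.strip().lower().rstrip(".")
    return cleaned[4:] if cleaned.startswith("www.") else cleaned


def mismatch_report(impressions: Iterable[Impression]) -> dict:
    """Count matches, mismatches, and unmeasurable impressions, with per-pair detail."""
    counts = Counter()
    pairs = Counter()
    for impression in impressions:
        if impression.observed_domain is None:
            counts["unknown"] += 1
            continue
        declared = normalize(impression.declared_domain)
        observed = normalize(impression.observed_domain)
        if declared == observed:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
            pairs[(declared, observed)] += 1  # direction matters: declared -> observed
    measurable = counts["match"] + counts["mismatch"]
    rate = counts["mismatch"] / measurable if measurable else 0.0
    return {
        "counts": dict(counts),
        "mismatch_rate": rate,
        "top_pairs": pairs.most_common(5),
    }


# Hypothetical impressions for illustration only.
sample_impressions = [
    Impression("www.premium-news.example", "premium-news.example"),
    Impression("premium-news.example", "local-news.example"),
    Impression("local-news.example", "premium-news.example"),
    Impression("premium-news.example", None),
]
report = mismatch_report(sample_impressions)
print(report["counts"], report["top_pairs"])
```

Keeping the declared and observed pair, rather than a single mismatch flag, is what lets a reviewer see whether mismatches run consistently toward lower value sites or in both directions. A mismatch on its own is not evidence of fraud, so a report like this is a starting point for review, not a verdict.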
Visibility and standards help create a supply chain with integrity. We created the Human Collective, alongside agencies, brands, DSPs, SSPs, and other ad tech vendors, to collaborate on larger industry issues around the lack of standards and regulations. With this in mind, we recently launched a working group dedicated to improving publisher transparency. We are focusing on the challenges that players in the digital ecosystem face outside of traditional verification. We are using this group to bring people together to create a framework for cooperative publisher transparency, such as biddable scoring or other tools that help with industry challenges like this one. This collaborative effort brings awareness to issues that don’t always land in our purview but still detract from confidence in the ecosystem. The Human Collective is the vehicle to help propel the industry forward.
While the visibility gap and time to discovery in this scenario are concerning, the industry has made significant progress in recent years, and in this case, we're not faced with a major fraud operation. Yes, we have work to do. But the good news is we've overcome bigger challenges.