Log Analysis: Do Lead Generation Platforms Work?

Jan 1, 2023

Tony Perez (@perezbox)

How Log Analysis Can Be Used for Marketing Attribution

Trunc is a B2B brand, which means that when it comes to marketing we technically have a few tools that should pay dividends. One category of these tools is lead generation platforms. These solutions are designed to create marketplaces, which in theory should be full of users interested in your solution. But how effective are they?

We recently decided to test one, Capterra. Capterra is a very well-known software marketplace specifically designed for B2B brands. If you do a simple Google search for "best log management tools", they rank #8 in the organic results. They do this for hundreds, if not thousands, of keywords for specific categories. They then offer B2B companies an opportunity to position themselves within their marketplace.

The idea is simple. Subscribe to the program, and they allow you to rank within their marketplace. You don't have to pay to be listed in their marketplace, but if you don't pay, all links to your brand are removed and you're pushed down the stack. Meanwhile, you load their platform with all the keywords that matter, helping them rank for your keywords. So in the end, if you don't pay, you end up competing with your own content on their platform. :)

Platforms like Capterra, G2Crowd, and others have been all the rage amongst B2B players for years. They provided an opportunity to get out of the Google Adspace sinkhole, or so it is believed. But how effective are they? What kind of leads do they generate?

These are the questions this article will explore. We'll do so by performing some log analysis on the requests we track coming from the Capterra platform, both from their marketplace and PPC campaigns. This should not be confused with a scientific study as it's not statistically significant, but it does provide anecdotal data and methodology that you can leverage to analyze your success rates.

The Quality of Capterra Leads

Measuring quality is very difficult, and can be a bit subjective at times. It's made a little harder because, unlike traditional B2B brands, we don't employ a High Velocity Sales Model. In other words, we don't use BDRs, SDRs, AEs, etc., and we don't use forms. Everything we do is self-service: choose a plan and start using it. You could argue that this in and of itself is a bit different and could sway the results. We also have trust issues, which translates to "we don't use a lot of trackers that we don't build ourselves".

But what we lack in trust, we like to think we make up for in creativity.

Naturally, we created an internal tracker to monitor all traffic from Capterra and appended a simple attribute (i.e., ?location=capterra) to each URL referenced in the Capterra marketplace. The tracker was designed to follow the user through the workflow. Success for us was when the user created an account via a "free" trial. Here are a few configurations worth noting:

Category	Description
Budget	$1,000
Time Period	2 Weeks
Geography	United States
Final Destination	https://trunc.org or https://trunc.org/pricing

The attributes below were the foundation of our methodology for differentiating between good and bad traffic. In this instance, Good was traffic we felt was "real" while Bad represented traffic we didn't think was real.

Attribute	Description
User Behavior	Every request made to a website is followed by a series of requests. In almost all instances it calls some images or CSS. If something doesn't have the supporting requests, it's indicative of bot or automated traffic.
Device Used	Analyzing user agents becomes extremely important when trying to do attribution. Are the users using legitimate browsers and devices? How does this compare to the bigger web as a whole?
Location	The campaign was tied to one geography, the US. Seeing traffic from outside of this geography is extremely suspicious.
Device Origin	Are the users coming from real devices? Think a laptop, desktop, or mobile device, or are they coming from cloud servers? How do your normal users behave?
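Before any of these attributes can be applied, the Capterra requests have to be separated from the rest of the log. Here is a minimal sketch of that filter in Python, assuming the landing request carries the `location=capterra` tag described above or a referrer on a `capterra.com` host. The function name `is_capterra_hit` is hypothetical and is not our internal tracker.

```python
from urllib.parse import parse_qs, urlsplit


def is_capterra_hit(request_target: str, referrer: str) -> bool:
    """Return True when a request looks like a landing hit from Capterra.

    request_target is the path plus query string from the access log,
    referrer is the logged Referer header ("-" when it is absent).
    """
    # Campaign tag appended to every URL listed in the marketplace.
    query = parse_qs(urlsplit(request_target).query)
    tagged = "capterra" in query.get("location", [])

    # Referrer host check; hostname is None for values such as "-".
    ref_host = urlsplit(referrer).hostname or ""
    from_capterra = ref_host == "capterra.com" or ref_host.endswith(".capterra.com")

    return tagged or from_capterra
```

Asset requests (CSS, images) will not match this check on their own, since their referrer is the landing page itself. They are tied back to the landing hit by IP.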

All that was left to do was analyze the logs. Side note: we were not able to confirm the IPs with Capterra. We requested their data, but they cited privacy concerns and assured us they have advanced systems in place to parse out junk. They did provide a very interesting response, which is a topic for another conversation about marketing as a whole:

"Thanks for flagging. Most users find our sites from search engines and come from all over the world. We have extensive algorithms in place to protect you from being billed for malicious or fraudulent click activity, which means you're not necessarily charged for all the traffic you receive from Capterra/Gartner Digital Markets, or for all form submissions. As with any PPC channel, the leads you generate will typically fall in to the rule of thirds where 1/3 of your leads will be junk, 1/3 will fall outside your target market, and 1/3 will be quality leads for your sales team to pursue." - The Capterra Team

This article is not about whether we were charged for more or fewer leads than we received. We are also not contesting what Capterra reported in their dashboard vs what we see in our logs. That is why we don't share much of the performance data from their system and focus purely on what we can see on ours. All we want to know about is the lead quality.

Speaking of which, let's see what the data tells us.

Here is a summary of the important bits of information (all derived from the web access logs):

Attribute	Description
Time Period	09/22/2022 - 10/04/2022
# of Unique IPs	121
# of IPs (Not Bot Crawlers)	76
# of IPs (Potentially Good)	16
# of IPs (No Good)	60
# of IPs (Confirmed Good)	6 (7.89%)
# of IPs (Suspect)	2 (2.63%)
# of Conversions	0
Cost / Real Click	$166.67

The table above shares a very interesting story.

First, you will notice there were 121 total unique IPs that arrived with a Capterra referrer. Of those, 76 were potentially interesting traffic (~63%). We immediately dismissed the other ~37% as bot traffic (think Googlebot). This was actually something we didn't think about, but we quickly realized what was happening as we studied the origins. It also supports the response we got back from Capterra about SERPs.

We then focused on the 76 unique IPs and started to dive into their behavior specifically.

Analyzing the 76 IPs, we narrowed it down to 16 potentially good leads. As we analyzed deeper, however, we found that only 6 of those 76 (7.89%) were confirmed good. Two more (2.63%) were suspect but possibly real, which leaves a little room for improvement. We did this by analyzing each request individually and studying different facets of their behavior. We looked at what the user did on the site, where they came from, how the request was manipulated, the device and browser combination, and finally we compared it to what the industry sees.

Based on our analysis, of the non-bot traffic we received 6 confirmed good leads (~7.89%). Capterra reported 51 users in the same time period, so assuming their bad-traffic meter is better than ours (which we sure hope is the case), we're looking at a better rate of ~11%. Mind you, there was 0% success. Not one user created a trial, which we find extremely odd given that the premise of a platform like this is that it attracts prospects with "intent".
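The arithmetic behind these figures can be reproduced directly from the counts in the summary table. This is only a sketch in Python, and it assumes the full $1,000 budget was spent when computing the cost per real click.

```python
# Counts taken from the summary table above.
unique_ips = 121
non_bot_ips = 76
confirmed_good = 6
suspect = 2
capterra_reported_users = 51
budget_usd = 1000  # assumption: the whole budget was spent


def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


non_bot_share = pct(non_bot_ips, unique_ips)                     # 62.81
confirmed_share = pct(confirmed_good, non_bot_ips)               # 7.89
suspect_share = pct(suspect, non_bot_ips)                        # 2.63
rate_vs_reported = pct(confirmed_good, capterra_reported_users)  # 11.76
cost_per_real_click = round(budget_usd / confirmed_good, 2)      # 166.67
```

The 7.89% uses our own non-bot count as the denominator, while the roughly 11% figure uses the 51 users Capterra reported.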

We share more about our approach and thinking below. What else would you add to your analysis?

Analyzing the Capterra Traffic and Leads via Logs

We started our analysis by establishing a baseline for what user behavior should be.

For example, whenever someone first visits a site we should see a series of requests. For us, this is what they look like:


			[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /?location=capterra&utm_source=capterra HTTP/1.1" 200 9738 "https://www.capterra.com/" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

			[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /css/pricing.css HTTP/1.1" 200 7206 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

			[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /css/styles?v202209.css HTTP/1.1" 200 29191 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

			[ip] - - [24/Sep/2022:00:51:12 +0000] "GET /js/scripts.js HTTP/1.1" 200 889 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

			[ip] - - [24/Sep/2022:00:51:13 +0000] "GET /images/dashboard/trunc-category-selection.png HTTP/1.1" 200 29449 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

			[ip] - - [24/Sep/2022:00:51:13 +0000] "GET /images/logos/trunc-logo-wide.png HTTP/1.1" 200 29333 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

You can achieve something similar by making a request to your site from different browsers. You will find that almost every request has a very similar footprint.

This becomes our good behavior baseline.

A browser by design will call a series of assets when it's loading a site. If we don't see this, then we know something is off: it's probably not a real browser making the call (but that's not always the case either, as we'll expand on later).
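The baseline check can be sketched in a few lines of Python. This assumes log lines shaped like the samples above (combined log format) and treats anything under `/css/`, `/js/`, or `/images/` as a supporting asset, which matches the paths in our samples. The helper names are hypothetical.

```python
import re
from collections import defaultdict

# Combined log format, as in the samples above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: supporting assets live under these prefixes.
ASSET_RE = re.compile(r"^/(css|js|images)/")


def parse_line(line: str):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def has_supporting_requests(lines):
    """Map each IP that requested a page to True if it also fetched assets."""
    pages = defaultdict(int)
    assets = defaultdict(int)
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue  # skip lines that are not in the expected format
        path = hit["target"].split("?", 1)[0]
        if ASSET_RE.match(path):
            assets[hit["ip"]] += 1
        else:
            pages[hit["ip"]] += 1
    return {ip: assets[ip] > 0 for ip in pages}
```

An IP that maps to False requested a page but none of the assets a browser would normally load, which makes it a candidate for closer inspection rather than proof of a bot.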

Bot (or automated) traffic will look something like this:


			[ip] - - [04/Oct/2022:01:56:43 +0000] "GET /css/styles?v202200926b.css HTTP/1.1" 304 5066 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"

			[ip] - - [04/Oct/2022:01:56:43 +0000] "GET /css/pricing.css?v202200926b HTTP/1.1" 200 7218 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"

			[ip] - - [04/Oct/2022:01:56:44 +0000] "GET /js/bootstrap.bundle.min.js HTTP/1.1" 200 23493 "https://trunc.org/?location=capterra&utm_source=capterra" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"

In this instance, the user is not acting the way it should. If you recall, Capterra should only be linking to the home page "/" or "/pricing". In the example above, the IP is only calling assets; there is no request for either page.

That is a pretty strong indicator that something is not right.
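The same idea works in reverse: flag any IP that pulled assets without ever requesting one of the two landing pages. Below is a small Python sketch that takes already-parsed `(ip, request_target)` pairs. The function name `asset_only_ips` is hypothetical.

```python
# The two destinations Capterra was configured to link to.
LANDING_PATHS = {"/", "/pricing"}


def asset_only_ips(hits):
    """Return the IPs that never requested a landing page.

    hits is an iterable of (ip, request_target) pairs, where
    request_target is the path plus optional query string.
    """
    seen = set()
    landed = set()
    for ip, target in hits:
        seen.add(ip)
        path = target.split("?", 1)[0]  # drop the query string
        if path in LANDING_PATHS:
            landed.add(ip)
    return seen - landed
```

A cached page or a log window that cuts off the first request could produce the same pattern, so treat a hit here as a strong hint rather than a verdict.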

We used that as the initial base to weed out some noise. From there, we shifted our focus to a number of different things. What did they do on the site? Where were they coming from? What kind of browser were they using? How'd that compare to what the rest of the web was doing?

Here is what we found:

Attribute	Share	Notes
Not Real Browsers	14.47%	Using things like HeadlessChrome or Android WebView.
Chrome Browser	61.8%	Many out-of-date Chrome instances.
Out of Date Chrome	23.68%	Versions went as far back as Chrome 83.
Android	68%	Android leads the way.
Windows	17%	Windows is in second place; everything else is negligible.

Let's take a minute to digest the table above.

Something surprising to us was how big the percentage was for non-browsers, things like HeadlessChrome or Android WebView. These are development tools that allow you to pull websites from your terminal, app, or command prompt, similar to cURL. It is highly unlikely that a real user is using one of these and also going to Capterra to search for a product.


			Mozilla/5.0 (Linux; Android 11; SM-A032F Build/RP1A.201005.001; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/87.0.4280.141 Mobile Safari/537.36
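Both signatures are easy to spot in the user agent string. Here is a minimal Python sketch, assuming the `HeadlessChrome` product token and the `wv` marker that Android WebView places inside the platform parentheses (visible in the example above). The function name `is_non_browser` is hypothetical, and user agents are trivially spoofed, so this only catches clients that do not bother to hide.

```python
import re

# Android WebView adds a "wv" field inside the platform parentheses.
WEBVIEW_RE = re.compile(r"\(Linux; [^)]*; wv\)")


def is_non_browser(user_agent: str) -> bool:
    """Return True for HeadlessChrome or Android WebView user agents."""
    return "HeadlessChrome" in user_agent or bool(WEBVIEW_RE.search(user_agent))
```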

It is also very peculiar to see the out-of-date browser information. Seeing Chrome lead the pack at ~62% is not a surprise, but seeing ~24% of traffic on out-of-date Chrome is. This is usually a strong indicator of a scraper with a modified user agent in its request. Here is a great example: an Android 10 device (a relatively new OS version) using an extremely old browser version:

Mozilla/5.0 (Linux; Android 10; M2006C3LII) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.101 Mobile Safari/537.36

It doesn't pass the sniff test.
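Checking the Chrome version is a matter of pulling the major number out of the user agent and comparing it against a cutoff. In the Python sketch below, the cutoff `min_major` is a parameter you choose yourself. The value 100 used in the usage note is only an illustration, not the threshold we applied.

```python
import re

CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def chrome_major(user_agent: str):
    """Return the Chrome major version as an int, or None if absent."""
    match = CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def is_outdated_chrome(user_agent: str, min_major: int) -> bool:
    """Return True when the UA reports a Chrome major version below min_major."""
    major = chrome_major(user_agent)
    return major is not None and major < min_major
```

For the Android 10 user agent above, `chrome_major` returns 87, so `is_outdated_chrome(ua, 100)` flags it.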

We then spent some time looking at the networks the users were coming from:

Attribute	Share	Notes
Cogent	15.79%	Cloud Server Provider
GLOBALTELEHOST Corp.	11.84%	Cloud Server Provider
Oracle Corporation	13.16%	Cloud Server Provider
Amazon	6.58%	Cloud Server Provider
Real ISP	19.74%	Service providers like T-Mobile, Comcast, AT&T, Charter, etc.
US Traffic	90.79%	Traffic originating from the US
Non-US Traffic	9.21%	Traffic originating from outside the US (e.g., China, Indonesia, India)

A user's origin is extremely interesting. For instance, of the traffic we saw, roughly 80% came from networks other than a real Internet Service Provider (ISP), led by some top cloud service providers. Only about 20% came from what we'd consider to be a real ISP. Granted, some might be related to a VPN service provider as well. But what was curious is that the disparity between cloud providers and ISPs was massive compared to what we see across our dozen different brands.

Coincidentally, all the outside traffic (9.21%) was bad traffic, and it also didn't align with our geography thresholds (remember, we were limiting to the US only).
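Labeling an IP as cloud or ISP comes down to looking up which network owns it. The Python sketch below uses the standard `ipaddress` module with a hand-made lookup table. The CIDR ranges are reserved documentation ranges and the owner names are placeholders, so in practice you would replace the table with data from a WHOIS or ASN source of your choice.

```python
import ipaddress

# Hypothetical lookup table: documentation ranges and placeholder owners.
NETWORK_LABELS = {
    "198.51.100.0/24": ("ExampleCloud", "cloud"),
    "203.0.113.0/24": ("ExampleISP", "isp"),
}


def classify_origin(ip: str, labels=NETWORK_LABELS):
    """Return (owner, kind) for the first network containing ip."""
    addr = ipaddress.ip_address(ip)
    for cidr, (owner, kind) in labels.items():
        if addr in ipaddress.ip_network(cidr):
            return owner, kind
    return "unknown", "unknown"
```

Counting the `kind` values across the non-bot IPs gives the kind of cloud vs ISP split shown in the table above.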

In the end, after parsing and analyzing the data, the IPs we felt confident in came down to 6. They came from real ISPs (e.g., AT&T, Comcast, T-Mobile), with one from Cloudflare (possibly their WARP solution). The two suspect IPs came from Cogent and CrownCloud, which we believe to be tied to a VPN provider.

Are the Capterra Leads Worth the Investment?

In our case we saw either an 8% or 11% rate of good leads and 0% conversion, a far cry from the 33% we should have expected based on the Capterra team's response. So for a self-service B2B brand at our price point, it might not make the most sense.

Yes, there are a number of things that could have contributed to the 0% conversion (e.g., landing pages, content, etc.), and there is also the element of time (will they convert later?), which is why we don't focus on conversion as much as the quality of leads.

Will a lead generation platform create leads? Yes, we believe it will. But if you're an executive, be extra careful when looking at the raw data. Take this scenario as an example: you can't look at your dashboard and assume you received 51 leads and that your sales team should convert 20% of them, when in reality only ~11% of those leads are probably good, and 20% of that would have been 1. With these stats, you'd need to be spending thousands a month to improve your odds; anything less and you're just burning cash.
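To make the gap concrete, here is the same comparison as a few lines of Python. The 20% close rate is the illustrative figure from the paragraph above, not a measured value.

```python
reported_leads = 51        # users shown in the Capterra dashboard
confirmed_good = 6         # IPs we confirmed as real in our logs
assumed_close_rate = 0.20  # illustrative sales conversion rate

# Conversions you would plan for from the dashboard number alone.
expected_from_dashboard = round(reported_leads * assumed_close_rate, 1)  # 10.2

# Conversions the log analysis suggests you should plan for.
expected_from_logs = round(confirmed_good * assumed_close_rate, 1)       # 1.2
```

Planning around roughly ten conversions when the logs support about one is the kind of mismatch the raw dashboard hides.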

Where this could work very well is when your product or industry is extremely expensive. In that case, the CAC and success rate might work extremely well because you only really need one or two big clients to actually close. If nothing else, we hope this article helps shed light on how you can use your logs to help with marketing attribution.
