Does Capterra Work for Lead Generation?

How Log Analysis Can Be Used for Marketing Attribution

Trunc is a B2B brand, which means that when it comes to marketing we technically have a few tools that should pay dividends. One category of these tools is lead generation platforms. These solutions are designed to create marketplaces, which in theory should be ripe with users interested in your solution. But how effective are they?

We recently decided to test one: Capterra. Capterra is a very well-known software marketplace designed specifically for B2B brands. If you do a simple Google search for "best log management tools", they rank #8 in the organic results. They do this for hundreds, if not thousands, of keywords in specific categories. They then offer B2B companies an opportunity to position themselves within their marketplace.

The idea is simple. Subscribe to the program, and they allow you to rank within their marketplace. You don't have to pay to be listed, but if you don't pay, all links to your brand are removed and you're pushed down the stack. Meanwhile, you load their platform with all the keywords that matter, helping them rank for your keywords. So in the end, if you don't pay, you end up competing with your own content on their platform. :)

Platforms like Capterra, G2Crowd, and others have been all the rage amongst B2B players for years. They provide an opportunity to get out of the Google Adspace sinkhole, or so it is believed. But how effective are they? What kind of leads do they generate?

These are the questions this article will explore. We'll do so by performing some log analysis on the requests we track coming from the Capterra platform, both from their marketplace and PPC campaigns. This should not be confused with a scientific study as it's not statistically significant, but it does provide anecdotal data and methodology that you can leverage to analyze your success rates.

The Quality of Capterra Leads

Measuring quality is very difficult and can be a bit subjective at times. It's made a little harder because, unlike traditional B2B brands, we don't employ a High Velocity Sales Model. In other words, we don't use BDRs, SDRs, AEs, etc., and we don't use forms. Everything we do is self-service: choose a plan and start using it. You could argue that this, in and of itself, is a bit different and could sway the results. We also have trust issues, which translates to "we don't use a lot of trackers that we don't build ourselves".

But what we lack in trust, we like to think we make up for in creativity.

Naturally, we created an internal tracker to monitor all traffic from Capterra and appended a simple attribute (i.e., ?location=capterra) to each URL referenced in the Capterra marketplace. The tracker was designed to follow the user through the workflow. Success for us was when the user created an account via a "free" trial. Here are a couple of noteworthy configurations:

| Category | Description |
| --- | --- |
| Budget | $1,000 |
| Time Period | 2 Weeks |
| Geography | United States |
| Final Destination | or |

This is what we used as the foundation of our methodology for differentiating between good and bad traffic. In this instance, Good was traffic we felt was "real" while Bad represented traffic we didn't think was real.

| Attribute | Description |
| --- | --- |
| User Behavior | Every request made to a website is followed by a series of requests. In almost all instances it calls some images or CSS. If something doesn't have the supporting requests, it's indicative of bot or automated traffic. |
| Device Used | Analyzing user agents becomes extremely important when trying to do attribution. Are the users using legitimate browsers and devices? How does this compare to the web as a whole? |
| Location | The campaign was tied to one geography, the US. Seeing traffic from outside of this geography is extremely suspicious. |
| Device Origin | Are the users coming from real devices (think a laptop, desktop, or mobile device), or are they coming from cloud servers? How do your normal users behave? |
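Before any of these checks can run, the Capterra-tagged requests have to be separated from everything else in the logs. Below is a minimal sketch of that filter, assuming the tag is the ?location=capterra query attribute described earlier. The function name and its parameters are ours for illustration and are not part of any tracker.

```python
from urllib.parse import parse_qs, urlsplit


def is_capterra_tagged(request_target: str,
                       param: str = "location",
                       value: str = "capterra") -> bool:
    """Return True when the request target carries the campaign attribute.

    `request_target` is the path-plus-query portion of a log line,
    e.g. "/?location=capterra&utm_source=capterra".
    """
    query = urlsplit(request_target).query
    # parse_qs maps each parameter name to a list of its values.
    return value in parse_qs(query).get(param, [])


landing = is_capterra_tagged("/?location=capterra&utm_source=capterra")
asset = is_capterra_tagged("/css/pricing.css")
```

Only the landing request carries the tag, so the follow-up asset requests have to be tied back to it by IP and timestamp rather than by the query string.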

All that was left to do was analyze the logs. Side note: we were not able to confirm the IPs with Capterra. We requested their data, but they cited privacy concerns while assuring us they have advanced systems in place to parse out junk. They did provide a very interesting response, which is a topic for another conversation about marketing as a whole:

"Thanks for flagging. Most users find our sites from search engines and come from all over the world. We have extensive algorithms in place to protect you from being billed for malicious or fraudulent click activity, which means you're not necessarily charged for all the traffic you receive from Capterra/Gartner Digital Markets, or for all form submissions. As with any PPC channel, the leads you generate will typically fall in to the rule of thirds where 1/3 of your leads will be junk, 1/3 will fall outside your target market, and 1/3 will be quality leads for your sales team to pursue." - The Capterra Team

This article is not about whether we were overcharged or undercharged for leads. We are also not contesting what Capterra reported in their dashboard vs. what we see in our logs. That is why we don't share much of the performance data from their system and focus purely on what we can see in ours. All we want to know about is the lead quality.

Speaking of which, let's see what the data tells us.

Here is a summary of the important bits of information (all derived from the web access logs):

| Attribute | Description |
| --- | --- |
| Time Period | 09/22/2022 - 10/04/2022 |
| # of Unique IPs | 121 |
| # of IPs (Not Bot Crawlers) | 76 |
| # of IPs (Potentially Good) | 16 |
| # of IPs (No Good) | 60 |
| # of IPs (Confirmed Good) | 6 (7.89%) |
| # of IPs (Suspect) | 2 (2.63%) |
| # of Conversions | 0 |
| Cost / Real Click | $166.67 |

The table above shares a very interesting story.

First, you will notice there were 121 total unique IPs that came in referencing Capterra; 76 of them were potentially interesting traffic (~63%). We immediately dismissed the other ~37% as bot traffic (think Googlebot). This was actually something we hadn't thought about, but we quickly realized what was happening as we studied the origins. It also supports the response we got back from Capterra about SERPs.

We then focused on the 76 unique IPs and started to dive into their behavior specifically.

Analyzing the 76 IPs, we narrowed it down to 16 potentially good leads. As we analyzed deeper, however, we found that only 6 of those 76 (7.89%) were confirmed good, with two more suspect sources (2.63%) that might also be real. We did this by analyzing each request individually and studying different facets of their behavior. We looked at what the user did on the site, where they came from, how the request was manipulated, the device and browser combination, and finally we compared it to what the industry sees.

Based on our analysis, we found that of the non-bot traffic, we received 6 confirmed good leads (~7.89%). Capterra reported 51 users in the same time period, so assuming their bad-traffic meter is better than ours (which we sure hope is the case), we're looking at a better rate of ~11%. Mind you, there was 0% success. Not one user created a trial, which we find extremely odd given that the premise of a platform like this is that it attracts prospects with "intent".
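The rates and the cost per real click can be reproduced directly from the summary table. Here is a minimal sketch of that arithmetic. The constant and function names are ours for illustration, and it assumes the full $1,000 budget was spent.

```python
# Figures taken from the summary table; names are illustrative only.
BUDGET_USD = 1000.00          # assumes the whole budget was spent
UNIQUE_IPS = 121
NON_BOT_IPS = 76
CONFIRMED_GOOD = 6
SUSPECT = 2
CAPTERRA_REPORTED_USERS = 51


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of unique IPs that were not crawlers.
non_bot_share = pct(NON_BOT_IPS, UNIQUE_IPS)
# Confirmed-good and suspect rates measured against our own non-bot count.
good_rate_ours = pct(CONFIRMED_GOOD, NON_BOT_IPS)
suspect_rate = pct(SUSPECT, NON_BOT_IPS)
# Same six leads measured against the user count Capterra reported.
good_rate_capterra = pct(CONFIRMED_GOOD, CAPTERRA_REPORTED_USERS)
# Budget divided by the clicks we believe were real.
cost_per_real_click = round(BUDGET_USD / CONFIRMED_GOOD, 2)
```

Swapping in your own counts gives a quick sanity check on any dashboard number before it reaches a board deck.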

We share more about our approach and thinking below. What else would you add to your analysis?

Analyzing the Capterra Traffic and Leads via Logs

We started our analysis by establishing a baseline for what user behavior should be.

For example, whenever someone first visits a site we should see a series of requests. For us, this is what they look like:

[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /?location=capterra&utm_source=capterra HTTP/1.1" 200 9738 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /css/pricing.css HTTP/1.1" 200 7206 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
[ip] - - [24/Sep/2022:00:51:11 +0000] "GET /css/styles?v202209.css HTTP/1.1" 200 29191 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
[ip] - - [24/Sep/2022:00:51:12 +0000] "GET /js/scripts.js HTTP/1.1" 200 889 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
[ip] - - [24/Sep/2022:00:51:13 +0000] "GET /images/dashboard/trunc-category-selection.png HTTP/1.1" 200 29449 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
[ip] - - [24/Sep/2022:00:51:13 +0000] "GET /images/logos/trunc-logo-wide.png HTTP/1.1" 200 29333 "" "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

You can achieve something similar by making a request to your site from different browsers. You will find that almost every request has a very similar footprint.

This becomes our good behavior baseline.

A browser by design will call a series of assets when it's loading a site. If we don't see this, then we know something is off: it's probably not a real browser making the call (but that's not always the case either, as we'll expand on later).

Bot, or automated, traffic will look something like this:

[ip] - - [04/Oct/2022:01:56:43 +0000] "GET /css/styles?v202200926b.css HTTP/1.1" 304 5066 "" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"
[ip] - - [04/Oct/2022:01:56:43 +0000] "GET /css/pricing.css?v202200926b HTTP/1.1" 200 7218 "" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"
[ip] - - [04/Oct/2022:01:56:44 +0000] "GET /js/bootstrap.bundle.min.js HTTP/1.1" 200 23493 "" "Mozilla/5.0 (Linux; Android 12; V2118) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Mobile Safari/537.36"

In this instance, the user is not acting the way it should. If you recall, Capterra should only be linking to the home page "/" or "/pricing". In the example above, the IP is only calling assets; there are no other requests.

That is a pretty strong indicator that something is not right.
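The baseline check can be automated. The sketch below is one possible way to do it, not the exact tooling we used: it parses access-log lines in the layout shown above, groups them by IP, and labels each IP by whether it requested a page, assets, or both. The regular expression, the asset prefixes, the labels, and the sample IPs (reserved documentation addresses) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches the log layout shown above:
# ip - - [time] "METHOD target PROTO" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Assumed asset locations, based on the sample requests in this article.
ASSET_PREFIXES = ("/css/", "/js/", "/images/")


def classify_ips(lines):
    """Label each IP as 'browser-like', 'assets-only', or 'page-only'."""
    seen = defaultdict(lambda: {"pages": 0, "assets": 0})
    for line in lines:
        match = LOG_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the expected layout
        is_asset = match["target"].startswith(ASSET_PREFIXES)
        seen[match["ip"]]["assets" if is_asset else "pages"] += 1

    verdicts = {}
    for ip, counts in seen.items():
        if counts["pages"] and counts["assets"]:
            verdicts[ip] = "browser-like"   # page load followed by its assets
        elif counts["assets"]:
            verdicts[ip] = "assets-only"    # assets with no page request
        else:
            verdicts[ip] = "page-only"      # page fetched, nothing rendered
    return verdicts


UA = "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"
SAMPLE_LINES = [
    f'192.0.2.10 - - [24/Sep/2022:00:51:11 +0000] "GET /?location=capterra HTTP/1.1" 200 9738 "" "{UA}"',
    f'192.0.2.10 - - [24/Sep/2022:00:51:11 +0000] "GET /css/pricing.css HTTP/1.1" 200 7206 "" "{UA}"',
    f'198.51.100.7 - - [04/Oct/2022:01:56:44 +0000] "GET /js/bootstrap.bundle.min.js HTTP/1.1" 200 23493 "" "{UA}"',
]
verdicts = classify_ips(SAMPLE_LINES)
```

A label here is only a first-pass signal. Cached assets or blocked scripts can make a real visitor look thin, so anything that is not "browser-like" still deserves a manual look.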

We used that as the initial base to weed out some noise. From there, we shifted our focus to a number of different things. What did they do on the site? Where were they coming from? What kind of browser were they using? How'd that compare to what the rest of the web was doing?

Here is what we found:

| Attribute | Share | Notes |
| --- | --- | --- |
| Not Real Browsers | 14.47% | Using things like HeadlessChrome or Android WebView. |
| Chrome Browser | 61.8% | A lot of out-of-date Chrome instances. |
| Out of Date Chrome | 23.68% | Had Chrome instances going down to Chrome 83. |
| Android | 68% | Android leads the way. |
| Windows | 17% | Windows is second place; everything else is negligible. |

Let's take a minute to digest the table above.

Something surprising to us was how big the percentage was for non-browsers, things like HeadlessChrome or Android WebView. These are development tools that allow you to pull websites programmatically, similar to using curl from your terminal or command prompt, or from within an app. It is highly unlikely that a real user is using one of these while also going to Capterra to search for a product.

Mozilla/5.0 (Linux; Android 11; SM-A032F Build/RP1A.201005.001; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/87.0.4280.141 Mobile Safari/537.36

It is also very peculiar to see the out-of-date browser information. Seeing Chrome lead the pack at roughly 62% is not a surprise, but seeing nearly 24% running out-of-date Chrome is. This is usually a strong indicator of a scraper with a modified user agent in its request. Here is a great example: an Android 10 device (a relatively new OS version) using an extremely old browser configuration:

Mozilla/5.0 (Linux; Android 10; M2006C3LII) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.101 Mobile Safari/537.36

It doesn't pass the sniff test.
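These user-agent checks are easy to script. The following is a rough sketch, assuming three simple string heuristics: the "HeadlessChrome" token, the "; wv)" marker that appears in the WebView user agent above, and a Chrome major version below a cutoff. The cutoff of 100 is an arbitrary assumption for illustration, not the threshold we applied, and the function name is hypothetical.

```python
import re

# Captures the major version from "Chrome/87.0.4280.101" style tokens.
CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def ua_flags(user_agent: str, min_chrome_major: int = 100) -> list:
    """Return a list of reasons a user agent looks suspicious (may be empty)."""
    flags = []
    if "HeadlessChrome" in user_agent:
        flags.append("headless")
    if "; wv)" in user_agent:
        flags.append("webview")
    match = CHROME_RE.search(user_agent)
    # min_chrome_major is an assumed cutoff; tune it to the date of your logs.
    if match and int(match.group(1)) < min_chrome_major:
        flags.append("outdated-chrome")
    return flags


WEBVIEW_UA = (
    "Mozilla/5.0 (Linux; Android 11; SM-A032F Build/RP1A.201005.001; wv) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 "
    "Chrome/87.0.4280.141 Mobile Safari/537.36"
)
OLD_CHROME_UA = (
    "Mozilla/5.0 (Linux; Android 10; M2006C3LII) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.101 Mobile Safari/537.36"
)
FIREFOX_UA = "Mozilla/5.0 (Android 11; Mobile; rv:104.0) Gecko/104.0 Firefox/104.0"

webview_flags = ua_flags(WEBVIEW_UA)
old_chrome_flags = ua_flags(OLD_CHROME_UA)
firefox_flags = ua_flags(FIREFOX_UA)
```

User agents are trivially spoofed, so a clean result proves little. The value is in the other direction: a flagged user agent is one more reason to doubt a click.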

We then spent some time looking at systems the users were using:

| Attribute | Share | Notes |
| --- | --- | --- |
| Cogent | 15.79% | Cloud Server Provider |
| GLOBALTELEHOST Corp. | 11.84% | Cloud Server Provider |
| Oracle Corporation | 13.16% | Cloud Server Provider |
| Amazon | 6.58% | Cloud Server Provider |
| Real ISP | 19.74% | Service providers like T-Mobile, Comcast, AT&T, Charter, etc. |
| US Traffic | 90.79% | Traffic originating from the US |
| Non-US Traffic | 9.21% | Traffic originating from outside the US (e.g., China, Indonesia, India) |

A user's origin is extremely interesting. For instance, of the traffic we saw, roughly 80% came from some top cloud service providers. Only about 20% came from what we'd consider to be a real Internet Service Provider (ISP). Granted, some might be related to a VPN service provider as well. But what was curious is that the disparity between cloud providers and ISPs was massive compared to what we see across our dozen different brands.

Coincidentally, all the outside traffic (9.21%) was bad traffic, and it also didn't align with our geography thresholds (remember, we were limiting to the US only).
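Origin checks like these come down to mapping each IP to the network that owns it and then counting. The sketch below shows the shape of that step using only Python's standard library. The network table is entirely hypothetical (it uses reserved documentation ranges, not real provider allocations); in practice you would populate it from an ASN or GeoIP data source of your choice.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table. These are reserved documentation ranges,
# NOT real allocations for any cloud provider or ISP.
NETWORKS = {
    "cloud": [ipaddress.ip_network("192.0.2.0/24")],
    "isp": [ipaddress.ip_network("198.51.100.0/24")],
}


def origin_of(ip: str) -> str:
    """Return the label of the first network containing the IP, else 'unknown'."""
    address = ipaddress.ip_address(ip)
    for label, networks in NETWORKS.items():
        if any(address in network for network in networks):
            return label
    return "unknown"


def origin_shares(ips) -> dict:
    """Return the percentage of IPs per origin label, rounded to two decimals."""
    counts = Counter(origin_of(ip) for ip in ips)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


shares = origin_shares(["192.0.2.5", "192.0.2.6", "198.51.100.1", "203.0.113.9"])
```

Comparing the resulting shares against the same breakdown for your organic traffic is what makes a skew toward cloud providers stand out.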

In the end, after parsing and analyzing the data, the IPs we felt confident in came down to 6. They came from real ISPs (e.g., AT&T, Comcast, T-Mobile), plus one from Cloudflare (possibly their WARP solution). The two suspect IPs came from Cogent and CrownCloud, which we believe to be tied to a VPN provider.

Are the Capterra Leads Worth the Investment?

In our case, we saw either an 8% or 11% rate of good leads and 0% conversion. It's a far cry from the 33% we should have expected based on the team's response. So for a self-service B2B brand at our price point, it might not make the most sense.

Yes, there are a number of things that could have contributed to the 0% conversion (e.g., landing pages, content, etc.), and there is also the element of time (will they convert later?), which is why we don't focus on conversion as much as the quality of leads.

Will a lead generation platform create leads? Yes, we believe it will. But if you're an executive, be extra careful when looking at the raw data. Take this scenario as an example: you can't look at your dashboard and assume you received 51 leads and that your sales team should convert 20% of them, when in reality only 11% of those leads are probably good and 20% of that would have been 1. With these stats, you'd need to be spending thousands a month to improve your odds; anything less and you're just burning cash.
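To make the gap between the dashboard view and the log view concrete, here is a small sketch of that projection. The 20% close rate is the hypothetical figure from the paragraph above, not a measured one, and the function name is ours.

```python
def expected_conversions(leads: int, close_rate: float) -> float:
    """Expected number of closed deals for a lead count and close rate."""
    return round(leads * close_rate, 1)


# What the dashboard's 51 users would suggest at a 20% close rate.
dashboard_view = expected_conversions(51, 0.20)
# What the 6 log-verified leads suggest at the same close rate.
log_view = expected_conversions(6, 0.20)
```

The same close rate produces roughly ten expected deals from the dashboard number and roughly one from the verified number, which is the difference between a channel that looks healthy and one that barely registers.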

Where this could work very well is in an instance where your product, or industry, is extremely expensive. In this instance, the CAC and success rate might work extremely well because you only really need one or two big clients to actually close. If nothing else, we hope this article helps shed light on how you can use your logs to help with marketing attribution.

Posted in   log-analysis     by Tony Perez (@perezbox)
