Rethinking the way I test antivirus applications
The details of how I currently test antivirus (AV) applications can be found on this page, and while I think it’s a good methodology, I’ll be making some changes. I’m still working out the details, but I’ll post my thoughts here in case you, the reader, have ideas or feedback.
System performance
The test I want to change is the performance impact, or resource usage, test. Currently a full system scan is started on a test system and several benchmark tests are performed. Each test runs several times to ensure consistent results. The results are compared to the benchmark score of a clean system and the percentage of performance loss is calculated. This is done for each product on a few different machines and takes up an enormous amount of time. While I believe this test is very accurate, I also think it’s too much work for a rare scenario. Full system scans are typically done: 1) when the AV is installed and updated for the first time, 2) on a schedule set by the user, or 3) on demand, initiated by the user. The test shows the impact an AV application has on the system in those three scenarios, which is nice, but again, these scans are pretty rare and/or happen at a time when the system is not in use.
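For reference, the performance-loss figure in the current test boils down to a simple comparison against the clean baseline. A minimal sketch of that calculation (the benchmark scores below are made-up illustrations, not actual results, and a “higher is better” benchmark score is assumed):

```python
# Minimal sketch of the performance-loss calculation used in the current test.
# The scores below are made-up illustrations, not actual benchmark results.

def performance_loss(baseline_score: float, scan_score: float) -> float:
    """Percentage of benchmark performance lost while a full system scan runs.

    Assumes a 'higher is better' benchmark score.
    """
    return (baseline_score - scan_score) / baseline_score * 100

# Each test runs several times; average the runs before comparing.
baseline_runs = [412.0, 409.5, 411.2]   # clean system
scan_runs = [355.1, 349.8, 352.4]       # during a full system scan

baseline = sum(baseline_runs) / len(baseline_runs)
during_scan = sum(scan_runs) / len(scan_runs)

print(f"Performance loss: {performance_loss(baseline, during_scan):.1f}%")
```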
Anyone familiar with AV knows that it can, and often will, impact system performance, so those running AV will save full system scans for a convenient time like a lunch break, while sleeping, or over the weekend. In the one case where a new AV product is installed and updated for the first time, you are probably already experiencing issues (which is why the product was installed in the first place), so a full system scan at that time is expected to take a while and slow down the system. The current test is no longer relevant: a performance impact is known and expected during a scan, so no one needs to know the exact percentages. The test in its current form will therefore stop. Full system scans will only be done on one or two machines to get an idea of the scan speed, one on an SSD and one on an HDD system, for example.
Instead I want to focus on other areas that may impact system performance: checking for and installing updates, on-access/real-time scanning, and network and email monitoring. This should not require multiple machines; an iMac should be enough of a benchmark so that those with slower or faster systems can guesstimate what the impact on their system will be like. Even though I ran all tests on multiple machines to be as real-world as possible, if your system has, say, an HP scanner driver running in the background hogging half of your CPU, my test results lose a lot of their value in your case. There are too many different configurations (hardware + OS + software) out there to know exactly how an AV will impact your particular system. Testing the individual, most often used features and simply noting resource usage at that time should give you an equally good idea.
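To give an idea of what “noting resource usage” could look like in practice, here is a rough sketch that samples the CPU and memory of a running process while a single feature (say, a definitions update) is exercised. It relies on the third-party psutil package, and the process name is a placeholder, not any real AV product’s process:

```python
# Rough sketch: sample CPU and memory of one process while a feature runs.
# Requires the third-party psutil package ("pip install psutil").
# The process name below is a placeholder, not a real AV product's process.
import psutil

TARGET = "ExampleAVDaemon"   # placeholder process name
SAMPLES = 30                 # sample once per second for 30 seconds

def find_process(name):
    for proc in psutil.process_iter(["name"]):
        if proc.info["name"] == name:
            return proc
    return None

proc = find_process(TARGET)
if proc is None:
    raise SystemExit(f"{TARGET} is not running")

cpu, rss = [], []
for _ in range(SAMPLES):
    cpu.append(proc.cpu_percent(interval=1.0))      # % of one core over 1 s
    rss.append(proc.memory_info().rss / 1024**2)    # resident memory in MB

print(f"avg CPU {sum(cpu)/len(cpu):.1f}%, peak RSS {max(rss):.0f} MB")
```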
Once a good routine is found, it should be a lot easier and less time-consuming to perform these tests for all products on the list, making more frequent updates a possibility too.
Detection
Having an AV product scan a folder full of malware shows how up to date its malware definitions database is and how it deals with different file types, but it does not show how effective it is when searching your entire system for malicious files. This is why I include the trace detection results in the PDF. I want to focus more on these trace results, as I feel they are often undervalued. An example I’ve used before: a malicious application is found, but the script hiding in the LaunchAgents folder is not. That script can re-download the malicious application the next time your Mac restarts and/or receive different instructions from a command & control server. It’s a very specific example, but it shows how important trace detection is. I think the important stuff is already listed, but there are more traces that need to be added.
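To illustrate the kind of trace involved, here is a rough sketch that lists LaunchAgents and flags plists that run something suspicious-looking at login. The keyword list is purely illustrative and not a real detection rule; plenty of legitimate LaunchAgents could look similar:

```python
# Rough sketch of the kind of trace a scanner should catch: a LaunchAgent
# that re-runs a downloader at login. The keywords below are illustrative,
# not a real detection rule.
import plistlib
from pathlib import Path

SUSPICIOUS = ("curl", "/tmp/", "python -c", "osascript -e")

agents_dir = Path.home() / "Library" / "LaunchAgents"
for plist_path in agents_dir.glob("*.plist"):
    try:
        with plist_path.open("rb") as fh:
            agent = plistlib.load(fh)          # handles XML and binary plists
    except Exception:
        continue                               # unreadable plist, skip it
    args = " ".join(agent.get("ProgramArguments", []))
    if agent.get("RunAtLoad") and any(s in args for s in SUSPICIOUS):
        print(f"{plist_path.name}: runs at login -> {args}")
```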
False sense of security
This is a tricky one. I believe some of the tested AV vendors have looked at my test results (and others out there) and made sure all of the samples tested are included in their definitions database. Whether this was done to look good in the test and boast about it, or as an “oh crap, we should step it up” moment, only time will tell. When new samples are added, it becomes clear right away which products are really on top of things and which are just chasing the tests. This creates a dilemma.
Do I remove the sample hashes from the PDF to see if AV vendors are really doing their job, finding the samples on their own and adding them to their database? This would show how good their database really is in future tests, but it also takes away the possibility for others to replicate my test. Being able to replicate a test like this is important to prove the test and samples are legit. Also, if companies are adding just the necessary signatures to look good in the test, is it not a win/win for both the vendor and the end user? They look good in the test ánd the end user is protected, at least from all the malware listed in the test. I don’t know… If only the samples in the test are added to look good, then the end user is not as protected as they could and should be. After all, there are many, many more samples out there, not just the ones I list. This would mean that even though the AV looks great in the test, it gives its users a false sense of security.
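As an aside, replication with published hashes is straightforward: anyone with the same sample set can hash their files and compare against the list in the PDF. A minimal sketch, assuming SHA-256 digests and a local folder of samples (both are assumptions for illustration; the actual PDF may use a different format):

```python
# Minimal sketch of how published sample hashes let anyone verify they have
# the same files. The folder name and hash type are assumptions for
# illustration; the actual PDF may list a different digest.
import hashlib
from pathlib import Path

samples_dir = Path("malware_samples")   # placeholder folder of test samples

for sample in sorted(samples_dir.iterdir()):
    if sample.is_file():
        digest = hashlib.sha256(sample.read_bytes()).hexdigest()
        print(f"{digest}  {sample.name}")
# Compare each line against the hash list in the PDF to confirm the sample
# set matches what was tested.
```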
For example: the DNS Changer malware (2007) has over 200 known samples; my test only lists 31. The idea is that if those 31 are detected by an AV, all the others will most likely be detected as well. (Flashback (2011) has over 60, I list 20, etc.) I used to think 31 detected samples were a good indication of the quality of a definitions database, but I’m not so sure anymore. So do I add hundreds of samples to the list just to make sure the AV covers them all? Do I leave it as it is? Would I do the end user a favor if I listed them all (assuming AV vendors will jump on it and update their databases)? Things I have not figured out yet.
Most of these companies are genuinely in this business to protect users. I personally believe vendors like Symantec, Intego, Avast, Sophos, F-Secure and ClamXav are among them. They have all been around for a while, and regardless of their test results (and me questioning them sometimes), I know they are in it for the right reasons and not to make a quick buck. I have my doubts about some other companies on the list and would not like to see them take first place in a test like this using the tactics I mentioned above; the end user would be the victim in the end.
Feedback
Since I started this blog I have come in contact with a lot of amazing and smart people who have given me feedback, suggestions, support and ideas. I’m looking to put all those big brains together and gather feedback on this article to ensure the next version of testing will be much better, keeping in mind that I am not a group of people or an organization, so time and resources are limited. Users and AV vendors alike: if you read this and have something to add, just let me know 🙂