Data Driven Security - all about the analytics

I've been remiss in my blogging duties. I've had some changes in my life recently, but I'd like to get back to posting on a regular basis, and there's no real good reason why I shouldn't be able to do that. Allow me to rectify my absentmindedness by talking about the book Data-Driven Security by Jay Jacobs and Bob Rudis.

This was a wonderful book to read as an information security professional. As information security (and the world in general) matures, metrics and analytics are going to become a bigger part of the field. We see sabermetrics taking over baseball and other sports for the simple reason that it helps organizations gain a deeper understanding of what they have, which leads to better decisions. Those same strategies can help many professional fields, including information security.

Each chapter of the book covers a different scenario in which data is analyzed to answer an infosec-related question. It also discusses the art of visualization and how to make communicating numbers more useful to people (*cough*executives*cough*). The book exposes the reader to the wonderful world of Python and RStudio, both of which are used to analyze and make sense of the data, without requiring too much previous knowledge. Each chapter walks the reader through exercises utilizing pre-built Python scripts in RStudio, just enough to whet the appetite.

What I really enjoyed about the book was that it was easy to read. It wasn't bogged down with numbers or big words. Of course, I'm not exactly a newb to reading about statistical analysis. Still, I think people with some interest in data-driven security will find the book a fairly easy read. It's a great starting point for those wanting to explore a discipline in security that is likely to become more and more relevant as security and data mature.

Verizon Data Breach Investigation Report impressions

This is the first year I've read the full Verizon Data Breach Investigation Report. It was quite entertaining, but then again I'm into baseball and within baseball I'm into statistics. The report was easy to read, interesting, and informative. Here are my impressions of the 70-ish-page report:

Threat Intelligence

Sharing threat intelligence is useful, but the strategy needs to be more "going to the well" than "drinking from the hose." Think of the NSA's collection of information, which has been found to be largely ineffective at discovering attacks.


Communications, legal, and customer service departments were all more likely to open a phishing email. There is no easy solution or magic wand that can make phishing go away. We need to focus on better filtering, developing and executing an ENGAGING and THOROUGH security awareness program, and improving detection and response capabilities.


It's more effective to focus on getting a patch deployment strategy in place than to try patching systems as soon as a new patch is released. Ten CVEs account for almost 97% of exploits observed in 2014. The ten:

  1. CVE-2002-0012 - SNMP
  2. CVE-2002-0013 - SNMP
  3. CVE-1999-0517 - SNMP
  4. CVE-2001-0540 - Memory leak
  5. CVE-2014-3566 - POODLE
  6. CVE-2012-0152 - RDP
  7. CVE-2001-0680 - Directory traversal
  8. CVE-2002-1054 - Directory traversal
  9. CVE-2002-1931 - XSS
  10. CVE-2002-1932 - Log deletion

According to this list, there are still a lot of vulnerabilities from the past that need to be patched. Getting a patching process in place is great for all the new stuff, but don't forget about all the old stuff that came out before the security team was in place.
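To put a number on how old "the old stuff" is, here is a minimal Python sketch. It assumes the year embedded in each CVE ID is a fair stand-in for the age of the vulnerability (strictly, it is the year the ID was assigned), and it measures against 2014, the year the exploits were observed. The names TOP_CVES, REPORT_YEAR, and cve_year are mine, for illustration only.

```python
from statistics import median

# CVE IDs copied from the list above.
TOP_CVES = [
    "CVE-2002-0012",
    "CVE-2002-0013",
    "CVE-1999-0517",
    "CVE-2001-0540",
    "CVE-2014-3566",
    "CVE-2012-0152",
    "CVE-2001-0680",
    "CVE-2002-1054",
    "CVE-2002-1931",
    "CVE-2002-1932",
]

# Year the exploits were observed, per the report.
REPORT_YEAR = 2014


def cve_year(cve_id: str) -> int:
    """Return the year embedded in a CVE ID such as 'CVE-2002-0012'."""
    return int(cve_id.split("-")[1])


# Assumption: ID year approximates when the vulnerability became known.
ages = [REPORT_YEAR - cve_year(cve) for cve in TOP_CVES]
older_than_decade = sum(1 for age in ages if age >= 10)

print(f"Median age in {REPORT_YEAR}: {median(ages)} years")
print(f"{older_than_decade} of {len(ages)} CVEs were at least ten years old")
```

Eight of the ten IDs date from 2002 or earlier, which is the point: a patching process that only looks forward will miss most of this list.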


".03% of smartphones per week were getting owned by "high-grade" malicious code."

Android is the worst operating system (everyone saw that one coming) and "most of the malware is adnoyance-ware and similar resource-wasting infections." This might change in the future, but for now it's not a huge area of concern.


My favorite line came from this section: "Special snowflakes fall on every backyard." It refers to "new" malware that gets around anti-virus being described as "advanced" or "targeted." That's not the case according to the report. Malware is simply being given unique hashes to avoid detection by anti-virus.

Industry profiles

Each organization is unique, which is not earth shattering, but good to understand when looking at internal and external entities.


There is some supply and demand with data breaches: the higher the number of records lost, the lower the cost of each record. Keep in mind that records only tell half the story when it comes to the impact of a breach. There is fallout, not only within the company but outside it.

Incident classification patterns

96% of data breaches fall into nine basic patterns:

  1. POS Intrusions - 28.5%
  2. Crimeware - 18.8%
  3. Cyber-Espionage - 18%
  4. Insider Misuse - 10.6%
  5. Web App Attacks - 9.4%
  6. Miscellaneous Errors - 8.1%
  7. Physical Theft/Loss - 3.3%
  8. Payment Card Skimmers - 3.1%
  9. Denial of Service - .1%
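A quick way to see how top-heavy that list is: a small Python sketch that adds up the shares from largest to smallest. The percentages are copied from the list above; PATTERNS and cumulative_shares are just names I made up for the example.

```python
# Shares copied from the list above, in percent.
PATTERNS = {
    "POS Intrusions": 28.5,
    "Crimeware": 18.8,
    "Cyber-Espionage": 18.0,
    "Insider Misuse": 10.6,
    "Web App Attacks": 9.4,
    "Miscellaneous Errors": 8.1,
    "Physical Theft/Loss": 3.3,
    "Payment Card Skimmers": 3.1,
    "Denial of Service": 0.1,
}


def cumulative_shares(shares):
    """Return (pattern, running total) pairs, largest share first."""
    running = 0.0
    result = []
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    for name, pct in ordered:
        # Round at each step so float noise does not creep into the totals.
        running = round(running + pct, 1)
        result.append((name, running))
    return result


for name, total in cumulative_shares(PATTERNS):
    print(f"{name:<24}{total:>6.1f}%")
```

The first three patterns alone (POS intrusions, crimeware, and cyber-espionage) add up to 65.3 percentage points of the listed shares.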

These are all from the first half of the report. The other half of the report went into discussing each type of data breach and what we can learn. I highly recommend reading the whole report. Not only is it an easy read, but it gives great insight into the current landscape of breaches.