Log analysis is the science of making sense of your raw logs. It is the process of reviewing, interpreting, and understanding system-generated records called logs. Trunc works to build on this, contextualizing and correlating raw data to create insights for security administrators.
One of the biggest challenges we face in the modern DevSecOps world is the mountain of data we are responsible for managing and making sense of. Among the most important of that data are logs. Logs can be noisy, and yet we are tasked with reviewing them, deriving intelligence from them, and storing them in such a way that we can retrieve them when necessary. Making sense of the noise can be an excruciating task even for capable DevSecOps teams, and that is where log analysis comes into the picture.
Logs are files used to record events as they occur. Every modern system we use generates them.
Logs exist as a record that helps incident responders understand what happened, because in the world of security it is not a question of if you've been hacked, but when. With enough time and motivation, attackers can defeat most defensive measures. Maintaining the integrity of your logs is paramount to the success of any incident review.
Log analysis is the science of making sense of raw logs. It is the process of reviewing, interpreting, and understanding logs such that we can make good decisions for our respective organizations.
Log analysis helps us make sense of the noise. A good log analysis process will generate useful metrics that paint a clear picture of what has happened across your organization. This data can be used to improve or solve performance issues within an application or infrastructure. Looking at the bigger picture, companies analyze logs to proactively, and reactively, mitigate risks, comply with security policies, audits, and regulations, and understand online user behavior.
There are five key benefits to log analysis:

- **Compliance:** Many governmental and regulatory bodies require organizations to demonstrate their compliance with the myriad of regulations that impact nearly every entity. Log file analysis can demonstrate that HIPAA, PCI, GDPR, or other regulatory mandates are being met by the organization (e.g., a list of compliance programs by industry).
- **Security:** Log analysis provides powerful tools for taking proactive measures and enables forensic examinations after the fact if a breach or data loss does occur. Log analysis can utilize network monitoring data to uncover unauthorized access attempts and ensure security operations and firewalls are optimally configured.
- **System Availability:** Timely action based on information uncovered by log analysis can prevent an issue from causing downtime. This in turn helps ensure that the organization meets its business goals and that the IT organization meets its commitments to provide services with a given uptime guarantee.
- **Infrastructure Provisioning:** While organizations must plan to meet peak demands, log analysis can help project whether there is sufficient CPU, memory, disk, and network bandwidth to meet current demands and projected trends. Overprovisioning wastes precious IT dollars, and under-provisioning can lead to service outages as organizations scramble to either purchase additional resources or utilize cloud resources to meet flexes in demand.
- **Sales and Marketing Attribution:** By tracking metrics such as traffic volume and the pages that customers visit, log analysis can help sales and marketing professionals understand which programs are effective and which should be changed. Traffic patterns can also help with retooling an organization's website to make it easier for users to navigate to the most frequently accessed information (example: do lead generation platforms work?).
Logs provide visibility into the health and performance of an application and infrastructure stack, enabling developer teams and system administrators to easily diagnose and rectify issues. Here is a basic five-step process for managing logs with log analysis software:
- **Instrument and collect:** Install a collector to collect data from any part of your stack. Log files may be streamed to a log collector through an active network, or they may be stored in files for later review.
- **Centralize and index:** Integrate data from all log sources into a centralized platform to streamline the search and analysis process. Indexing makes logs searchable, so security and IT personnel can quickly find the information they need.
- **Search and analyze:** Analysis techniques such as pattern recognition, normalization, tagging, and correlation analysis can be implemented either manually or using native machine learning.
- **Active monitoring and alerting:** With machine learning and analytics, IT organizations can implement real-time, automated log monitoring that generates alerts when certain conditions are met. Automation can enable the continuous monitoring of large volumes of logs that cover a variety of systems and applications.
- **Reporting:** Streamlined reports and dashboarding are key features of log analysis software. Customized reusable dashboards can also be used to ensure that access to confidential security logs and metrics is provided to employees on a need-to-know basis.
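As a rough illustration, the collect, index, search, and alert steps above can be sketched in a few lines of Python. This is a toy in-memory index, not Trunc's implementation; all class and function names here are invented for the example.

```python
import re
from collections import defaultdict

class LogIndex:
    """Centralize and index raw log lines so they can be searched quickly."""

    def __init__(self):
        self.entries = []                    # step 2: centralized store
        self.token_index = defaultdict(set)  # inverted index: token -> entry ids

    def ingest(self, line):
        """Steps 1-2: collect a line and index its tokens."""
        entry_id = len(self.entries)
        self.entries.append(line)
        for token in re.findall(r"\w+", line.lower()):
            self.token_index[token].add(entry_id)
        return entry_id

    def search(self, *tokens):
        """Step 3: return lines containing every token (AND semantics)."""
        ids = set.intersection(*(self.token_index[t.lower()] for t in tokens))
        return [self.entries[i] for i in sorted(ids)]

def alert_on(index, *tokens):
    """Step 4: flag entries matching an alert condition."""
    hits = index.search(*tokens)
    if hits:
        print(f"ALERT: {len(hits)} matching entries")
    return hits

idx = LogIndex()
idx.ingest("Accepted password for john from 149.1.x.x port 23414")
idx.ingest("Failed password for root from 203.0.113.5 port 51122")
failed = alert_on(idx, "failed", "password")
```

A production system would persist the index, handle log rotation, and stream alerts to a dashboard (step 5), but the core idea is the same: tokenize on ingest so that search is a set intersection rather than a scan.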
The framework above, however, doesn't address one very important problem: data from multiple sources. While most systems and applications collect logs, they are not standardized in how those logs are recorded. This means that if you want to make sense of the data, you have to establish a common foundation to work from. Depending on what you read, these might be called functions or methods used when aggregating your logs. Here are four steps you should consider in addition to the framework above:
- **Normalization:** Normalization is a data management technique wherein parts of a message are converted to the same format. The process of centralizing and indexing log data should include a normalization step in which attributes from log entries across applications are standardized and expressed in the same format.
- **Pattern recognition:** Machine learning applications can now be implemented with log analysis software to compare incoming messages with a pattern book and distinguish between "interesting" and "uninteresting" log messages. Such a system might discard routine log entries, but send an alert when an abnormal entry is detected.
- **Classification and tagging:** Group together log entries that are of the same type. We may want to track all of the errors of a certain type across applications, or we may want to filter the data in different ways.
- **Correlation analysis:** When an event happens, it is likely to be reflected in logs from several different sources. Correlation analysis is the analytical process of gathering log information from a variety of systems and discovering the log entries from each system that connect to the known event.
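To make normalization and correlation concrete, here is a hedged sketch in Python: two invented log formats are normalized into one common schema, then correlated by source IP. The formats, field names, and function names are all illustrative assumptions, not any real product's schema.

```python
import re

def normalize_sshd(line):
    """Map an sshd-style auth line to the common schema."""
    m = re.match(r"(Accepted|Failed) password for (\S+) from (\S+) port \d+", line)
    if not m:
        return None
    return {"app": "sshd",
            "outcome": "success" if m.group(1) == "Accepted" else "failure",
            "user": m.group(2), "src_ip": m.group(3)}

def normalize_webapp(line):
    """Map an invented web-app format, e.g. 'login ok user=jane ip=1.2.3.4'."""
    m = re.match(r"login (ok|fail) user=(\S+) ip=(\S+)", line)
    if not m:
        return None
    return {"app": "webapp",
            "outcome": "success" if m.group(1) == "ok" else "failure",
            "user": m.group(2), "src_ip": m.group(3)}

def correlate(events):
    """Correlation analysis: group normalized events by source IP."""
    by_ip = {}
    for e in events:
        if e is not None:
            by_ip.setdefault(e["src_ip"], []).append(e)
    return by_ip

events = [
    normalize_sshd("Failed password for root from 203.0.113.5 port 51122"),
    normalize_webapp("login fail user=root ip=203.0.113.5"),
]
linked = correlate(events)  # both failures trace back to 203.0.113.5
```

Because both entries now share the `src_ip` attribute, a failed SSH login and a failed web login from the same address can be surfaced as one correlated incident instead of two unrelated lines.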
Not all logs are helpful. Although extremely important, logging is often an afterthought for most developers, which leaves us with very poor logs. The problem is made worse by the complexities introduced by SIEM and log management platforms. At Trunc we work to make sense of the noise out of the box, so that the system is working for you from the moment it receives its first log event.
We create parsers that help us contextualize the logs, making them helpful to an administrator while removing the logs that are not.
Here is a simple example using SSHD. If you were looking at an SSHD log, this is what you would see in your log file:
Accepted password for john from 149.1.x.x port 23414
SSHD is an example of an application that records clean logs. That being said, when you have a lot of users, this entry would get lost in the noise, and you also lack any other information on whether the event is good or bad (outside of the fact that John logged in with a password).
Log analysis is the process of extracting as much information as possible from this entry to help determine whether it is good or bad. As it stands, that is impossible, especially when this is one entry of thousands on a single server.
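To see what even simple extraction buys you, the entry can be parsed into structured fields. A minimal Python sketch follows; the field names are our own invention, and the redacted `149.1.x.x` from the example is kept as-is.

```python
import re

# The raw sshd entry from above ("x.x" is redacted in the original).
LINE = "Accepted password for john from 149.1.x.x port 23414"

# Pull out the fields an analyst would pivot on: auth method, user,
# source address, and source port.
m = re.match(r"Accepted (\w+) for (\S+) from (\S+) port (\d+)", LINE)
event = {
    "method": m.group(1),        # "password"
    "user": m.group(2),          # "john"
    "src_ip": m.group(3),        # "149.1.x.x"
    "src_port": int(m.group(4)), # 23414
}
```

Once the line is a structured record rather than free text, questions like "how many logins came from this address?" become simple filters instead of text searches.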
In a real-world environment you're talking about N servers, N log files, and N entries per file, where N is effectively unbounded. It's a near-impossible task for an individual, or even a team, in most instances.
Let's take a look at what it means to contextualize the log, analyzing it to derive more intelligence, in an effort to make a determination if this is good or bad.
This is what Trunc sees with the same log from SSHD:
Login success via SSHD.
IP: 149.1.x.x (Germany)
149.1.x.x: Tor exit node.
149.1.x.x: Flagged in multiple blacklists.
That's interesting. So John SSH'd into a server, using a Tor exit node, and that exit node has been blacklisted for malicious activity across other networks. That node also happened to be in Germany, but John lives in California.
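The enrichment behind that kind of output can be sketched as a set of lookups. In a real system these would be live GeoIP, Tor exit-node, and blacklist feeds; the tables below are hypothetical stand-ins for illustration only.

```python
# Stand-in lookup tables (real systems query live feeds instead).
GEOIP = {"149.1.x.x": "Germany"}
TOR_EXIT_NODES = {"149.1.x.x"}
BLACKLISTS = {"149.1.x.x": ["list-a", "list-b"]}

def enrich(event):
    """Attach context to a parsed log event: country, Tor status, blacklists."""
    ip = event["src_ip"]
    event["country"] = GEOIP.get(ip, "unknown")
    event["tor_exit_node"] = ip in TOR_EXIT_NODES
    event["blacklists"] = BLACKLISTS.get(ip, [])
    return event

evt = enrich({"user": "john", "src_ip": "149.1.x.x", "outcome": "success"})
```

With the enriched record in hand, a rule as simple as "successful login AND Tor exit node" is enough to raise the red flag that the raw line alone could never produce.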
In the world of security, this would, and should, be a big red flag. But in many instances it falls through the cracks, because most organizations lack a mechanism to effectively parse, or make sense of, their logs.
At Trunc we are on a mission to simplify this process. By default, all logs are categorized against our rules, and we have made it easy to quickly sift through your logs according to those categories. For example, an administrator can run a search like this right from their dashboard:
category:authentication_success AND category:tor_connection
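Conceptually, once every entry carries category tags, a query like this reduces to a subset check per entry. The small Python sketch below illustrates the idea only; it is not Trunc's actual query engine, and all names are invented.

```python
def matches(entry, *required):
    """AND semantics: every required category must be on the entry."""
    return set(required) <= set(entry["categories"])

entries = [
    {"msg": "Login success via SSHD (Tor exit node)",
     "categories": ["authentication_success", "tor_connection"]},
    {"msg": "Login success via SSHD",
     "categories": ["authentication_success"]},
]

hits = [e for e in entries
        if matches(e, "authentication_success", "tor_connection")]
```

Only the first entry satisfies both categories, which is exactly the needle-in-a-haystack filtering the dashboard query expresses.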
We turn logs into actionable intelligence. Everything can be searched through our Google-like dashboard, making it simple to investigate the incident and conform to multiple compliance requirements (PCI-DSS, GDPR, SOC-2). The best part is you can parse through all your logs, not just one log file on one server, but all of them from all locations at the same time.