Technologies such as Elastic Stack mark an inflection point in how quickly and cheaply we can extract meaningful information from application logs.
Is your application ready to go into production? Depending on your position and role in the company, this question might have several answers. They can range from “all stories are closed and the whole application was tested end to end” to “it satisfies our current needs, it is scalable and maintainable, and it has been proven to run 24/7 without crashing”.
Even if all of these statements are true, they are not enough. Without being pessimistic, we all know that eventually the application will crash, a server will go down, users will complain about a bug that appears “randomly”, or, even worse, the client will realize that some data is missing and there is no clue about when or why it was lost. This is real life, and we must be ready for these problems.
For this reason, a good logging and monitoring infrastructure becomes a key feature that leaves sysadmins, support teams, and even developers better prepared to face these problems. And because building it is not trivial, every Project Leader and Product Owner has to keep it in mind from the very beginning.
Product Owners commonly overlook logging because they focus mainly on delivering features to end users and improving ROI. It is therefore key for the Project Leader to help Product Owners understand the importance of logs, because they must agree on it and be willing to assign enough resources to it throughout the development lifecycle. In other words, logs must be designed, implemented, and tested.
Architects must define what, how, and when to log, and even how to extract meaningful data from the logs. Moreover, they must keep a sharp eye out during reviews to ensure that the code follows those definitions. Developers and everyone else involved in the project must emit log messages in at least every layer of the architecture, and testers, for their part, must verify that what is logged is meaningful and provides enough contextual information.
What to log
It is not enough to log errors in order to use them for troubleshooting purposes. It is also useful to log successful requests so we can have a clear idea of how users work with the application.
But it goes beyond that. Logs are also useful for detecting common mistakes users make, as well as for security purposes. Good logs of user activity can alert us to malicious behavior.
It is important that logs provide accurate context about what the user was doing when a specific error happened. For that reason, having at least one log entry per request/result in every layer of the application is mandatory. This is not trivial, and we have to take performance into account.
To address performance concerns, logging frameworks let us assign a level to each message, so the application can easily be configured to write only messages at the configured level or more severe. The main drawback is that when a problem comes up, the logs are likely to lack the contextual data needed to analyze it.
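As a minimal sketch of this trade-off, the following uses Python's standard `logging` module. The logger names, messages, and the `build_logger` and `emit_sample` helpers are hypothetical; they only illustrate how the configured level decides what ends up in the log.

```python
import io
import logging


def build_logger(level: int, stream: io.StringIO) -> logging.Logger:
    """Return a logger that writes records at `level` or more severe to `stream`."""
    logger = logging.getLogger(f"orders.{logging.getLevelName(level)}")
    logger.setLevel(level)
    logger.propagate = False  # keep the example isolated from the root logger
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def emit_sample(logger: logging.Logger) -> None:
    """Emit the same three (hypothetical) messages at different levels."""
    logger.debug("cart contents loaded for user=42")
    logger.info("order 1001 created")
    logger.error("payment gateway timed out for order 1001")


# Production-like setting: only WARNING and above are written.
production = io.StringIO()
emit_sample(build_logger(logging.WARNING, production))

# Troubleshooting setting: everything down to DEBUG is written.
troubleshooting = io.StringIO()
emit_sample(build_logger(logging.DEBUG, troubleshooting))

print(production.getvalue(), end="")
print(troubleshooting.getvalue(), end="")
```

With the WARNING configuration only the error line is written, so the two entries that explain what happened before the failure are gone. That is exactly the missing context described above.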
Last but not least, we need to take into account the logs produced by third-party applications such as web servers and databases. How can we bring all these logs together in a fast, easy, and meaningful way?
How to log
It is key to have a clear definition of the structure of a log entry. How are we going to log successful requests? How are we going to log errors? Are we going to include the complete stack trace?
All these questions need a well-thought-out answer, since they directly affect how we are going to parse the logs to find meaningful information. In particular, we believe that stack traces should be logged only for unhandled exceptions. Otherwise, we might end up logging more information than needed.
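One possible answer to these questions, sketched with Python's standard `logging` module: every entry is a single JSON object with a fixed set of fields, and a stack trace is attached only when an unexpected exception reaches the catch-all boundary. The field names (`layer`, `request_id`, `stacktrace`), the `OutOfStock` error, and the `place_order` function are assumptions made for illustration, not a prescribed schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (hypothetical schema)."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "layer": record.name,
            "message": record.getMessage(),
            "request_id": getattr(record, "request_id", None),
        }
        if record.exc_info:  # only set when the caller asked for a traceback
            entry["stacktrace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("service.orders")
log.setLevel(logging.INFO)
log.propagate = False
log.handlers.clear()
log.addHandler(handler)


class OutOfStock(Exception):
    """An expected business error that the application knows how to handle."""


def place_order(request_id: str, quantity: int) -> None:
    ctx = {"request_id": request_id}
    try:
        if quantity > 5:
            raise OutOfStock("only 5 items left")
        if quantity <= 0:
            raise ValueError("quantity must be positive")  # stands in for a bug
        log.info("order accepted", extra=ctx)
    except OutOfStock as exc:
        # Handled error: the message is enough, no stack trace.
        log.warning("order rejected: %s", exc, extra=ctx)
    except Exception:
        # Unexpected error: log.exception attaches the current traceback.
        log.exception("unexpected failure", extra=ctx)


place_order("req-1", 2)
place_order("req-2", 9)
place_order("req-3", 0)
entries = [json.loads(line) for line in stream.getvalue().splitlines()]
```

Because every entry has the same shape, later parsing reduces to reading one JSON object per line, and only the third entry carries a `stacktrace` field.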
How to extract meaningful data from logs
OK, now we have 100 GB of logging data. Without mining tools, this amount of data can be meaningless. The issue gets bigger when we want to consider not only the logs from our app but also those from third-party applications and servers. Each of them formats and stores logs in its own way, which makes it really difficult to find useful data.
As you can see, all these tasks are difficult to perform unless they are carefully thought through and planned. At Hexacta, we are starting to use a really promising stack of applications that is the latest trend in log management solutions: Elastic Stack.
This set of applications has the potential to solve all the problems mentioned above by gathering log information spread across completely different data sources, transforming and merging everything into a single data store, and applying search analytics on top of it.
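To make the formatting problem concrete, here is a small sketch that maps two different line formats onto one common structure. Both formats (a web server access log line and an application log line), the regular expressions, and the output field names are assumptions chosen for illustration; real sources will differ.

```python
import re
from datetime import datetime, timezone

# Hypothetical formats: a web server access log and an application log.
ACCESS = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
APP = re.compile(r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<message>.*)")


def parse_access(line: str):
    m = ACCESS.match(line)
    if not m:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    status = int(m["status"])
    return {
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "source": "webserver",
        "level": "ERROR" if status >= 500 else "INFO",
        "message": f'{m["method"]} {m["path"]} -> {status}',
    }


def parse_app(line: str):
    m = APP.match(line)
    if not m:
        return None
    # Assumption: the application writes its timestamps in UTC.
    ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "source": "app",
        "level": m["level"],
        "message": m["message"],
    }


def normalize(lines):
    """Parse every recognizable line and return the events ordered by time."""
    events = []
    for line in lines:
        event = parse_access(line) or parse_app(line)
        if event is not None:  # unrecognized lines are skipped in this sketch
            events.append(event)
    return sorted(events, key=lambda e: e["timestamp"])


RAW = [
    "2023-10-10 13:55:37 ERROR payment gateway timed out",
    '203.0.113.7 - - [10/Oct/2023:15:55:36 +0200] "GET /orders/17 HTTP/1.1" 500 1043',
    "not a log line",
]
events = normalize(RAW)
```

Once both sources share the same fields and a UTC timestamp, the failed request and the application error that followed it one second later line up on a single timeline. Maintaining such parsers by hand for every source is the part that quickly becomes impractical.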
Elastic Stack, the new kid on the block
In a nutshell, Elastic Stack is a set of applications capable of gathering log messages from several data sources, transforming them into a meaningful format so that they can be indexed and searched, and finally displaying interactive reports to visually analyze data, summarize information, and so on.
It is composed of three main open source applications:
Logstash
- It ingests data from a multitude of sources: files, syslog, SQL queries, HTTP requests, and more. It supports a wide variety of inputs that pull in events from common sources, all at the same time.
- It filters, parses, and transforms data in order to derive structure from unstructured data, regardless of format or complexity.
- It outputs the data to Elasticsearch, alerting and monitoring systems, and many other targets.
Elasticsearch
- It is a distributed, RESTful search and analytics engine that centrally stores the data.
- It can scale transparently from a single node up to hundreds of nodes.
- It also provides client libraries for several languages, such as .NET, Java, and Python.
Kibana
- It lets you visualize the Elasticsearch data.
- It is also the extensible user interface for configuring and managing all aspects of the Elastic Stack.
- It provides interactive visualizations that let you shape your data in several ways: histograms, line graphs, pie charts, and sunbursts, among many others.
- It lets you perform time series analyses and graph explorations.
- It lets you share the visualizations with everyone you consider appropriate.
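To give a rough idea of what flows through this pipeline, the sketch below builds, without contacting any server, a request body in the newline-delimited JSON format accepted by the Elasticsearch bulk API, plus a Query DSL search for recent errors. The index name `app-logs`, the event fields, and both helper functions are hypothetical, and the query assumes that `level` and `source` are mapped as keyword fields.

```python
import json


def bulk_body(index: str, events) -> str:
    """Serialize events as NDJSON: one action line plus one document line each."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def errors_last_hour() -> dict:
    """Query DSL: count ERROR events from the last hour, grouped by source."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    # Assumes `level` is mapped as a keyword field.
                    {"term": {"level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": "now-1h"}}},
                ]
            }
        },
        # Assumes `source` is mapped as a keyword field.
        "aggs": {"by_source": {"terms": {"field": "source"}}},
    }


SAMPLE = [
    {"@timestamp": "2023-10-10T13:55:36Z", "source": "webserver",
     "level": "ERROR", "message": "GET /orders/17 -> 500"},
    {"@timestamp": "2023-10-10T13:55:37Z", "source": "app",
     "level": "ERROR", "message": "payment gateway timed out"},
]

body = bulk_body("app-logs", SAMPLE)
query = errors_last_hour()
print(body, end="")
print(json.dumps(query, indent=2))
```

In a real deployment the shipping and transformation steps would normally be handled by the stack itself rather than by hand-written code; the point here is only that, once events are structured, indexing and querying them is a matter of plain JSON.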
To Sum Up
A good logging infrastructure is vital for letting sysadmins know the state of every component of the system. Logs not only provide valuable information when problems arise; they can also be used to anticipate potential problems or threats.
The effort required to build a meaningful set of logs, and to search it fast enough to find valuable data about errors, threats, or other issues, is so overwhelming that many organizations do not give logs the importance they deserve.
With the arrival of Elastic Stack, this scenario seems to have changed; obtaining, transforming, and analyzing logs has become easier than ever.