Whatever tool you use, good log messages can be invaluable during development and save developers time. I am guessing fewer of you think about logs once your app is in production. Errors occur, users create tickets, and bugs get fixed. I believe that we can do better. Everyone wins if we action warning and error messages and fix issues before users find them. Users do not have to deal with bugs, and developers don't have to deal with production support issues at inopportune times.
In this post, I will discuss three topics related to making the information in your logs actionable 🚀
- Good Log Message Principles
- Timely Notification of Errors
- Make Decisions with your Logs
Good Log Message Principles
This list represents a set of best practices I use when it comes to logging:
- The log level Error should mean something. Understand when and where to use each type of log message
  - Information level messages to inform you what features are being used
  - Warning level messages for exceptions that are handled by your code but could cause issues if they are allowed to persist. The user may not be aware that the issue occurred, but it may still indicate a problem. Warning messages should be actioned but may not need to be addressed immediately. Examples include missing configurations, repeated attempts to use an expired token, failed login attempts, and validation failures that you would not expect to happen
  - Error level messages for errors that require human intervention to resolve. This could be PL/SQL errors, an unhandled exception, an error calling a web service, a critical configuration missing, a scheduled job failing, etc.
  - Error level logs should be reserved for issues that need to be addressed urgently
- Don't spam your logs with Warning level messages that are not actionable. If you see these kinds of logs, either fix them or change them to a different level
- Keep Debug level logging off in production but make sure you can turn it on when necessary. Even better, have the ability to switch debug level logging on for a specific user or session
- Write meaningful log messages that describe in English what is going on at that point in your code. You can even use well written debug level messages instead of comments
- Make sure you capture as much detail as possible about the session state, who the user was, and what they were doing during a Warning or Error event. You may be looking at an error twenty-four hours after the fact
- Avoid adding log messages in loops that have thousands of iterations. Too much logging can impact performance adversely
- Make sure you log context so you can quickly identify the APEX page, PL/SQL procedure, REST Service, etc. that generated the log message
- Make sure you have a routine to purge logs after some time. Automated log purging is taken care of for you with APEX and Logger logs
- Don't log sensitive information like passwords, API keys, etc.
- Pick a logging tool and stick to it
- Educate your team on the tool you do use and provide best practices and examples
- Don't forget to add log messages to any PL/SQL-based ORDS services
- If you are using APEX, don't forget to create an Error Handling Function so that you can trap APEX errors and create log messages using the logging tool of your choice
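Several of the principles above (log the context, capture who was doing what, and keep secrets out of the logs) fit in a few lines. What follows is a language-neutral sketch in Python, not an APEX or Logger API. `build_log_record`, `redact`, `SENSITIVE_KEYS`, and the field names are all hypothetical. In an APEX app the same idea would live in a PL/SQL wrapper around the logging tool you picked.

```python
from datetime import datetime, timezone

# Hypothetical list of keys whose values must never reach the logs.
SENSITIVE_KEYS = {"password", "api_key", "token"}

LEVELS = ("DEBUG", "INFORMATION", "WARNING", "ERROR")


def redact(details):
    """Return a copy of details with sensitive values masked."""
    return {
        key: "***" if key.lower() in SENSITIVE_KEYS else value
        for key, value in details.items()
    }


def build_log_record(level, message, scope, user, details=None):
    """Build one log record that carries its own context.

    scope identifies the page, procedure, or REST service that logged it.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown log level: {level}")
    return {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "scope": scope,
        "user": user,
        "message": message,
        "details": redact(details or {}),
    }


# Sample call with made-up values.
record = build_log_record(
    level="WARNING",
    message="Login rejected: token has expired",
    scope="app 100, page 9999",
    user="JSMITH",
    details={"token": "abc123", "attempt": 3},
)
print(record["details"])
```

Routing every message through one helper like this also makes it easier to stick to a single tool and to keep log messages consistent across a team.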
Timely Notification of Errors
Logs are most valuable when Warnings and Errors can be communicated and actioned promptly. This allows developers to be proactive in resolving issues. You may even be able to fix bugs before users notice there is an issue.
As an IT professional, you look much better if you can respond to a bug report with "yes, we noticed that issue, and we are working on a fix now", rather than "oh crap, I better see if this bar has wi-fi 🍺".
To achieve this, you must notify support teams as soon as possible after errors occur. With the introduction of APEX Automations, we are better placed than ever to build scheduled database jobs to do this.
An automated approach to error log notification could look something like this:
- An APEX Automation runs on a schedule
- The APEX Automation calls a PL/SQL procedure which checks for new warning and error log messages since the last time it ran
- The PL/SQL procedure generates an HTML document listing the warning and error messages and emails the support team using APEX_MAIL
- For bonus points, you could even send Error log messages to a Teams Channel or via SMS
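The steps above can be sketched as follows. This is a Python illustration of the logic only. In practice it would be a PL/SQL procedure called by the Automation, with APEX_MAIL doing the sending. The record shape, the sample rows, and the names `collect_new_alerts` and `render_alert_html` are assumptions made for the example.

```python
from datetime import datetime
from html import escape

ALERT_LEVELS = {"WARNING", "ERROR"}


def collect_new_alerts(log_rows, last_run):
    """Return warning and error rows logged after the previous run."""
    return [
        row for row in log_rows
        if row["level"] in ALERT_LEVELS and row["logged_at"] > last_run
    ]


def render_alert_html(alerts):
    """Render the alerts as a small HTML table for the email body."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            row["logged_at"].isoformat(sep=" "),
            escape(row["level"]),
            escape(row["scope"]),
            escape(row["message"]),  # escape so log text cannot break the HTML
        )
        for row in alerts
    )
    header = "<tr><th>Time</th><th>Level</th><th>Scope</th><th>Message</th></tr>"
    return f"<table>{header}{rows}</table>"


# Made-up sample rows standing in for the log table.
log_rows = [
    {"logged_at": datetime(2024, 5, 1, 8, 55), "level": "ERROR",
     "scope": "nightly_sync", "message": "Web service returned 500"},
    {"logged_at": datetime(2024, 5, 1, 9, 5), "level": "INFORMATION",
     "scope": "page 10", "message": "Report opened"},
    {"logged_at": datetime(2024, 5, 1, 9, 10), "level": "WARNING",
     "scope": "page 20", "message": "Missing configuration <TAX_RATE>"},
]

last_run = datetime(2024, 5, 1, 9, 0)
alerts = collect_new_alerts(log_rows, last_run)
html_body = render_alert_html(alerts)
if alerts:  # only send an email when there is something to report
    print(html_body)
```

Storing the timestamp of the last successful run is what keeps each email limited to new messages, so the same error is not reported on every schedule tick.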
❗There is not much point in sending emails in the middle of the night. Whatever notification method you use should be appropriate for when your support team is available and how they receive urgent messages. If you are a 24/7 shop, you will likely be integrating with something like PagerDuty.
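One way to express that rule is a small routing function that picks a channel from the log level and the time of day. This is a sketch in Python. The support window, the channel names, and `choose_channel` are made up for illustration, and a real 24/7 setup would hand off to a paging service instead.

```python
from datetime import datetime, time

# Assumed support window: weekdays, 08:00 to 18:00 local time.
SUPPORT_START = time(8, 0)
SUPPORT_END = time(18, 0)


def in_support_hours(moment):
    """True when moment falls inside the assumed support window."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and SUPPORT_START <= moment.time() < SUPPORT_END


def choose_channel(level, moment):
    """Pick a notification channel for one log message."""
    if level == "ERROR" and in_support_hours(moment):
        return "chat"          # urgent, and someone is there to act on it
    if level in ("ERROR", "WARNING"):
        return "daily_digest"  # hold until the team is back
    return "none"              # lower levels never notify anyone


print(choose_channel("ERROR", datetime(2024, 5, 1, 10, 30)))
print(choose_channel("ERROR", datetime(2024, 5, 1, 2, 15)))
```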
📖 You should also build a daily digest email that summarizes all warnings and errors from the previous day. A digest email can help ensure errors do not fall through the cracks and help identify trends. A digest email may even be all you need for smaller non-24/7 environments.
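A digest is mostly a grouping exercise: count how often each distinct message occurred on a given day and list the most frequent first. Here is a minimal Python sketch of that grouping. The record shape, the sample rows, and `build_digest` are assumptions for the example, and in a database the same result would come from a `GROUP BY` query.

```python
from collections import Counter
from datetime import date, datetime


def build_digest(log_rows, day):
    """Count warning and error messages per (level, scope, message) for one day."""
    counts = Counter(
        (row["level"], row["scope"], row["message"])
        for row in log_rows
        if row["level"] in ("WARNING", "ERROR") and row["logged_at"].date() == day
    )
    # Most frequent first, so recurring problems rise to the top.
    return counts.most_common()


# Made-up sample rows standing in for the log table.
log_rows = [
    {"logged_at": datetime(2024, 5, 1, 9, 10), "level": "WARNING",
     "scope": "page 20", "message": "Missing configuration TAX_RATE"},
    {"logged_at": datetime(2024, 5, 1, 14, 2), "level": "WARNING",
     "scope": "page 20", "message": "Missing configuration TAX_RATE"},
    {"logged_at": datetime(2024, 5, 1, 23, 40), "level": "ERROR",
     "scope": "nightly_sync", "message": "Web service returned 500"},
    {"logged_at": datetime(2024, 5, 2, 0, 5), "level": "ERROR",
     "scope": "nightly_sync", "message": "Web service returned 500"},
]

digest = build_digest(log_rows, date(2024, 5, 1))
for (level, scope, message), count in digest:
    print(f"{count} x {level} [{scope}] {message}")
```

Comparing the counts from one day to the next is what surfaces trends, such as a warning that quietly doubles every week.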
Action the Logs
I can't stress enough the importance of acting on warning and error messages.
If the same message keeps coming up on the alert email, then either fix it or change the level of the message to something other than Warning. If it stays on the alert email daily, people will lose faith in the urgency of the alert.
Capture All the Logs
Even if you are using something like Logger, you should check other tables where APEX could be logging warnings and errors. These include:
- APEX Debug Logs
- APEX Failed Login Attempts
- APEX Automation Logs
- REST Data Synchronization Logs
- REST Web Service Activity Logs
- Logger (if you are using it)
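Because each of these sources has its own shape, it helps to map them onto one common record before alerting on them. The Python sketch below shows the idea for two sources. The field names (`attempted_at`, `started_at`, `status`, and so on) and the mapping functions are hypothetical and do not reflect the actual column names of the APEX views, so check those before building the real queries.

```python
from datetime import datetime


def from_failed_login(row):
    """Map a failed-login row onto the common alert shape."""
    return {
        "logged_at": row["attempted_at"],
        "level": "WARNING",
        "source": "failed_login",
        "message": f"Failed login for {row['username']}",
    }


def from_automation(row):
    """Map an automation log row onto the common alert shape."""
    return {
        "logged_at": row["started_at"],
        "level": "ERROR" if row["status"] == "FAILURE" else "INFORMATION",
        "source": "automation",
        "message": f"{row['name']}: {row['status']}",
    }


def merge_sources(failed_logins, automation_runs):
    """Combine every source into one list, oldest first."""
    merged = [from_failed_login(r) for r in failed_logins]
    merged += [from_automation(r) for r in automation_runs]
    return sorted(merged, key=lambda rec: rec["logged_at"])


# Made-up sample rows for each source.
failed_logins = [
    {"attempted_at": datetime(2024, 5, 1, 9, 30), "username": "JSMITH"},
]
automation_runs = [
    {"started_at": datetime(2024, 5, 1, 2, 0), "name": "Nightly sync",
     "status": "FAILURE"},
]

merged = merge_sources(failed_logins, automation_runs)
for rec in merged:
    print(rec["logged_at"], rec["level"], rec["source"], rec["message"])
```

In SQL the equivalent would typically be a view that uses `UNION ALL` across the sources, so the notification job only ever has to query one place.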
Make Decisions with your Logs
Logs can be beneficial in compiling actionable management information. Here is a list of potential metrics (and the tables they can be generated from), along with some actionable insights that can be derived from them.
|Metric|Source table(s)|Actionable insight|
|---|---|---|
|Number of page views by application||Which applications should we be focusing support and development dollars on?|
|Percentage of pages in each application that are used||Identify pages and features that are not being used. Provide leverage to push back on unreasonable features. Retiring unused features and apps reduces support costs and the time to test APEX upgrades.|
|Users who received the most errors||Reach out to users: are they doing something different, do they need training, do changes need to be made to the application?|
|The top ten most used APEX pages across all applications||What features are people using, and what should we be doing more of?|
|Average APEX page load time by application||Is performance remaining steady over time? What performance are end users experiencing?|
|Top ten worst performing APEX pages||Which APEX pages do we need to tune? Focus on the 'Weighted Average' (total page views * average elapsed time). If a page takes two seconds and runs once a month, that may be OK, but if a page takes two seconds and runs ten times a second, then you need to take action.|
|Error and Warning log message count by Application/Page||Are there changes that can be made to improve user experience? Are there testing or QA issues? Which validations are users tripping up on, and can we default data instead?|
|Automation Execution Elapsed Times||Identify slow-running Automations and tune them|
|Outbound Web Service Calls||From a security perspective, it is important to know all of the external web services that are being accessed by your system|
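The 'Weighted Average' from the worst performing pages metric is simple arithmetic: multiply each page's view count by its average elapsed time and sort by the result. A small Python sketch with made-up numbers shows why it ranks pages differently than average load time alone. `rank_pages_by_weight` and the sample figures are illustrative only.

```python
def rank_pages_by_weight(page_stats):
    """Sort pages by total time spent: page views * average elapsed seconds."""
    weighted = [
        {**page, "weighted": page["views"] * page["avg_seconds"]}
        for page in page_stats
    ]
    return sorted(weighted, key=lambda page: page["weighted"], reverse=True)


# Made-up monthly figures for three pages.
page_stats = [
    {"page": "Monthly report", "views": 1, "avg_seconds": 2.0},
    {"page": "Order entry", "views": 50000, "avg_seconds": 2.0},
    {"page": "Home", "views": 200000, "avg_seconds": 0.25},
]

ranked = rank_pages_by_weight(page_stats)
for page in ranked:
    print(f"{page['page']}: {page['weighted']:.0f} seconds in total")
```

Both two-second pages have the same average, but only the heavily used one rises to the top of the tuning list. The fast home page still ranks above the monthly report because of its sheer volume.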
💡All of these metrics can be generated using SQL. I recommend generating these metrics Quarterly and presenting them to senior management. Regular reporting of these metrics facilitates discussion, reinforces APEX's importance to the organization, and helps identify trends.
Logging is a fundamental part of the development and support lifecycle. With some extra thought, you can make your logs actionable and use them to drive management decisions. I hope this post inspires you to look at how you do logging in your organization.