ATR_DAEMON, Utilities, ACMS Audit Trail Logger utilities

***************************** CAUTION ********************************
This sample program has been tested using ACMS 3.1 and 3.2 on VMS 5.3 to 5.5. However, we cannot guarantee its effectiveness because of the possibility of error in transmitting or implementing it. It is meant to be used as a template for writing your own program, and it may require modification for use on your system.
***************************** CAUTION ********************************

A DAEMON FOR ACMS AUDIT TRAIL LOGGER

VAX ACMS (Application Control and Management System) is part of the DECtp product offerings from Digital. It layers onto VMS to provide an Online Transaction Processing (OLTP) environment. Applications are written by defining Tasks in a Task Definition Language. Each Task is broken down into a series of steps that display forms for user interaction and/or call a compiled program to process information.

By using ACMS, applications are built to be multi-threaded, so that many users can access the application while fewer processes actually exist on the system. Users do not use their own VMS processes while in ACMS; they use the ACMS processes. Users can even be defined as captive within ACMS so that no VMS process is ever created for them at all! These captive users still go through VMS security as well as ACMS security before getting access to the system, but ACMS is the one that controls them, so no VMS user process creation is ever needed.

Due to the complexity of ACMS, Digital provides a number of utilities for managing it. These utilities handle starting and stopping the ACMS system, starting and stopping applications, and ACMS User and Terminal authorizations, and they include error reporting tools. There are two error reporting utilities, SWLUP (SoftWare event Logger Utility Program) and ACMSATR (ACMS Audit Trail Report Utility).
SWLUP reports ACMS application software errors that are logged by the SWL process of ACMS, while ACMSATR (ATR for short) provides the audit trail of all ACMS activity.

Those of you who have used ATR are probably aware that when you ask for an ATR listing, you get a BIG listing. It becomes very laborious to wade through all the information that ATR spits out. Some of the information you will see is applications starting, users logging in and out, tasks starting and ending, and abnormal terminations of tasks and/or applications. Even on a small ACMS system, this log can grow rapidly.

ATR provides a few helpful switches to reduce the size and narrow the scope of the listing it provides. You can ask for a specific user, task, application, time, etc., or any combination thereof. There is also a /TYPE switch, which lets you get a report based on certain types of events. Among these is /TYPE=ERROR, which gives a report showing ACMS errors such as a server unable to complete its initialization procedure. Unfortunately, this will not show you errors such as a task cancellation or a DECforms error, even though this information is in the log. You have to go back to a plain LIST and then dig through the listing for these. Listing 1 shows output from my program ATR_ERROR. Note that the Type code on these errors is not ERROR; therefore /TYPE=ERROR would not list these types of problems.

As an example, you receive a call from a user. This user was in a certain task when all of a sudden an error flashed at the bottom of their screen. The screen then cleared and they were back at their main menu. When you ask what the error was, it turns out they didn't get a chance to read it. When you ask which screen they were in (this task has 9 different ones), they can't remember because this happened yesterday morning!
You have now limited your scope to one user, four hours, and one task, but even then, if they were in and out of this task a number of times, it is going to take time to find the error within ATR.

ATR_DAEMON is my answer to the above problem. In short, ATR_DAEMON lives on the system as a detached process. It wakes up periodically, at an interval set by a logical, scans the ACMS$AUDIT_LOG (the file ATR uses) for errors, and then mails these errors to a distribution list. Not only does this isolate errors, it may also allow you to catch an error before the user even tries to notify you of it. ATR_DAEMON consists of 1 TPU file and 2 DCL files. A word of warning: this may lead some users to paranoia and a "Big-Brother-Is-Watching" complex.

To install ATR_DAEMON on your system, you will need to create a directory for the ATR_DAEMON files to reside in. In this directory, place the ATR_DAEMON.COM, ATR_ERROR.COM, and FIND_ATR_ERROR.TPU files. Next, place START_ATR_DAEMON.COM and ATR_DAEMON_LOGICALS.COM in SYS$STARTUP, and then edit your SYSTARTUP_V5.COM to do an @SYS$STARTUP:START_ATR_DAEMON. This will define the ATR_DAEMON system logicals and start the daemon at system startup time. You will now need to edit the ATR_DAEMON_LOGICALS.COM file to customize the logicals for your system. Table 1 gives a breakdown of the logicals and how they are used. Once all the files are in place and the logicals are set to your liking, you can start the daemon by running the START_ATR_DAEMON.COM file.

ATR_DAEMON.COM (program 1) is a DCL command procedure that runs as a detached process. You should run it on the node the ACMS application is on to get the task and application error messages. If you are in a distributed ACMS environment, you may want to have a daemon running on each node to get more information, such as ACMS networking problems. There is a separate ACMS$AUDIT_LOG for each node in an ACMS environment.

When an error occurs in ATR_DAEMON.COM, it is trapped by an ON ERROR statement.
This trap takes the error message, attaches a time stamp, and writes it into the file pointed to by the logical ATR_DAEMON_ERRORS. If it gets an error while handling an error, it will attempt to mail a message saying that it is dying to a distribution list pointed to by the system logical ATR_DAEMON_ERRDIS.

ATR_DAEMON.COM first sets the privileges SYSPRV and READALL. READALL is used to get by any protections that might pop up and is not really necessary, but SYSPRV is required to do the ATR LIST command. The process name is then set to give the daemon an identity, so that you can easily find it when you do a SHOW SYSTEM command. Next, ATR_DAEMON.COM cleans up its old output files and generates the file pointed to by ATR_DAEMON_INPUT by doing an ATR LIST with a delta time of -SLEEP_TIME (SLEEP_TIME is the translation of the logical ATR_DAEMON_SLEEP). It then runs the TPU procedure FIND_ATR_ERROR.TPU to extract the errors. After executing the TPU procedure, the daemon checks whether an error text file, ATR_DAEMON_OUTPUT, was created. If it was, ATR_DAEMON.COM mails the error file to the distribution list ATR_DAEMON_DISTRIB. The daemon then goes to sleep for the length of time specified in ATR_DAEMON_SLEEP before starting the whole cycle again.

FIND_ATR_ERROR.TPU (program 2) is a batch TPU procedure that extracts the errors from the listing produced by ATR. The procedure looks for signs of an error (see the ERROR_PATTERN definition in the source) by looking for keywords on the TEXT line. These keywords are "Task Canceled", "Signal by", "Error", "failed", and "Unsuccessful Appl". An ATR listing separates log entries with a row of asterisks, so when a match is made with ERROR_PATTERN, the TPU procedure simply grabs all of the text found between the two surrounding rows of asterisks. It then checks that the error is not a message from JBC (Batch job successfully submitted, etc.), and if it is not, it writes the entry to a buffer of errors.
After the whole listing has been checked, if any errors were detected, it writes out the error buffer as the file pointed to by the logical ATR_DAEMON_OUTPUT and then exits back to the DCL process.

START_ATR_DAEMON.COM (program 3) is a DCL command procedure that starts the daemon. This procedure can be placed in the SYS$STARTUP directory and executed at system startup time. It first defines the system logicals by executing ATR_DAEMON_LOGICALS.COM (program 4) and then creates a detached process for the daemon. The username for the daemon's process is defined by the logical ATR_DAEMON_USERNAME (SYSTEM is used if this logical is not defined).

Since ATR_DAEMON translates all logicals as they are needed, its configuration is dynamic. You do not have to stop and restart the daemon for your logical name changes to take effect. This gives you the flexibility to change the directories used, the sleep time, and the distribution lists while the daemon is still running. The only logical that is not dynamic is ATR_DAEMON_USERNAME; you must restart the daemon for a change in this logical to take effect.

As a bonus, I modified ATR_DAEMON.COM to create ATR_ERROR.COM (program 5). This is an interactive version of the daemon. It lets you specify the switches to pass to ATR, enclosed in quotes (the default is "/SINCE=TODAY"); it then executes FIND_ATR_ERROR.TPU and displays the errors it finds on your terminal. Since ATR_ERROR uses the logicals ATR_DAEMON_SCRATCH, ATR_DAEMON_INPUT, and ATR_DAEMON_OUTPUT, users should redefine these logicals in their LOGIN.COM procedures to point to their own directories rather than the daemon's directories. If ATR_DAEMON_INPUT and ATR_DAEMON_OUTPUT are defined in terms of ATR_DAEMON_SCRATCH, then users need only redefine ATR_DAEMON_SCRATCH to point to their own area.
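
Both the daemon and ATR_ERROR depend on the same extraction pass, so it may help to see that logic in miniature. The following is a rough Python sketch of the idea, not the TPU source: it splits a listing on rows of asterisks, keeps entries that contain one of the keywords named above, and drops entries that look like JBC messages. The separator rule (ten or more asterisks), the use of the substring "JBC" to spot Job Controller messages, and the choice to search the whole entry rather than only the TEXT line are all assumptions made for illustration.

```python
import re

# Keywords the article lists as signs of an error.
ERROR_KEYWORDS = ("Task Canceled", "Signal by", "Error", "failed", "Unsuccessful Appl")

# Assumption: a separator is a line consisting only of asterisks (10 or more).
SEPARATOR = re.compile(r"^\s*\*{10,}\s*$")

# Assumption: Job Controller messages can be recognized by the substring "JBC".
JBC_MARKER = "JBC"


def split_entries(listing):
    """Split an ATR-style listing into entries delimited by asterisk rows."""
    entries, current = [], []
    for line in listing.splitlines():
        if SEPARATOR.match(line):
            if current:
                entries.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries


def find_errors(listing):
    """Return entries that contain an error keyword and are not JBC messages."""
    errors = []
    for entry in split_entries(listing):
        if JBC_MARKER in entry:
            continue  # skip Job Controller chatter
        # Simplification: the whole entry is searched, not just its TEXT line.
        if any(keyword in entry for keyword in ERROR_KEYWORDS):
            errors.append(entry)
    return errors
```

Calling find_errors() on the text of a listing returns a list of matching entries; an empty list corresponds to the case where no ATR_DAEMON_OUTPUT file is written and nothing is mailed.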
Since the ATR_DAEMON_OUTPUT file is not deleted until you run ATR_ERROR again, you can use it to do quick scans of a previous day's errors or to create daily logs of errors for status reports, and you can edit the error file down and MAIL the errors to the appropriate people for prompt action.

These utilities have been a great help in catching errors, both reported and unreported by users. As it is currently written, ATR_DAEMON can stay active 24 hours a day, even if there are no applications running and ACMS is stopped.

If you are currently working in an ACMS application environment and you have never used the ACMS Audit Trail Logger, I challenge you right now to pull out the manual on this utility and begin getting to know it. As an ACMS System Manager, you will find that ATR can really help you out when you need information about your ACMS system and what is going on in it.
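
For readers who want the daemon's control flow at a glance, the cycle described earlier (list the audit trail since the last wake-up, extract the errors, mail them if there are any, then sleep) can be summarized in a short Python sketch. This is only an illustration of the flow, not a translation of ATR_DAEMON.COM: the function names, the dictionary standing in for the logical name table, and the injected callables are all hypothetical, and the intermediate input and output files are left out. The point it tries to show is why logical name changes take effect without a restart: every logical is looked up again on each pass.

```python
def translate(name, logicals):
    """Look up a logical name at the moment it is needed.

    Hypothetical stand-in for a logical name translation; `logicals`
    is a plain dict playing the role of the logical name table.
    """
    return logicals[name]


def run_cycle(logicals, list_audit, extract, mail):
    """Run one wake-up of the daemon and return the sleep time to use next.

    list_audit : callable(since) -> listing text (stand-in for ATR LIST)
    extract    : callable(listing) -> list of error entries
    mail       : callable(distribution, entries) -> None
    """
    # Re-read the sleep time on every pass so that changes are picked up.
    sleep_time = translate("ATR_DAEMON_SLEEP", logicals)
    # Ask for everything logged since one sleep interval ago.
    listing = list_audit("-" + sleep_time)
    errors = extract(listing)
    if errors:
        # Mail only when the extraction pass found something.
        mail(translate("ATR_DAEMON_DISTRIB", logicals), errors)
    return sleep_time


def run_daemon(logicals, list_audit, extract, mail, sleep, cycles):
    """Repeat the cycle `cycles` times, sleeping between passes."""
    for _ in range(cycles):
        sleep(run_cycle(logicals, list_audit, extract, mail))
```

Because run_cycle() reads ATR_DAEMON_SLEEP and ATR_DAEMON_DISTRIB from the table each time it runs, changing either entry between passes changes the behavior of the next pass, which mirrors the dynamic behavior described for the real daemon.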