SnTrace

From Sense/Net Wiki
Jump to: navigation, search
  •  
  •  
  •  
  •  
  • 100%
  • 6.5.3
  • Enterprise
  • Community
  • Planned

Overview

SnTrace
Every advanced software needs a tool that provides information about the state of the application and what is happening inside. Another important maintenance capability is to measure performance, e.g. the duration of key operations. In Sense/Net ECMS the new SnTrace component is a base of an easy-to-use tracing tool and measuring instrument in one.

For details about logging in general please visit the main Logging article.

Details

First of all, SnTrace is a quick verbose logger. This feature is different from event logging such as writing errors and warnings or audit events. Verbose logging needs to be as fast as possible regardless of whether the feature is turned on or off. It is designed to be able to handle even a huge load of concurrent events and is usually used to monitor the behavior of the system when there are lots of things going on. For example tracing is reponsible for marking all important points of a content life cycle (e.g. saving the data to the Content Repository, than to the security db, than indexing, etc.), while an audit log entry is a single record in the audit log about a content that was created.

The data we collect

We try to place SnTrace calls on every important point in the system: subsystem events (e.g. messaging or security), content life cycle (milestones of content creation like indexing), executed queries, db operations, etc. We organized these events into different categories that can be switched on or off individually. For details see the Categories section below.

Another important task of SnTrace is performance measurement. It is based on a very simple API that is able to write the start and finish time of an operation (e.g. Save) as two separate lines to the trace. Both lines contain a correlation id and the finish line contains the duration too.

This technique is quick enough, the log lines are human readable and the generator code is very simple (see the examples below).

According to our measurements, the system deceleration rate is 1% on a loaded web server and high verbosity logging. Still, it is advisable to switch tracing on when needed and switch it off after.

Where does the data go

Currently we save trace data to physical files to the file system. In the future there will be different providers to make this more flexible - e.g. one that sends everything to a monitoring tool directly instead of the file system. For details please check the Trace files section below.

To ensure the velocity of the module, SnTrace does the following:

  • Buffered writing: we do not write every single line when they come in. There is a configurable in-memory buffer that gets flushed to text files from time to time (see the Buffering section below).
  • After a configured number of lines new files are opened to ensure that they do not become too big.
  • Files are written locally on every web server instead of a shared drive.
  • We use a very simple data structure.

Buffering

Sense/Net ECM can generate a surprising amount of data via SnTrace. Because of this the list of logged categories need to be chosen at the start of every debugging or measuring session carefully (this will result in a less noisy log). To avoid writing to the log files too frequently, we created an in-memory buffer that collects the incoming messages and the lines will be written to the file system in one step once per second (configurable). The default buffer size is huge (10000 lines) and writing frequency is quick (1 sec) but it is possible that even this is not enough. In this case the log will contain a warning line like this:

BUFFER OVERRUN ERROR: Buffer size is 10000, unwritten lines : 1234

Buffer overrun is not a real error, just indicates that the configuration needs to be adjusted. In this case the latest incoming line overwrites the oldest one (in other words some log lines will be forgotten). To avoid this the buffer size needs to be increased (see below).

Sometimes the SnTrace module places an information line into the log about the peek of the used buffer size in the following format:

MaxPdiff: 690

This means: in this session the peak of the buffer position difference was 690 between two flush operations. The "MaxPdiff" line is the "volume meter" to developers and operators.

Trace files

The files are written into the following directory:

.\App_Data\DetailedLog

File names contain the UTC creation date of the file:

detailedlog_20160424-144300Z.log

In this version there is no cleanup mechanism so the files are retained until deleted manually.

Files contain generalized log entries in a tab separated format. This format ensures that the file can be quickly processed: open the file, copy all and paste into a new Excel worksheet.

Line  Time                       Category  ApplicationName          Thread  Op      State  Duration         Message
----- -------------------------  --------  -----------------------  ------  ------  -----  ---------------  -------------------------------
>2461 2016-04-25 07:07:10.71858  Database  A:/LM/W3SVC/2/ROOT-2...  T:357                                   Transaction BEGIN: 91.
2462  2016-04-25 07:07:10.71858  Database  A:/LM/W3SVC/2/ROOT-2...  T:357   Op:725  Start                   SqlProcedure.ExecuteNonQuery...
2463  2016-04-25 07:07:10.73960  Database  A:/LM/W3SVC/2/ROOT-2...  T:357   Op:725  End    00:00:00.021022  SqlProcedure.ExecuteNonQuery...

Columns

  • Line: Number of the line. The '>' character at the start of the line means the beginning of a block.
  • Time: Time when the log entry arrived.
  • Category: Category of the log entry (see Categories below).
  • ApplicationName: Helps identifying the Application if more web servers' logs are merged and sorted by time.
  • Thread: Identifier of the thread.
  • Op: Correlation identifier of the measured operation. Connects the start and end lines.
  • State: Value can be "Start", "End" or "UNSUCCESFUL" if the line belongs to a measured operation (see Measuring operations below).
  • Duration: Duration of the operation. Only operation end writes this value.
  • Message: Raw message provided by the developer.

Categories

There are a number of categories that help classifying the entries. Categories represent the main components of Sense/Net ECM. They can be switched on or off individually to be included in the log - see the Settings and configuration section below for details.

  • ContentOperation: Records content related operations e.g. content saving, templated creation, checkout etc.
  • Database: Many sql procedures and transactions are measured. Deadlocks are also logged.
  • Index: Traces the indexing process including indexing activity queue.
  • Query: Measures content queries and writes some kernel information.
  • Repository: Writes repository system level messages, start and shut down information.
  • Messaging: Helps monitoring the messaging subsystem (e.g. receiving and sending messages).
  • Security: Security related operations and events.
  • Web: Writes the requested url and HTTP method.
  • Workflow: Records background operations of the workflow subsystem.
  • TaskManagement: Records any operations in connection with the TaskManagement subsystem including the TaskManagementWeb, Service and Agent too.
  • Test: Unit testing uses this category. Measures the execution time of every test method and the whole test session.
  • Event: includes events sent through the regular Logging API (see the Event logging section below).
  • Custom: the core product does not use this category, it is reserved for 3rd party developers.

Dynamic categories

(from version 6.5.5)

It is possible to write to user defined categories via calling the Category(name) method. This method returns with a fully functional, dynamic category created on-the-fly. For example:

SnTrace.Category("MyCategory").Write("Custom text");

The result in the trace log:

Line Time          Category   ApplicationName  Th   Op  State Duration Message
---- ------------- ---------- ---------------- ---- --- ----- -------- ----------------
1... 2016-04-27... MyCategory A:UnitTestAda... T:6                     Custom text

The dynamic category can be switched on- or off on the created instance but the initial state is bound to the Enabled property of the Custom category. Except for the main switch, the dynamic category knows everything that the built-in copies so it can write messages and errors and can handle operations too. In the following example the 2nd line is not written to the output because the category is switched off temporarily.

var myCategory = SnTrace.Category("MyCategory");
myCategory.Enabled = true;
myCategory.Write("Message-1");
myCategory.Enabled = false;
myCategory.Write("Message-2");
myCategory.Enabled = true;
myCategory.Write("Message-3");

Result:

Line Time          Category   ApplicationName  Th   Op  State Duration Message
---- ------------- ---------- ---------------- ---- --- ----- -------- ----------------
1... 2016-04-27... MyCategory A:UnitTestAda... T:6                     Message-1
1... 2016-04-27... MyCategory A:UnitTestAda... T:6                     Message-3

Event logging

When you want to monitor an application, errors and warnings are among the most important information so these log entries should not be missing from the detailed log. You can include these entries in the verbose log by switching on the Event category (see above).

Because these events may contain a huge amount of text, SnTrace saves only the message of the log entry. If you want to connect the trace line to the full event message (that can be found in the Event log), use the GUID added to every event line in the trace. The same GUID is also included in the related event data (extended properties section, SnTrace property).

SnTrace - '''#01d500aa-b8c8-423e-8a4a-ecedafd0c936'''

In the detailed log the association id is in the Message after the event classifier:

 Line  Time                       Category  ...   Message
 ----- -------------------------  --------  ...   -------------------------------
 5614  2016-04-27 01:13:10.54725  Event     ...   ERROR '''#01d500aa-b8c8-423e-8a4a-ecedafd0c936''': Invalid content query

Writing custom entries

Although there are a number of built-in categories used by the core product (see the list above), we strongly recommend that you do not use those in your custom code. Please use the Custom category instead (or a dynamic one, as seen above) to avoid confusion. Custom is the default category: writing a trace message without providing a category uses this Custom category. The following lines are equivalent:

SnTrace.Custom.Write("Simple message");
SnTrace.Write("Simple message");

Single line

Recording a simple event or writing a one-row information can be achieved by calling the Write method:

To use this API you will have to add a reference to the SenseNet.Tools.dll library.

var content = Content.CreateNew("Car", TestRoot, Guid.NewGuid().ToString());
content.Save();
SnTrace.Write("Test content created. Id: {0}", content.Id);

The result of the code above:

Line Time                       Category ApplicationName  Th   Op    State        Duration         Message
---- -------------------------  -------- ---------------- ---- ----- ------------ --------------- -------------------------------
>1   2016-04-27 05:15:54.16235  Custom   A:UnitTestAda... T:6                                     Test content created. Id: 1178

Measuring operations

An incredibly useful feature of tracing is that it contains a simple API for measuring how much time it takes to execute a block of code (not necessarily a whole method). This 'operation trace' api generates two trace lines: one for the start, one for the end. The Op and the State columns in the log help you identify the start and the end of an operation, and the Duration column will contain the time it took.

using (var op = SnTrace.StartOperation("Measured outer block"))
{
    Thread.Sleep(42);
    using (var op1 = SnTrace.StartOperation("Measured inner block"))
    {
        Thread.Sleep(42);
        //op1.Successful = true;
    }
    Thread.Sleep(42);
    op.Successful = true;
}

As you can see, developers have to explicitly state that the operation was successful by setting the Successful flag on the operation object. If this does not happen (as in case of the inner block in the code above - see the commented line), the operation is considered unsuccessful and this gets written into the log.

For developers: this technique helps the system work even in case of 'hidden' program flows (e.g. exceptions or returns from middle of a method).

Line Time                       Category ApplicationName  Th   Op    State        Duration         Message
---- -------------------------  -------- ---------------- ---- ----- ------------ --------------- ---------------------
>1   2016-04-27 05:15:54.16335  Custom   A:UnitTestAda... T:6  Op:1  Start                        Measured outer block
2    2016-04-27 05:15:54.20636  Custom   A:UnitTestAda... T:6  Op:2  Start                        Measured inner block
3    2016-04-27 05:15:54.25036  Custom   A:UnitTestAda... T:6  Op:2  UNTERMINATED 00:00:00.044005 Measured inner block
4    2016-04-27 05:15:54.29336  Custom   A:UnitTestAda... T:6  Op:1  End	  00:00:00.130007 Measured outer block

Writing errors

There is a simple way to write an error line into the log that can be found easily. This is not a complete error log with detailed stack trace but a simple message line with the ERROR keyword in the State column:

SnTrace.WriteError("Error in execution.");

And the result:

Line Time                       Category ApplicationName  Th   Op    State  Duration        Message
---- -------------------------  -------- ---------------- ---- ----- ------ --------------- ---------------------
>1   2016-04-27 05:15:53.18323  Custom   A:UnitTestAda... T:6        ERROR                  Error in execution.

If you want a complete error log, please use the regular Logging API.

Settings and configuration

SnTrace was designed simplicity in mind so that you can switch it off or on easily. This is why we use the Settings feature of Sense/Net ECM for that. More complex configuration values are still stored in the web.config.

Settings

The settings file can be found in the global Settings folder:

  • /Root/System/Settings/Logging.settings

It contains switches for all the categories so that you can switch them on or off individually:

Trace: {
	ContentOperation: false,
	Database: false,
	Index: false,
	Query: false,
	Repository: false,
	Messaging: false,
	Security: false,
	Web: false,
	Workflow: false,
	Test: false,
	Event: false,
	Custom: false,
}

Configuration

You can fine-tune the behavior of the detailed logger by modifying the value of the following configuration items in the web.config.

  • BufferSize: Allocated line buffer. Default: 10000.
  • WriteToFileDelay: Time between end of previous and start of next writing in milliseconds. Default: 1000.
  • MaxWritesInOneFile: Number of writes into a single file. Default: 100. After this a new log file will be created.
  • LinesPerTrace: Frequency of writing information about the state of the detailed logger to the log. Default: 1000.
<sensenet>
    <detailedLogger>
      <add key="BufferSize" value="10000" />
      <add key="WriteToFileDelay" value="1000" />
      <add key="MaxWritesInOneFile" value="100" />
      <add key="LinesPerTrace" value="1000" />
    </detailedLogger>
</sensenet>


Performance considerations

There are couple of things to consider when using SnTrace as a developer or operator:

  • Do not keep all categories switched on for a long time (it generates to much data anyway that will be hard to process).
  • Do not use complex information gathering when you write to the trace, it may effect time measurement and slow down the portal. If it is necessary, test whether the category (Custom) is enabled or not.
  • Delete log files on the server from time to time, they may take up a lot of space.
  • Do not open an active file (during measuring) to prevent IO errors.

Known issues

  • Smart formatting does not work yet (e.g. IEnumerable values are not written correctly) - please work with simple strings.
  • Sometimes one or more rows change places in the log during high load. This may happen close to the block start or end.
  • Opening an active file may cause an error, but only in the opening application (e.g. Notepad). It does not cause any errors in the trace process.

Related links

References

There are no external references for this article.