Debugging SharePoint Workflows
May 5, 2009
This blog post is about tips and tricks for monitoring the health of SharePoint workflows. I will like to suggest this excellent MSDN article for additional information.
Logging information about the progress of workflows
The WF tracking service logs the events as the workflow progresses along. For example, consider a simple workflow (see below) that creates a task and then loops until the task is 100% complete.
By turning the tracking service on (see [1]) we can capture entries like the following, the trace output below corresponds to the workflow in Figure 1 above:
System.Workflow.Runtime.Hosting Information: 0 : Creating instance 1888f8e6-145c-4220-be52-99cfd09098a7
System.Workflow.Runtime Information: 1 : Workflow Runtime: Scheduler: InstanceId: 1888f8e6-145c-4220-be52-99cfd09098a7 : Running scheduled entry: SubscriptionEvent((1)Workflow1, ActivityStatusChange(‘(1)createTask1‘, Closed, Succeeded))
System.Workflow.Runtime Information: 0 : Activity Status Change – Activity: whileActivity1 Old:Initialized; New:Executing
// At this point the workflow is waiting for the user to update the task. So the workflow runtime can hydrate the running instance to the database
System.Workflow.Runtime Information: 0 : Workflow Runtime: WorkflowExecutor: Got an unload request for instance 1888f8e6-145c-4220-be52-99cfd09098a7
System.Workflow.Runtime Information: 0 : 1888f8e6-145c-4220-be52-99cfd09098a7: Calling PerformUnloading(false) on instance 1888f8e6-145c-4220-be52-99cfd09098a7 hc 13970169
System.Workflow.Runtime Information: 0 : Workflow Runtime: WorkflowExecutor: Unloading instance 1888f8e6-145c-4220-be52-99cfd09098a7
System.Workflow.Runtime.Hosting Information: 0 : TimerEventSubscriptionQueue: 1888f8e6-145c-4220-be52-99cfd09098a7 Suspend
System.Workflow.Runtime Information: 0 : 1888f8e6-145c-4220-be52-99cfd09098a7: Calling Persist
// At this point the user marks the task as complete . So the workflow runtime can deserialize the workflow and pass it the TaskChanged event. This results in re-evaluation of the while loop
System.Workflow.Runtime Stop: 0 : Workflow Trace
System.Workflow.Runtime.Hosting Information: 0 : Deserialized a Workflow1 [SampleWorkflow.Workflow1] to length 8660. Took 00:00:00.0400576.
System.Workflow.Runtime Information: 0 : Workflow Runtime: WorkflowExecutor: Loading instance 1888f8e6-145c-4220-be52-99cfd09098a7
System.Workflow.Runtime Information: 0 : Activity Status Change – Activity: onTaskChanged1 Old:Executing; New:Closed
System.Workflow.Runtime Information: 1 : Workflow Runtime: Scheduler: InstanceId:
1888f8e6-145c-4220-be52-99cfd09098a7 : Scheduling entry: SubscriptionEvent((1)whileActivity1, ActivityStatusChange(‘(2)onTaskChanged1′, Closed, Succeeded))
As you can see from the snippets above, we have information about step by step execution of the workflow.
Workflow Failure Conditions
Some of the common reasons why a workflow can fail include:
a. The WF program instance errors out because of an exception. For example, if a null object reference is encountered inside the CreateTask handler, in the workflow in Figure 1, above. This exception will cause the WF instance to move to the error state (reflected in the status column of the document library)
The tracking service will record this exception and typically log detailed information as shown below:
System.Workflow.Runtime Critical: 0 : Uncaught exception escaped to the root of the workflow.
In instance 9e644d58-9990-443a-a595-1685fec2c311 in activity
Inner exception: System.NullReferenceException: Object reference not set to an instance of an object.
at SampleWorkflow.Workflow1.TaskCreation(Object sender, EventArgs e)
at System.Workflow.ComponentModel.Activity.RaiseEvent(DependencyProperty dependencyEvent, Object sender, EventArgs e)
at System.Workflow.Activities.CallExternalMethodActivity.Execute(ActivityExecutionContext executionContext)
at System.Workflow.ComponentModel.ActivityExecutor`1.Execute(T activity, ActivityExecutionContext executionContext)
at System.Workflow.ComponentModel.ActivityExecutor`1.Execute(Activity activity, ActivityExecutionContext executionContext)
at System.Workflow.ComponentModel.ActivityExecutorOperation.Run(IWorkflowCoreRuntime workflowCoreRuntime)
at System.Workflow.Runtime.Scheduler.Run()
System.Workflow.Runtime Information: 0 : Workflow Runtime: WorkflowExecutor: Terminating instance 9e644d58-9990-443a-a595-1685fec2c311
One more point about exception conditions – there are times when we need to thrown an exception ourselves. For instance, if the user does not have appropriate permission, or there is missing data. In those cases, we could set a custom error message (See [2])
b. The Application pool is recycled. The state from a previous persist point (if there was one – typically a delay activity or OnTaskChanged activity) is persisted in the database, but since there is no retry mechanism, there is no way to re-start the persisted workflow instance. For example, if the sample WF program instance was executing inside the While activity (whileActivity1 in Figure 1) when the app pool crashed, there is no automatic way to have the workflow restarted.
One potential solution would be to model the workflow as a state machine and include retry logic. But this would add complexity to the workflow.
c. If the correlation token is being set dynamically, there is a chance that the value gets incorrectly set in some cases. As a result, the waiting WF program instance will never receive the event. A correlation token is an identifier WF uses to tie activities to a common task – for example if CreateTask, OnTaskChanged and CompleteTask relate to a single task; they should have the same correlation token.
d. SharePoint workflow activities (like Create Task) delay database commits until a persist point is reached. This means that a CreateTask activity will not result in a “real” task being added to the list until a persist point is reached. So any direct SharePoint OM calls that attempt to reference the created task will fail until the point a persist point is reached. Please refer to [3] to dump out the
[1] Workflow Diagnostics
Add the following section to the web.config:
<system.diagnostics>
<switches>
<add name=“System.Workflow LogToTraceListeners“ value=“1“ />
<add name=“System.Workflow.Runtime.Hosting“ value=“All“ />
<add name=“System.Workflow.Runtime“ value=“All“ />
<add name=“System.Workflow.Runtime.Tracking“ value=“All“ />
<add name=“System.Workflow.Activities“ value=“All“ />
</switches>
<trace autoflush=“true“ indentsize=“4“>
<listeners>
<add name=“customListener“
type=“System.Diagnostics.TextWriterTraceListener“
initializeData=“WFTrace.log“ />
</listeners>
</trace>
</system.diagnostics>
Additionally, we can use stsadm to capture the trace messages from Workflow Infrstructure as shown below:
@echo off
set SPAdminTool=%CommonProgramFiles%\Microsoft Shared\web server extensions\12\BIN\stsadm.exe
rem echo Logging levels before…
rem “%SPAdminTool%” -o listlogginglevels
echo Setting levels…
stsadm -o setlogginglevel -category “Workflow Features;Workflow Infrastructure” -tracelevel Verbose -windowslogginglevel Error
echo Restarting SPTrace service…
net stop sptrace
net start sptrace
rem echo Logging levels after…
rem “%SPAdminTool%” -o listlogginglevels
pause
[2] Adding custom status message
1. Add a custom status in workflow.xml
<ExtendedStatusColumnValues>
<StatusColumnValue>
Failed to start due to insufficient permissions
</StatusColumnValue>
</ExtendedStatusColumnValues>
2. Add the following code for the invoking method
private void setState1_MethodInvoking(object sender, EventArgs e)
{
((Microsoft.SharePoint.WorkflowActions.SetState)sender).State = ((Int32)SPWorkflowStatus.Max);
}
[3] Code to extract the persisted workflow state
Since the workflow runtime is finicky about changes such as adding private variables, it is useful to dump the persisted state.
System.Data.SqlClient.SqlCommand cmd = new System.Data.SqlClient.SqlCommand(
“select InstanceData from dbo.workflow where InstanceDataSize > 0″);
System.Data.SqlClient.SqlConnection conn = new System.Data.SqlClient.SqlConnection(
“Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=WSS_Content_Portal;Data Source=CTSDEV1″);
conn.Open();
cmd.Connection = conn;
byte[] image = (byte[])cmd.ExecuteScalar();
System.IO.FileStream fs = new System.IO.FileStream(@”c:\data.gz”, System.IO.FileMode.CreateNew);
fs.Write(image, 0, image.Length);
fs.Flush();
fs.Close();
May 6, 2009 at 2:17 pm
[...] Debugging SharePoint Workflows (Vishwas Lele) [...]
May 22, 2009 at 12:06 am
Пора переименовать блог, присвоив название связанное с доменами
может хватит про них?
May 23, 2009 at 9:25 pm
Захватывающе. Зачет! и ниипет!
May 24, 2009 at 6:31 pm
Здравствуй! Спасибо за подаренные хорошие эмоции…
August 6, 2009 at 8:02 pm
[...] on WF tracing (see my previous post for more information). Trace logs provide a great source of information on how a given workflow [...]
November 6, 2009 at 5:07 am
Спасибо было интересно читать.