最近项目中遇到了一个奇怪的问题。在调用Nhibernate去更新数据库的时候不定期的会出现timeout exception:
NHibernate.HibernateException: An exception occurred when executing batch queries ---> System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

由于之前没有出现过此类问题而且代码和环境都没有变过,所以首先想到的是数据库的问题导致的。于是叫上DBA一起把数据库日志翻了一遍,然而并没发现什么异常,最终归结为数据库性能问题,他们也没什么办法…
此刻一句名言飘荡在我的脑海之中Fuck the world if you are rich, otherwise fuck yourself.
好吧,从程序上入手看看有没有什么办法规避或者解决这个问题。首先重新查看了我们自己出错的日志,发现发生executing batch queries异常的时间都在30~35 sec,而这个异常是指执行数据库语句时抛的异常。而我们在Nhibernate的配置文件hibernate.cfg.xml中所配置的command_timeout是300sec,<property name="command_timeout">300</property>。为什么这个配置没有生效呢?
在进行了海量搜索以及验证之后,终于fuck myself success。总结如下:

两个对应关系

  • Connect String中加入的ConnectTimeOut <<========>> ADO.NET中SqlConnection.ConnectionTimeout:等待连接打开所需的时间(以秒为单位)。 默认值为 15 秒。detail
  • Nhibernate中配置的IDbCommand的CommandTimeout <<========>> ADO.NET中SqlCommand.CommandTimeout:等待命令执行所需的时间(以秒为单位)。 默认值为 30 秒。detail

问题所在

通过搜索发现在StackOver上也有人遇到过同样的问题。总结了一下大概意思是说:在设置了10属性之后,NHibernate会去指定用ADO.net去做批量更新。而在NHibernate中对于批量更新用到的类是SqlClientBatchingBatcher,这个类中所用到的command timeout值是从Environment.GlobalProperties中取得的。而配置文件hibernate.cfg.xml中所设置的command timeout是会用于创建SqlClientDriver for SQL Server时用于初始化DriveBase类的。detail
为了验证这个说法,下载了NHibernate 2.1.2
SqlClientBatchingBatcher类中给command timeout赋值的代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
public SqlClientBatchingBatcher(ConnectionManager connectionManager, IInterceptor interceptor): base(connectionManager, interceptor)
{
batchSize = Factory.Settings.AdoBatchSize;
currentBatch = new SqlClientSqlCommandSet();
SetCommandTimeout();
//we always create this, because we need to deal with a scenario in which
//the user change the logging configuration at runtime. Trying to put this
//behind an if(log.IsDebugEnabled) will cause a null reference exception
//at that point.
currentBatchCommandsLog = new StringBuilder().AppendLine("Batch commands:");
}
private void SetCommandTimeout()
{
int timeout = PropertiesHelper.GetInt32(Cfg.Environment.CommandTimeout, Cfg.Environment.Properties, -1); ---> look here
//FileStream fs = new FileStream(@"batchCommandTimeOut.txt", FileMode.Create);
//StreamWriter sw = new StreamWriter(fs);
//sw.WriteLine("timeout = " + timeout.ToString());
//sw.Flush();
//sw.Close();
//fs.Close();
if (timeout > 0)
{
try
{
currentBatch.CommandTimeout = timeout; ---> look here
}
catch (Exception e)
{
if (log.IsWarnEnabled)
{
log.Warn(e.ToString());
}
}
}
}

可知取的是Cfg.Environment.CommandTimeout,而这个就对应的Global的配置。
而在DriveBase中对command timeout的赋值代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
public virtual IDbCommand GenerateCommand(CommandType type, SqlString sqlString, SqlType[] parameterTypes)
{
IDbCommand cmd = CreateCommand();
cmd.CommandType = type;
SetCommandTimeout(cmd, PropertiesHelper.GetInt32(Environment.CommandTimeout, Environment.Properties, -1));
SetCommandText(cmd, sqlString);
SetCommandParameters(cmd, parameterTypes);
return cmd;
}
private void SetCommandTimeout(IDbCommand cmd, object envTimeout)
{
if (commandTimeout >= 0)
{
//FileStream fs = new FileStream(@"DriveBaseCommandTimeOut.txt用来记录", FileMode.Create);
//StreamWriter sw = new StreamWriter(fs);
//sw.WriteLine("timeout = " + commandTimeout.ToString());
//sw.Flush();
//sw.Close();
//fs.Close();
try
{
cmd.CommandTimeout = commandTimeout;
}
catch (Exception e)
{
if (log.IsWarnEnabled)
{
log.Warn(e.ToString());
}
}
}
}

可知这里是将commandTimout直接赋值给了cmd.CommandTimeout,而commandTimeout是会读取hibernate.cfg.xml这个配置文件的。所以在hibernate.cfg.xml中配置的command timeout为300是会作用到这里。

解决办法

对于batch query timeout的异常,我们可以通过加长SqlClientBatchingBatcher的command timeout来解决。具体怎么加有两种方法:

  1. 在全局配置文件中配置NHibernate的command timeout,一般是App.config。

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
    <configSections>
    <section name="hibernate-configuration" type="NHibernate.Cfg.ConfigurationSectionHandler, NHibernate"/>
    </configSections>
    <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2" >
    <session-factory>
    <property name="adonet.batch_size">10</property>
    <property name="command_timeout">300</property>
    <property name="proxyfactory.factory_class">NHibernate.ByteCode.Castle.ProxyFactoryFactory, NHibernate.ByteCode.Castle</property>
    </session-factory>
    </hibernate-configuration>
    </configuration>
  2. 在程序中,通过反射来设置全局的command timeout。

    1
    2
    3
    4
    5
    FieldInfo field = typeof(global::NHibernate.Cfg.Environment).GetField("GlobalProperties", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Static);
    Dictionary<string, string> gloablProperties = field.GetValue(null) as Dictionary<string, string>;
    gloablProperties.Add("command_timeout","300");

总结

在更改了全局配置之后,生产环境中的timeout得到了解决。所以有些时候还是要多从内部找问题的。