PowerShell 使用SqlScriptDOM对T-SQL做规则校验
对于数据项目来说,编写Sql是一项基本任务同时也是数量最多的代码。为了统一项目代码规范同时降低Code Review的成本,因此需要通过自动化的方式来进行规则校验。由于本人所在的项目以SQL Server数据库为基础,于是本人决定通过使用SqlScriptDom类库来做T-SQL的规则校验。如果是其他数据库项目,则可采用ANTLR库做规则校验,其实现的方式大体一致。
SqlScriptDom是针对SQL Server的.Net的类库,由微软公司开发并开源,
- 使用脚本开发比较灵活,不用编译,开发即可部署。
- Powershell可以直接使用.Net类库,并且具有高级语言的一些特点方便开发。
项目使用VS Code作为开发调试工具,需要安装Powershell相关的插件。由于要使用到PowerShell的自定义类来开发,所以需要提前将类库加载到Powershell中,因此需要配置Powershell的环境。如何配置环境可以参考这篇文章,关于配置文件 - PowerShell | Microsoft Learn,
以下是具体代码
1 using namespace Microsoft.SqlServer.TransactSql.ScriptDom 2 using namespace System 3 using namespace System.Collections.Generic 4 using namespace System.IO 5 using namespace Management.Automation 6 using namespace System.Reflection 7 8 enum Severity { 9 Information = 1 10 Warning = 2 11 Exception = 3 12 Fault = 4 13 } 14 15 enum ResponseCode { 16 Success = 0 17 Exception = 10001 18 ParseError = 10002 19 }
using namespace Microsoft.SqlServer.TransactSql.ScriptDom
这句是使用了命名空间,后面在使用相关对象时候无需采用完全限定名,从而简化代码。随后定义了两个枚举,Severity定义规则的严重程度,ResponseCode定义在程序处理过程中的各种状态。
下面定义CustomerParser类,该类的功能是接收输入的Sql代码,通过语法和词法分析后生成相关语法树,再对语法树进行分析,从而判断代码中哪些片段是违反了项目的编码规则,从而达到Code Review的作用。
1 class CustomParser { 2 3 hidden [TSqlParser] $TSqlParser 4 hidden [TSqlFragment]$Tree 5 hidden $AnalysisCodeSummary = [PSCustomObject]([ordered]@{ 6 ResponseCode = [ResponseCode]::Success; 7 ResponseMessage = "Success"; 8 FileName = $null; 9 DocumentName = $null; 10 Code = $null; 11 IsDocument = $true; 12 ParseErrors = [List[ParseError]]::new(); 13 ValidationResults = [List[psobject]]::new(); 14 }) 15 16 hidden [bool] $IsDocument 17 hidden [string] $FileName 18 hidden [string] $Code 19 20 hidden CustomParser([SqlVersion]$version, [SqlEngineType]$engineType) { 21 switch ($version) { 22 [SqlVersion]::Sql120 { $this.TSqlParser = [TSql120Parser]::new($true) } 23 [SqlVersion]::Sql130 { $this.TSqlParser = [TSql130Parser]::new($true, $engineType) } 24 [SqlVersion]::Sql140 { $this.TSqlParser = [TSql140Parser]::new($true, $engineType) } 25 [SqlVersion]::Sql150 { $this.TSqlParser = [TSql150Parser]::new($true, $engineType) } 26 Default { $this.TSqlParser = [TSql160Parser]::new($true, $engineType) } 27 } 28 } 29 30 hidden [void] Parse() { 31 $this.AnalysisCodeSummary.FileName = $this.FileName 32 $this.AnalysisCodeSummary.IsDocument = $this.IsDocument 33 $this.AnalysisCodeSummary.DocumentName = [Path]::GetFileName($this.FileName) 34 35 [StringReader]$reader = $null 36 [ParseError[]]$errors = @() 37 38 try { 39 if ($this.IsDocument) { $this.Code = [File]::ReadAllText($this.FileName) } 40 $this.AnalysisCodeSummary.Code = $this.Code 41 $reader = [StringReader]::new($this.Code) 42 $this.Tree = $this.TSqlParser.Parse($reader, [ref] $errors) 43 } 44 catch { 45 $this.AnalysisCodeSummary.ResponseCode = [ResponseCode]::Exception 46 $this.AnalysisCodeSummary.ResponseMessage = $_.Exception.Message 47 return 48 } 49 finally { 50 if ($null -ne $reader) { $reader.Close() } 51 } 52 53 if ($errors.Count -ne 0) { 54 $this.AnalysisCodeSummary.ResponseCode = [ResponseCode]::ParseError 55 $this.AnalysisCodeSummary.ResponseMessage = "An error occurred while parsing the code." 56 $this.AnalysisCodeSummary.ParseErrors = $errors 57 } 58 } 59 60 hidden [void]Validate([BaseRule] $rule, [bool]$lockRule) { 61 [psobject]$validationResult = [PSCustomObject]([ordered]@{ 62 ResponseCode = [ResponseCode]::Success; 63 ResponseMessage = "Success"; 64 RuleName = $rule.RuleName; 65 Descrtiption = $rule.Descrtiption; 66 Severity = $rule.Severity; 67 Validated = $true; 68 AnalysisCodeResults = @(); 69 }) 70 $lockTaken = $false 71 try { 72 if ($lockRule) { [Threading.Monitor]::Enter($rule.AnalysisCodeResults, [ref] $lockTaken) } 73 $rule.AnalysisCodeResults = @() 74 $this.Tree.Accept($rule) 75 $validationResult.AnalysisCodeResults += $rule.AnalysisCodeResults 76 } 77 catch { 78 $validationResult.ResponseCode = [ResponseCode]::Exception 79 $validationResult.ResponseMessage = $_.Exception.Message 80 return 81 } 82 finally { 83 if ($lockTaken) { [Threading.Monitor]::Exit($rule.AnalysisCodeResults) } 84 $validationResult.Validated = $validationResult.ResponseCode -eq [ResponseCode]::Success ` 85 -and (( $validationResult.AnalysisCodeResults | Where-Object { -not $_.Validated } ).Count -eq 0) 86 87 if (-not $validationResult.Validated) { 88 $this.AnalysisCodeSummary.ValidationResults += $validationResult 89 } 90 } 91 } 92 93 static [psobject] Analysis([string]$codeOrFile, [bool]$isDocumnet, [BaseRule[]]$rules) { 94 [CustomParser]$parser = [CustomParser]::new([SqlVersion]::Sql130, [SqlEngineType]::All) 95 if (-not $isDocumnet) { $parser.Code = $codeOrFile }else { $parser.FileName = $codeOrFile } 96 $parser.IsDocument = $isDocumnet 97 $parser.Parse() 98 if ($parser.AnalysisCodeSummary.ResponseCode -eq [ResponseCode]::Success) { 99 foreach ($rule in $rules) { 100 $parser.Validate($rule, $false) 101 } 102 } 103 return $parser.AnalysisCodeSummary 104 } 105 106 static [psobject[]] Analysis([string[]]$files, [BaseRule[]]$rules) { 107 $result = @() 108 foreach ($file in $files) { $result += [CustomParser]::Analysis($file, $true, $rules) } 109 return $result 110 } 111 }
hidden [TSqlParser] $TSqlParser 该变量是T-SQL的分析器,通过该变量的Parse方法将SQL解析成语法树,hidden表示该变量仅在类内部使用。
hidden [TSqlFragment]$Tree 该变量则存储分析后的语法树
1 hidden $AnalysisCodeSummary = [PSCustomObject]([ordered]@{ 2 ResponseCode = [ResponseCode]::Success; 3 ResponseMessage = "Success"; 4 FileName = $null; 5 DocumentName = $null; 6 Code = $null; 7 IsDocument = $true; 8 ParseErrors = [List[ParseError]]::new(); 9 ValidationResults = [List[psobject]]::new(); 10 })
该变量是存储语法分析和规则分析的结果。ParseErrors列表存储的是当语法分析出错时的错误结果。ValidationResults列表则存储的是每条规则校验后的结果。
1 hidden CustomParser([SqlVersion]$version, [SqlEngineType]$engineType) { 2 switch ($version) { 3 [SqlVersion]::Sql120 { $this.TSqlParser = [TSql120Parser]::new($true) } 4 [SqlVersion]::Sql130 { $this.TSqlParser = [TSql130Parser]::new($true, $engineType) } 5 [SqlVersion]::Sql140 { $this.TSqlParser = [TSql140Parser]::new($true, $engineType) } 6 [SqlVersion]::Sql150 { $this.TSqlParser = [TSql150Parser]::new($true, $engineType) } 7 Default { $this.TSqlParser = [TSql160Parser]::new($true, $engineType) } 8 } 9 }
CustomParser类的构造函数,$version定义了使用那个版本的分析器,比如Sql130就对应Sql Server2016,$engineType定义了使用哪种类型的引擎,是Sql Server还是Azure亦或两者都采用。该类包含了两个方法,Parse方法是做语法分析的。Validate方法则是做规则校验,该方法的$rule参数是传入的各种验证规则,均继承自BaseRule类。$lockRule是当采用多线程执行时是否加锁来保证结果完整。
下面则是BaseRule的代码。
1 class BaseRule:TSqlFragmentVisitor { 2 3 [string]$Descrtiption 4 [Severity]$Severity = [Severity]::Information 5 $AnalysisCodeResults = @() 6 [string]$RuleName = $this.GetType().Name 7 hidden [string] $Additional 8 9 hidden [void] Validate([TSqlFragment] $node, [bool] $validated , [string] $addtional) { 10 $this.AnalysisCodeResults += [BaseRule]::GetAnalysisResult($node, $validated, $addtional) 11 } 12 13 static [BaseRule[]] GetAllRules() { 14 return [Assembly]::GetAssembly([BaseRule]).GetTypes() ` 15 | Where-Object { $_ -ne [BaseRule] -and $_.BaseType -eq [BaseRule] } ` 16 | ForEach-Object { New-Object $_ } 17 } 18 19 static [psobject] GetAnalysisResult([TSqlFragment] $node, [bool] $validated , [string] $addtional) { 20 return [PSCustomObject]([ordered]@{ 21 StartLine = $node.StartLine; 22 EndLine = if ($node.LastTokenIndex -gt 0) { $node.ScriptTokenStream[$node.LastTokenIndex].Line } else { $node.LastTokenIndex } 23 StartColumn = $node.StartColumn; 24 Validated = $validated; 25 Text = if ($node.FragmentLength -gt 0) ` 26 { $node.ScriptTokenStream[$node.FirstTokenIndex..$node.LastTokenIndex].Text -join [string]::Empty } ` 27 else { $null } 28 Additional = $addtional 29 }) 30 } 31 }
它继承自TSqlFragmentVisitor,Validate方法用来解析被规则命中的语法节点,并记录该节点在代码中的详情,如该节点在代码中的开始行,结束行,代码段等,方便定位相关的Sql代码。同时将这些记录添加到AnalysisCodeResults列表,并将该列表的数据添加到CustomParser类中的ValidationResults列表中。具体规则通过重写基类的Visit方法来实现代码分析。此外还定义了一个静态方法GetAllRules用以获取项目中所有的规则。以上便是整个项目的核心,接下来将介绍一些具体样例。
首先做一个简单的例子,比如我们规定在Select中不能包含星号(*)。代码如下:
1 class PDE001: BaseRule { 2 PDE001() { 3 $this.Descrtiption = "Asterisk in select list." 4 $this.Severity = [Severity]::Warning 5 } 6 7 [void] Visit([SelectStarExpression] $node) { 8 $this.Validate($node, $false, $null) 9 } 10 }
够简单了吧,首先继承自BaseRule类,然后重写Visit方法。由于Visit被重载了很多,我们选择参数类型为SelectStarExpression的方法,当语法树中存在这个节点的时候,我们调用基类的$this.Validate($node, $false, $null)方法,并记录了该节点的详情,这样就代表Sql代码没能通过该条规则。比如我们写下这样一条Sql,SELECT * FROM TEST 然后通过调用来看下执行结果,可以看到规则被命中,Validated属性为False,表示验证没通过。
接下来我再讲一条比较复杂的规则。比如我们在做数据操作的时,为了降低对资源的占用时间。我们不能直接插入,删除或者更新大批量数据,这是就需要将数据分成小批量,然后通过循环的方式来处理。为了阻止这种大批量数据的操作,我们需要制定该规则。当然该规则也会有一些特例,如被处理的对象是表变量或者临时表,则可以忽略该规则。以下是该规则的代码实现
1 class PDE003:BaseRule { 2 PDE003() { 3 $this.Descrtiption = "You should use batch operations in statements." 4 $this.Severity = [Severity]::Exception 5 } 6 7 hidden [int]$start = 0 8 hidden [int]$end = 0 9 10 [void] Visit([UpdateDeleteSpecificationBase]$node) { 11 $target = $node.Target 12 13 if ($target -is [VariableTableReference]) { return } 14 if ($this.CheckWhile($node)) { return } 15 [NamedTableReference] $namedTableReference = $target -as [NamedTableReference] 16 $targetTable = $namedTableReference.SchemaObject.BaseIdentifier.Value 17 18 if ($targetTable -imatch "^#{1,2}") { return } 19 20 $fromClause = $node.FromClause 21 if ($null -ne $fromClause) { 22 [TemporaryTableVisitor]$tempVisitor = [TemporaryTableVisitor]::new($fromClause, $targetTable) 23 $fromClause.AcceptChildren($tempVisitor) 24 if ($tempVisitor.Validated) { return } 25 } 26 $this.Validate($node, $false, $null) 27 } 28 29 [void] Visit([InsertSpecification]$node) { 30 $target = $node.Target 31 if ($target -is [VariableTableReference]) { return } 32 if ($this.CheckWhile($node)) { return } 33 $namedTableReference = $target -as [NamedTableReference] 34 if ($namedTableReference.SchemaObject.BaseIdentifier.Value -imatch "^#{1,2}") { return } 35 $valuesInsertSource = $node.InsertSource -as [ValuesInsertSource] 36 if ($null -ne $valuesInsertSource) { return } 37 38 $this.Validate($node, $false, $null) 39 } 40 41 [void] Visit([MergeSpecification]$node) { 42 $target = $node.Target 43 if ( $this.CheckWhile($node)) { return } 44 if ($target -is [VariableTableReference]) { return } 45 $namedTableReference = $target -as [NamedTableReference] 46 if ($namedTableReference.SchemaObject.BaseIdentifier.Value -imatch "^#{1,2}") { return } 47 $this.Validate($node, $false, $null) 48 49 } 50 51 [void] Visit([WhileStatement]$node) { 52 $this.start = $node.StartLine 53 $this.end = $node.ScriptTokenStream[$node.LastTokenIndex].Line 54 } 55 56 hidden [bool] CheckWhile([TSqlFragment] $node) { 57 return $node.StartLine -ge $this.start -and $node.ScriptTokenStream[$node.LastTokenIndex].Line -le $this.end 58 } 59 } 60 61 class TemporaryTableVisitor:TSqlFragmentVisitor { 62 63 [bool]$Validated = $false 64 hidden [string] $pattern = "^(@|#{1,2})" 65 hidden [FromClause]$fromClause 66 hidden [string]$target 67 68 TemporaryTableVisitor([FromClause]$fromClause, [string]$target) { 69 $this.fromClause = $fromClause 70 $this.target = $target 71 if ($null -eq $fromClause) { $this.Validated = $true } 72 } 73 74 [void] Visit([NamedTableReference]$node) { 75 $tableName = $node.SchemaObject.BaseIdentifier.Value 76 $alias = $node.Alias.Value 77 if ($this.target -in $alias, $tableName) { 78 $this.Validated = $this.Validated -or ($tableName -imatch $this.pattern) 79 } 80 } 81 82 [void] Visit([VariableTableReference]$node) { 83 $tableName = $node.Variable.Name 84 $alias = $node.Alias.Value 85 if ($this.target -in $alias, $tableName) { 86 $this.Validated = $this.Validated -or ($tableName -imatch $this.pattern) 87 } 88 } 89 }
该类还引用了另外一个辅助类,辅助类是处理当前节点为Insert、Update、Delete和Merge语句的时候,获取该节点的FROM节点中中的表对象,并判断该表是否属于临时表或者表变量且用作目标表,如果是则忽略该规则。
当直接输入 DELETE A FROM TEST1 A INNER JOIN TEST2 B ON A.ID=B.ID,我们可以看到规则阻挡了该语句,这时Validated属性为false。
当我们代码变成 DELETE A FROM #TEST1 A INNER JOIN TEST2 B ON A.ID=B.ID,意味着我们更新的目标表是临时表时,规则通过了该段代码,且Validated属性为true。
当我们在DELETE A FROM TEST1 A INNER JOIN TEST2 B ON A.ID=B.ID 语句加上WHILE,意味着我们要循环处理数据,恭喜通过了该规则的验证。
以下是客户端调用的代码
1 using module '.\Code Analysis\Rule.psm1' 2 3 $files = Get-ChildItem -Path "E:\BackupE\QueryFile" -Filter "*.sql" -File 4 $rules = [BaseRule]::GetAllRules() 5 $result = [CustomParser]::Analysis($files.FullName, $rules) 6 $result.Where({ $_.ResponseCode -eq [ResponseCode]::Success -and $_.ValidationResults.Where({ -not $_.Validated }).Count -gt 0 }) |` 7 Select-Object -Property FileName, DocumentName -ExpandProperty ValidationResults |` 8 Select-Object -ExpandProperty AnalysisCodeResults -ExcludeProperty Validated , AnalysisCodeResults
自此,整个代码就介绍完了,如果需要代码的话可以到以转到以下地址