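> Abstract: The previous article introduced the concepts and implementation of Lint and LintPass and, building on those structures, sketched a simple way to implement lint checks. This article focuses on how the CombinedLintPass structure is implemented and uses it to further optimize the Lint implementation.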
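Background
In the KusionStack technology stack, the KCL configuration and policy language is one of the key components. To help users write better KCL code, we have also built a number of language tools for KCL, and Lint is one of them. The Lint tool helps users find potential problems and errors in their code, and it can also be used for automated code checks that safeguard the conventions and quality of the code in a repository. Because KCL is implemented in Rust, some of its features draw on Rustc. This article records some thoughts from studying Rustc and shares them here.
The previous article introduced the concepts and implementation of Lint and LintPass and, building on those structures, sketched a simple way to implement lint checks. This article focuses on how the CombinedLintPass structure is implemented and uses it to further optimize the Lint implementation.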
CombinedLintPass
Rustc implements the concrete checking logic of the Lint tool in LintPass, and it calls the check_* methods of each lintpass while traversing the AST with the Visitor pattern.

    // Pseudocode: at each AST node the visitor calls every lintpass, then keeps walking.
    impl ast_visit::Visitor for Linter {
        fn visit_crate(a: ast::Crate) {
            for lintpass in lintpasses {
                lintpass.check_crate(a)
            }
            walk_crate();
        }
        fn visit_stmt(a: ast::Stmt) {
            for lintpass in lintpasses {
                lintpass.check_stmt(a)
            }
            walk_stmt();
        }
        ...
    }

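However, Rustc itself and clippy together define more than 550 lints. For performance reasons, defining a large number of LintPasses and then registering and calling each one separately is clearly not a good fit. Rustc offers a better solution: just as multiple Lints can be organized into one LintPass, multiple LintPasses can be combined into one CombinedLintPass.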
> Compiler lint passes are combined into one pass
>
> Within the compiler, for performance reasons, we usually do not register dozens of lint passes. Instead, we have a single lint pass of each variety (e.g., BuiltinCombinedModuleLateLintPass) which will internally call all of the individual lint passes; this is because then we get the benefits of static over dynamic dispatch for each of the (often empty) trait methods.
>
> Ideally, we’d not have to do this, since it adds to the complexity of understanding the code. However, with the current type-erased lint store approach, it is beneficial to do so for performance reasons.
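The difference between the two dispatch styles mentioned in the quote can be sketched with a toy model. Everything below is hypothetical and much simpler than Rustc: Stmt is a plain string wrapper, the two passes only look at text, and the trait has a single check_stmt method. The point is only to contrast a list of boxed trait objects with one combined struct whose fields are concrete pass types.

```rust
// Toy stand-ins for Rustc's types; every name in this sketch is hypothetical.
struct Stmt(&'static str);

trait LintPass {
    // The default body is empty, like most generated check_* methods.
    fn check_stmt(&mut self, _stmt: &Stmt, _out: &mut Vec<String>) {}
}

struct WhileTrue;
impl LintPass for WhileTrue {
    fn check_stmt(&mut self, stmt: &Stmt, out: &mut Vec<String>) {
        if stmt.0.starts_with("while true") {
            out.push(format!("while_true: `{}`", stmt.0));
        }
    }
}

struct RedundantSemicolons;
impl LintPass for RedundantSemicolons {
    fn check_stmt(&mut self, stmt: &Stmt, out: &mut Vec<String>) {
        if stmt.0.ends_with(";;") {
            out.push(format!("redundant_semicolons: `{}`", stmt.0));
        }
    }
}

// Dynamic dispatch: every call goes through a trait object.
fn run_dynamic(stmts: &[Stmt]) -> Vec<String> {
    let mut passes: Vec<Box<dyn LintPass>> =
        vec![Box::new(WhileTrue), Box::new(RedundantSemicolons)];
    let mut out = Vec::new();
    for stmt in stmts {
        for pass in passes.iter_mut() {
            pass.check_stmt(stmt, &mut out);
        }
    }
    out
}

// Static dispatch: the combined pass holds concrete types, so each inner
// call is resolved at compile time.
struct Combined {
    while_true: WhileTrue,
    redundant_semicolons: RedundantSemicolons,
}

impl LintPass for Combined {
    fn check_stmt(&mut self, stmt: &Stmt, out: &mut Vec<String>) {
        self.while_true.check_stmt(stmt, out);
        self.redundant_semicolons.check_stmt(stmt, out);
    }
}

fn run_combined(stmts: &[Stmt]) -> Vec<String> {
    let mut pass = Combined {
        while_true: WhileTrue,
        redundant_semicolons: RedundantSemicolons,
    };
    let mut out = Vec::new();
    for stmt in stmts {
        pass.check_stmt(stmt, &mut out);
    }
    out
}

fn main() {
    let stmts = [Stmt("while true {}"), Stmt("let x = 1;;"), Stmt("let y = 2;")];
    // Both strategies report the same findings; only the dispatch differs.
    assert_eq!(run_dynamic(&stmts), run_combined(&stmts));
    for finding in run_combined(&stmts) {
        println!("{}", finding);
    }
}
```

Both functions produce the same findings. The combined version simply gives the compiler concrete types to work with, which is the benefit the quote attributes to static dispatch.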
BuiltinCombinedEarlyLintPass
CombinedLintPass is likewise split into early and late variants. Taking the builtin early lints as an example, Rustc defines a BuiltinCombinedEarlyLintPass structure for these lintpasses in rustc_lint/src/lib.rs.
early_lint_passes!(declare_combined_early_pass, [BuiltinCombinedEarlyLintPass]);
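Although this definition looks like a single line, it expands through several macros to gather 14 LintPasses, and each LintPass provides more than 50 check_* methods. The following sections walk through these macros one by one.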
Macro definitions behind BuiltinCombinedEarlyLintPass
early_lint_passes

    macro_rules! early_lint_passes {
        ($macro:path, $args:tt) => {
            $macro!(
                $args,
                [
                    UnusedParens: UnusedParens,
                    UnusedBraces: UnusedBraces,
                    UnusedImportBraces: UnusedImportBraces,
                    UnsafeCode: UnsafeCode,
                    AnonymousParameters: AnonymousParameters,
                    EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
                    NonCamelCaseTypes: NonCamelCaseTypes,
                    DeprecatedAttr: DeprecatedAttr::new(),
                    WhileTrue: WhileTrue,
                    NonAsciiIdents: NonAsciiIdents,
                    HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
                    IncompleteFeatures: IncompleteFeatures,
                    RedundantSemicolons: RedundantSemicolons,
                    UnusedDocComment: UnusedDocComment,
                ]
            );
        };
    }

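First comes the early_lint_passes macro, whose main job is to define all of the early lintpasses. The lintpasses here appear in pairs: the left side of the colon is the lintpass identifier and the right side is its constructor, which is why forms such as EllipsisInclusiveRangePatterns::default() and DeprecatedAttr::new() show up. early_lint_passes passes the early lintpasses it defines, together with its second argument, on to the next macro. Through this macro, the earlier definition of BuiltinCombinedEarlyLintPass expands to: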

    declare_combined_early_pass!([BuiltinCombinedEarlyLintPass], [
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        UnusedImportBraces: UnusedImportBraces,
        UnsafeCode: UnsafeCode,
        AnonymousParameters: AnonymousParameters,
        EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
        NonCamelCaseTypes: NonCamelCaseTypes,
        DeprecatedAttr: DeprecatedAttr::new(),
        WhileTrue: WhileTrue,
        NonAsciiIdents: NonAsciiIdents,
        HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
        IncompleteFeatures: IncompleteFeatures,
        RedundantSemicolons: RedundantSemicolons,
        UnusedDocComment: UnusedDocComment,
    ])

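The hand-off from a list-owning macro to a callback macro can be tried in isolation. The sketch below is a toy, not Rustc code: toy_lint_passes, describe_passes and pass_table are hypothetical names, the pass list is shortened to two entries, and the callback only records each identifier and constructor as strings instead of building a struct. The pass types themselves do not need to exist here, because the tokens are only turned into strings.

```rust
// Toy counterpart of early_lint_passes: it owns the list of
// `Identifier: constructor` pairs and forwards it, after the caller's own
// arguments, to whichever macro is named in `$macro`.
macro_rules! toy_lint_passes {
    ($macro:path, $args:tt) => {
        $macro!(
            $args,
            [
                UnusedParens: UnusedParens,
                DeprecatedAttr: DeprecatedAttr::new(),
            ]
        );
    };
}

// Hypothetical callback: receives the caller's arguments first, then the
// shared list, and generates a function describing every pair.
macro_rules! describe_passes {
    ([$fn_name:ident], [$($pass:ident: $constructor:expr,)*]) => {
        fn $fn_name() -> Vec<(&'static str, &'static str)> {
            vec![$((stringify!($pass), stringify!($constructor))),*]
        }
    };
}

// Expands to describe_passes!([pass_table], [UnusedParens: UnusedParens, ...]);
toy_lint_passes!(describe_passes, [pass_table]);

fn main() {
    for (ident, constructor) in pass_table() {
        println!("{} is built by `{}`", ident, constructor);
    }
}
```

Keeping the list in one macro and passing the consumer as a parameter means the same list can feed several different generators without being repeated.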
declare_combined_early_pass

    macro_rules! declare_combined_early_pass {
        ([$name:ident], $passes:tt) => (
            early_lint_methods!(declare_combined_early_lint_pass, [pub $name, $passes]);
        )
    }

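The declare_combined_early_pass macro receives the name (BuiltinCombinedEarlyLintPass) and the passes from the early_lint_passes macro and forwards them to the early_lint_methods macro. Through this macro, the definition of BuiltinCombinedEarlyLintPass expands further to: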

    early_lint_methods!(declare_combined_early_lint_pass, [pub BuiltinCombinedEarlyLintPass, [
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        UnusedImportBraces: UnusedImportBraces,
        UnsafeCode: UnsafeCode,
        AnonymousParameters: AnonymousParameters,
        EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
        NonCamelCaseTypes: NonCamelCaseTypes,
        DeprecatedAttr: DeprecatedAttr::new(),
        WhileTrue: WhileTrue,
        NonAsciiIdents: NonAsciiIdents,
        HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
        IncompleteFeatures: IncompleteFeatures,
        RedundantSemicolons: RedundantSemicolons,
        UnusedDocComment: UnusedDocComment,
    ]]);

early_lint_methods

    macro_rules! early_lint_methods {
        ($macro:path, $args:tt) => (
            $macro!($args, [
                fn check_param(a: &ast::Param);
                fn check_ident(a: &ast::Ident);
                fn check_crate(a: &ast::Crate);
                fn check_crate_post(a: &ast::Crate);
                ...
            ]);
        )
    }

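The early_lint_methods macro was also covered in the previous article. It defines the check_* functions that an EarlyLintPass needs to implement, and it passes these functions, along with the $args it received, to the next macro. Because BuiltinCombinedEarlyLintPass is itself a kind of early lint pass, it has to implement these functions too. Through this macro, the definition of BuiltinCombinedEarlyLintPass expands further to: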

    declare_combined_early_lint_pass!(
        [pub BuiltinCombinedEarlyLintPass, [
            UnusedParens: UnusedParens,
            UnusedBraces: UnusedBraces,
            UnusedImportBraces: UnusedImportBraces,
            UnsafeCode: UnsafeCode,
            AnonymousParameters: AnonymousParameters,
            EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
            NonCamelCaseTypes: NonCamelCaseTypes,
            DeprecatedAttr: DeprecatedAttr::new(),
            WhileTrue: WhileTrue,
            NonAsciiIdents: NonAsciiIdents,
            HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
            IncompleteFeatures: IncompleteFeatures,
            RedundantSemicolons: RedundantSemicolons,
            UnusedDocComment: UnusedDocComment,
        ]],
        [
            fn check_param(a: &ast::Param);
            fn check_ident(a: &ast::Ident);
            fn check_crate(a: &ast::Crate);
            fn check_crate_post(a: &ast::Crate);
            ...
        ]
    )

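To see how a method list travelling as tokens can be consumed, here is a small sketch under simplifying assumptions. toy_lint_methods, declare_toy_lint_trait, ToyLintPass, Param, Ident and CountIdents are all hypothetical names, the context parameter is left out, and the callback generates a trait with empty default methods rather than a combined pass.

```rust
// Hypothetical toy types standing in for Rustc's AST nodes.
pub struct Param;
pub struct Ident;

// Toy counterpart of early_lint_methods: it owns the method list and
// forwards it, after the caller's own arguments, to the callback macro.
macro_rules! toy_lint_methods {
    ($macro:path, $args:tt) => {
        $macro!($args, [
            fn check_param(a: &Param);
            fn check_ident(a: &Ident);
        ]);
    };
}

// Callback that turns each `fn name(param: Type);` entry into a trait
// method with an empty default body.
macro_rules! declare_toy_lint_trait {
    ([$trait_name:ident], [$(fn $name:ident($($param:ident: $arg:ty),*);)*]) => {
        pub trait $trait_name {
            $(fn $name(&mut self, $(_: $arg),*) {})*
        }
    };
}

toy_lint_methods!(declare_toy_lint_trait, [ToyLintPass]);

// A pass only overrides the checks it cares about; the rest stay empty.
struct CountIdents(usize);
impl ToyLintPass for CountIdents {
    fn check_ident(&mut self, _ident: &Ident) {
        self.0 += 1;
    }
}

fn count_idents(idents: &[Ident]) -> usize {
    let mut pass = CountIdents(0);
    for ident in idents {
        pass.check_param(&Param); // empty default, does nothing
        pass.check_ident(ident);
    }
    pass.0
}

fn main() {
    println!("identifiers seen: {}", count_idents(&[Ident, Ident, Ident]));
}
```

Because the same list can be handed to different callbacks, one callback can declare the trait while another, as in this article, implements every method on a combined pass.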
declare_combined_early_lint_pass

    macro_rules! declare_combined_early_lint_pass {
        ([$v:vis $name:ident, [$($passes:ident: $constructor:expr,)*]], $methods:tt) => (
            #[allow(non_snake_case)]
            $v struct $name {
                $($passes: $passes,)*
            }

            impl $name {
                $v fn new() -> Self {
                    Self {
                        $($passes: $constructor,)*
                    }
                }

                $v fn get_lints() -> LintArray {
                    let mut lints = Vec::new();
                    $(lints.extend_from_slice(&$passes::get_lints());)*
                    lints
                }
            }

            impl EarlyLintPass for $name {
                expand_combined_early_lint_pass_methods!([$($passes),*], $methods);
            }

            #[allow(rustc::lint_pass_impl_without_macro)]
            impl LintPass for $name {
                fn name(&self) -> &'static str {
                    panic!()
                }
            }
        )
    }

The declare_combined_early_lint_pass macro is the core that generates BuiltinCombinedEarlyLintPass. It does the following:
- Generates a struct named BuiltinCombinedEarlyLintPass whose fields are the lintpass identifiers supplied by the early_lint_passes macro.
- Implements the fn new(), fn name() and fn get_lints() methods, where new() calls the lintpass constructors supplied by early_lint_passes.
- Invokes the expand_combined_early_lint_pass_methods macro to implement its own check_* methods.
Through this macro, the definition of BuiltinCombinedEarlyLintPass becomes:

    pub struct BuiltinCombinedEarlyLintPass {
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        UnusedImportBraces: UnusedImportBraces,
        UnsafeCode: UnsafeCode,
        AnonymousParameters: AnonymousParameters,
        EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns,
        NonCamelCaseTypes: NonCamelCaseTypes,
        DeprecatedAttr: DeprecatedAttr,
        WhileTrue: WhileTrue,
        NonAsciiIdents: NonAsciiIdents,
        HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
        IncompleteFeatures: IncompleteFeatures,
        RedundantSemicolons: RedundantSemicolons,
        UnusedDocComment: UnusedDocComment,
    }

    impl BuiltinCombinedEarlyLintPass {
        pub fn new() -> Self {
            Self {
                UnusedParens: UnusedParens,
                UnusedBraces: UnusedBraces,
                UnusedImportBraces: UnusedImportBraces,
                UnsafeCode: UnsafeCode,
                AnonymousParameters: AnonymousParameters,
                EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
                NonCamelCaseTypes: NonCamelCaseTypes,
                DeprecatedAttr: DeprecatedAttr::new(),
                WhileTrue: WhileTrue,
                NonAsciiIdents: NonAsciiIdents,
                HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
                IncompleteFeatures: IncompleteFeatures,
                RedundantSemicolons: RedundantSemicolons,
                UnusedDocComment: UnusedDocComment,
            }
        }

        pub fn get_lints() -> LintArray {
            let mut lints = Vec::new();
            lints.extend_from_slice(&UnusedParens::get_lints());
            lints.extend_from_slice(&UnusedBraces::get_lints());
            lints.extend_from_slice(&UnusedImportBraces::get_lints());
            lints.extend_from_slice(&UnsafeCode::get_lints());
            lints.extend_from_slice(&AnonymousParameters::get_lints());
            lints.extend_from_slice(&EllipsisInclusiveRangePatterns::get_lints());
            lints.extend_from_slice(&NonCamelCaseTypes::get_lints());
            lints.extend_from_slice(&DeprecatedAttr::get_lints());
            lints.extend_from_slice(&WhileTrue::get_lints());
            lints.extend_from_slice(&NonAsciiIdents::get_lints());
            lints.extend_from_slice(&HiddenUnicodeCodepoints::get_lints());
            lints.extend_from_slice(&IncompleteFeatures::get_lints());
            lints.extend_from_slice(&RedundantSemicolons::get_lints());
            lints.extend_from_slice(&UnusedDocComment::get_lints());
            lints
        }
    }

    impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
        // Still unexpanded at this stage; the metavariables are already substituted.
        expand_combined_early_lint_pass_methods!(
            [UnusedParens, UnusedBraces, UnusedImportBraces, UnsafeCode, AnonymousParameters,
             EllipsisInclusiveRangePatterns, NonCamelCaseTypes, DeprecatedAttr, WhileTrue,
             NonAsciiIdents, HiddenUnicodeCodepoints, IncompleteFeatures, RedundantSemicolons,
             UnusedDocComment],
            [
                fn check_param(a: &ast::Param);
                fn check_ident(a: &ast::Ident);
                fn check_crate(a: &ast::Crate);
                fn check_crate_post(a: &ast::Crate);
                ...
            ]
        );
    }

    #[allow(rustc::lint_pass_impl_without_macro)]
    impl LintPass for BuiltinCombinedEarlyLintPass {
        fn name(&self) -> &'static str {
            panic!()
        }
    }

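The struct-generating part of this step can be reproduced at a small scale. The following is a sketch under assumptions, not Rustc's macro: declare_toy_combined_pass, ToyCombinedPass and EllipsisPatterns are hypothetical names, only two toy passes are combined, get_lints returns plain lint names in a Vec instead of a LintArray, and the check_* methods are left out.

```rust
// Hypothetical toy passes; `get_lints` returns the lint names each pass owns.
struct WhileTrue;
impl WhileTrue {
    fn get_lints() -> Vec<&'static str> {
        vec!["while_true"]
    }
}

#[derive(Default)]
struct EllipsisPatterns {
    seen: u32,
}
impl EllipsisPatterns {
    fn get_lints() -> Vec<&'static str> {
        vec!["ellipsis_inclusive_range_patterns"]
    }
}

// Toy version of the struct-generating macro: each `Identifier: constructor`
// pair becomes a field, an initializer in new(), and a get_lints() call.
macro_rules! declare_toy_combined_pass {
    ([$v:vis $name:ident, [$($passes:ident: $constructor:expr,)*]]) => {
        #[allow(non_snake_case, dead_code)]
        $v struct $name {
            $($passes: $passes,)*
        }

        impl $name {
            $v fn new() -> Self {
                Self {
                    $($passes: $constructor,)*
                }
            }

            $v fn get_lints() -> Vec<&'static str> {
                let mut lints = Vec::new();
                $(lints.extend_from_slice(&$passes::get_lints());)*
                lints
            }
        }
    };
}

declare_toy_combined_pass!([pub ToyCombinedPass, [
    WhileTrue: WhileTrue,
    EllipsisPatterns: EllipsisPatterns::default(),
]]);

fn main() {
    let pass = ToyCombinedPass::new();
    // Each field is named after its pass type, hence the unusual casing.
    println!("initial counter: {}", pass.EllipsisPatterns.seen);
    println!("lints: {:?}", ToyCombinedPass::get_lints());
}
```

Using the pass identifier as both field name and field type is what lets a single list drive the struct definition, the constructor and the lint collection at once.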
expand_combined_early_lint_pass_methods

    macro_rules! expand_combined_early_lint_pass_methods {
        ($passes:tt, [$($(#[$attr:meta])* fn $name:ident($($param:ident: $arg:ty),*);)*]) => (
            $(fn $name(&mut self, context: &EarlyContext<'_>, $($param: $arg),*) {
                expand_combined_early_lint_pass_method!($passes, self, $name, (context, $($param),*));
            })*
        )
    }

The expand_combined_early_lint_pass_methods macro expands, inside BuiltinCombinedEarlyLintPass, all of the methods defined in early_lint_methods. Through this macro, the definition of BuiltinCombinedEarlyLintPass becomes (other definitions omitted):

    impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
        // The inner macro is still unexpanded here; its arguments are already substituted.
        fn check_param(&mut self, context: &EarlyContext<'_>, a: &ast::Param) {
            expand_combined_early_lint_pass_method!(
                [UnusedParens, UnusedBraces, UnusedImportBraces, ...], self, check_param, (context, a));
        }
        fn check_ident(&mut self, context: &EarlyContext<'_>, a: &ast::Ident) {
            expand_combined_early_lint_pass_method!(
                [UnusedParens, UnusedBraces, UnusedImportBraces, ...], self, check_ident, (context, a));
        }
        fn check_crate(&mut self, context: &EarlyContext<'_>, a: &ast::Crate) {
            expand_combined_early_lint_pass_method!(
                [UnusedParens, UnusedBraces, UnusedImportBraces, ...], self, check_crate, (context, a));
        }
        ...
    }

expand_combined_early_lint_pass_method

    macro_rules! expand_combined_early_lint_pass_method {
        ([$($passes:ident),*], $self: ident, $name: ident, $params:tt) => ({
            $($self.$passes.$name $params;)*
        })
    }

expand_combined_early_lint_pass_method calls the check_* method of every LintPass inside each expanded check_* function. Through this macro, the definition of BuiltinCombinedEarlyLintPass becomes (other definitions omitted):

    impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
        fn check_param(&mut self, context: &EarlyContext<'_>, a: &ast::Param) {
            self.UnusedParens.check_param(context, a);
            self.UnusedBraces.check_param(context, a);
            self.UnusedImportBraces.check_param(context, a);
            ...
        }
        fn check_ident(&mut self, context: &EarlyContext<'_>, a: &ast::Ident) {
            self.UnusedParens.check_ident(context, a);
            self.UnusedBraces.check_ident(context, a);
            self.UnusedImportBraces.check_ident(context, a);
            ...
        }
        fn check_crate(&mut self, context: &EarlyContext<'_>, a: &ast::Crate) {
            self.UnusedParens.check_crate(context, a);
            self.UnusedBraces.check_crate(context, a);
            self.UnusedImportBraces.check_crate(context, a);
            ...
        }
        ...
    }

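The two-level method expansion can be exercised with a compact toy. This sketch rests on several assumptions: ToyLintPass, ToyCombinedPass, expand_toy_methods, expand_toy_method and the Context alias are hypothetical, the context is a mutable list of messages rather than an EarlyContext, and only two passes and two methods are involved. One likely reason for splitting the work across two macros is that macro_rules cannot pair two independent repetitions (methods and passes) in one transcriber, so the pass list is carried as a single token tree and only unpacked by the inner macro.

```rust
// Hypothetical toy types; none of these are Rustc's real definitions.
pub struct Ident(pub &'static str);
pub struct Param(pub &'static str);
type Context = Vec<String>; // collects diagnostics in this sketch

trait ToyLintPass {
    fn check_param(&mut self, _cx: &mut Context, _a: &Param) {}
    fn check_ident(&mut self, _cx: &mut Context, _a: &Ident) {}
}

struct NonAsciiIdents;
impl ToyLintPass for NonAsciiIdents {
    fn check_ident(&mut self, cx: &mut Context, a: &Ident) {
        if !a.0.is_ascii() {
            cx.push(format!("non-ASCII identifier `{}`", a.0));
        }
    }
}

struct AnonymousParameters;
impl ToyLintPass for AnonymousParameters {
    fn check_param(&mut self, cx: &mut Context, a: &Param) {
        if a.0 == "_" {
            cx.push("anonymous parameter".to_string());
        }
    }
}

#[allow(non_snake_case)]
struct ToyCombinedPass {
    NonAsciiIdents: NonAsciiIdents,
    AnonymousParameters: AnonymousParameters,
}

// Inner macro: for one method, call it on every pass.
macro_rules! expand_toy_method {
    ([$($passes:ident),*], $self:ident, $name:ident, $params:tt) => ({
        $($self.$passes.$name $params;)*
    })
}

// Outer macro: for every method, emit a body that defers to the inner macro.
// The pass list stays an opaque token tree here.
macro_rules! expand_toy_methods {
    ($passes:tt, [$(fn $name:ident($($param:ident: $arg:ty),*);)*]) => (
        $(fn $name(&mut self, context: &mut Context, $($param: $arg),*) {
            expand_toy_method!($passes, self, $name, (context, $($param),*));
        })*
    )
}

impl ToyLintPass for ToyCombinedPass {
    expand_toy_methods!(
        [NonAsciiIdents, AnonymousParameters],
        [
            fn check_param(a: &Param);
            fn check_ident(a: &Ident);
        ]
    );
}

fn run() -> Vec<String> {
    let mut pass = ToyCombinedPass {
        NonAsciiIdents: NonAsciiIdents,
        AnonymousParameters: AnonymousParameters,
    };
    let mut cx = Context::new();
    pass.check_ident(&mut cx, &Ident("naïve"));
    pass.check_ident(&mut cx, &Ident("plain"));
    pass.check_param(&mut cx, &Param("_"));
    cx
}

fn main() {
    for message in run() {
        println!("{}", message);
    }
}
```

Each generated method calls every pass, and the passes that did not override a check run the empty default.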
Final definition of BuiltinCombinedEarlyLintPass
After all of the macros above have been expanded, the definition of BuiltinCombinedEarlyLintPass effectively takes the following form:

    pub struct BuiltinCombinedEarlyLintPass {
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        ...
    }

    impl BuiltinCombinedEarlyLintPass {
        pub fn new() -> Self {
            Self {
                UnusedParens: UnusedParens,
                UnusedBraces: UnusedBraces,
                ...
            }
        }

        pub fn get_lints() -> LintArray {
            let mut lints = Vec::new();
            lints.extend_from_slice(&UnusedParens::get_lints());
            lints.extend_from_slice(&UnusedBraces::get_lints());
            ...
            lints
        }
    }

    impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
        fn check_crate(&mut self, context: &EarlyContext<'_>, a: &ast::Crate) {
            self.UnusedParens.check_crate(context, a);
            self.UnusedBraces.check_crate(context, a);
            ...
        }
        fn check_ident(&mut self, context: &EarlyContext<'_>, a: &ast::Ident) {
            self.UnusedParens.check_ident(context, a);
            self.UnusedBraces.check_ident(context, a);
            ...
        }
        ...
    }

With this definition, the check_* methods of BuiltinCombinedEarlyLintPass can be used during AST traversal to run the checks of many lintpasses at once.
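Further optimization of Lint
Based on CombinedLintPass, the Linter design proposed in the previous article can be optimized further.
Here, the check_* methods of the CombinedLintPass can be used to run the corresponding checks while the Visitor traverses the AST. The result is the same as before, but thanks to the macros, all of the check_* methods and all of the lintpasses that need to run are gathered into a single structure, which is easier to manage. Likewise, because CombinedLintPass actually calls each lintpass's own check method, the calls may look as complicated as in the figure below, but since most of the check methods defined in a lintpass are empty checks generated by macros, this does not cost performance.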
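A Linter built around a single combined pass might look roughly like the sketch below. It is a toy model under stated assumptions: Crate, Stmt, ToyLintPass, CountStmts, FindTodo, Combined and Linter are hypothetical names, the AST has only two statement kinds, and the combined pass is written by hand instead of being generated by macros.

```rust
// Hypothetical toy AST; Rustc's real AST and visitor are far richer.
struct Crate {
    stmts: Vec<Stmt>,
}

enum Stmt {
    Expr(String),
    Block(Vec<Stmt>),
}

trait ToyLintPass {
    // Empty defaults: a pass only overrides the checks it needs.
    fn check_crate(&mut self, _krate: &Crate) {}
    fn check_stmt(&mut self, _stmt: &Stmt) {}
}

struct CountStmts {
    count: usize,
}
impl ToyLintPass for CountStmts {
    fn check_stmt(&mut self, _stmt: &Stmt) {
        self.count += 1;
    }
}

struct FindTodo {
    hits: Vec<String>,
}
impl ToyLintPass for FindTodo {
    fn check_stmt(&mut self, stmt: &Stmt) {
        if let Stmt::Expr(src) = stmt {
            if src.contains("todo!") {
                self.hits.push(src.clone());
            }
        }
    }
}

// Hand-written stand-in for a macro-generated combined pass.
struct Combined {
    count_stmts: CountStmts,
    find_todo: FindTodo,
}
impl ToyLintPass for Combined {
    fn check_crate(&mut self, krate: &Crate) {
        self.count_stmts.check_crate(krate);
        self.find_todo.check_crate(krate);
    }
    fn check_stmt(&mut self, stmt: &Stmt) {
        self.count_stmts.check_stmt(stmt);
        self.find_todo.check_stmt(stmt);
    }
}

// The linter holds exactly one pass and calls it at every visited node.
struct Linter<P: ToyLintPass> {
    pass: P,
}

impl<P: ToyLintPass> Linter<P> {
    fn visit_crate(&mut self, krate: &Crate) {
        self.pass.check_crate(krate);
        for stmt in &krate.stmts {
            self.visit_stmt(stmt);
        }
    }

    fn visit_stmt(&mut self, stmt: &Stmt) {
        self.pass.check_stmt(stmt);
        if let Stmt::Block(inner) = stmt {
            for nested in inner {
                self.visit_stmt(nested);
            }
        }
    }
}

fn sample_crate() -> Crate {
    Crate {
        stmts: vec![
            Stmt::Expr("let x = 1".to_string()),
            Stmt::Block(vec![
                Stmt::Expr("todo!()".to_string()),
                Stmt::Expr("x + 1".to_string()),
            ]),
        ],
    }
}

fn lint(krate: &Crate) -> (usize, Vec<String>) {
    let mut linter = Linter {
        pass: Combined {
            count_stmts: CountStmts { count: 0 },
            find_todo: FindTodo { hits: Vec::new() },
        },
    };
    linter.visit_crate(krate);
    (linter.pass.count_stmts.count, linter.pass.find_todo.hits)
}

fn main() {
    let (count, hits) = lint(&sample_crate());
    println!("{} statements visited, todo hits: {:?}", count, hits);
}
```

The visitor no longer loops over a list of passes. It makes one call per node, and the combined pass fans that call out to its members.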
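Summary
This article briefly introduced the definition and implementation of the CombinedLintPass structure in the Rustc source code and used it to further optimize the design of the Linter. We hope it helps with understanding Rustc and Lint; corrections are welcome. Later articles will continue with how Lints are registered and executed during compilation in Rustc, so stay tuned.
References
- Rust source code analysis (e-book): https://github.com/awesome-kusion/rust-code-book
- KCL configuration and policy language: https://github.com/KusionStack/KCLVM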
- KusionStack: https://github.com/KusionStack
- Rustc: https://github.com/rust-lang/rust
- rustc-dev-guide: https://rustc-dev-guide.rust-lang.org/
- Rust Visitor: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/visit/index.html
- Rust Clippy: https://github.com/rust-lang/rust-clippy