A Brief Look at Tree Shaking in Front-End Development - Rollup VS Webpack

April 30, 2021

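Tree shaking is a dead code elimination technique. It is typically applied when code written in languages such as Dart, JavaScript, or TypeScript is bundled into a single file: unused code is removed to make the output smaller.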

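Implementing dead code elimination in a dynamic language is much harder than in a static one. The idea of a "treeshaker" originated in LISP in the 1990s. Its core is to represent every execution flow the program can possibly take as a tree of function calls, so that functions that are never called can be removed.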
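A minimal sketch of that idea, assuming a hypothetical call graph written as a plain object (the function names and the `shake` helper are made up for illustration): start at the entry point, follow every call edge, and keep only what was reached.

```javascript
// Hypothetical call graph: each function name maps to the functions it calls.
const callGraph = {
  main: ["render", "formatDate"],
  render: ["escapeHtml"],
  formatDate: [],
  escapeHtml: [],
  legacyExport: ["formatDate"], // never reached from main
  debugDump: ["legacyExport"], // never reached from main
};

// Walk the graph from the entry point and collect every reachable function.
function shake(graph, entry) {
  const reachable = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const name = stack.pop();
    if (reachable.has(name)) continue;
    reachable.add(name);
    for (const callee of graph[name] || []) stack.push(callee);
  }
  return reachable;
}

const kept = shake(callGraph, "main");
const removed = Object.keys(callGraph).filter((name) => !kept.has(name));
console.log("kept:", [...kept].sort());
console.log("removed:", removed);
```

Everything outside the reachable set is a candidate for removal. A real tool also has to account for side effects, which this sketch ignores.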

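The algorithm was applied to JavaScript in Google's Closure Compiler and later to Dart in the dart2js compiler, which Bob Nystrom of Google presented in 2012. When Dart code is compiled to JavaScript, the compiler performs tree shaking. In JavaScript, even if you use only one function from a library, you often have to pull the whole library into the project, so the output contains a lot of code that is never used and is much larger than necessary. Tree shaking lets the output contain only the functions you actually need.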

In 2015 Rollup introduced tree shaking and brought the concept into the front-end community in earnest. Webpack 2 followed with its own implementation, which was reworked and improved in Webpack 4.

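The Basic Principle of Tree Shaking
In compiler theory, dead code elimination is a compiler optimization that removes code which has no effect on the program's result. Removing such code shrinks the program and avoids executing irrelevant operations at run time. Unreachable code, code whose result is never used, and code that only affects dead variables (written but never read) all count as dead code.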
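The three kinds of dead code can be shown in one small function. This is only an illustration: `priceWithTax` and `priceWithTaxOptimized` are hypothetical names, and the second function is what an optimizer could in principle reduce the first one to.

```javascript
function priceWithTax(price) {
  let audit = 0; // dead variable: written below but never read
  audit = price * 2; // only affects the dead variable

  Math.sqrt(price); // result is never used and the call has no side effects

  return price * 1.2;
  console.log("done"); // unreachable: it comes after the return
}

// The same observable behaviour with all dead code removed.
function priceWithTaxOptimized(price) {
  return price * 1.2;
}

console.log(priceWithTax(100) === priceWithTaxOptimized(100));
```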

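For a dynamic language like JavaScript, tree shaking used to be relatively hard to implement. Why was Rollup able to bring the technique to the front end? Because of ES6 modules. Earlier specifications such as CommonJS and AMD are dynamic: what a module imports and exports can change at run time. A module is also just an object, so whatever it exports can be accessed as a property, for example: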

const a = require(`./${file}.js`) // the module to load can be computed at run time
const { stat, exists, readFile } = require('fs') // destructure the exported object
var my_lib;
if (Math.random()) {
  my_lib = require('foo');
} else {
  my_lib = require('bar');
}

if (Math.random()) {
  exports.baz = ···;
}
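ES6 modules are different: their structure is static. Imports and exports must be determined at compile time and cannot change at run time. This static structure is exactly what tree shaking needs. In traditional compiled languages, the compiler removes dead code from the AST (abstract syntax tree); in JavaScript, both Rollup and Webpack can do this job.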
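To make "static" concrete, here is a rough sketch that reads the named imports of a module from its source text alone, without executing anything. The `source` string and the `findStaticImports` helper are hypothetical, and the regular expression is a simplification: real bundlers use a full parser (Rollup uses acorn).

```javascript
// Source text of a hypothetical module; it is only read as text, never executed.
const source = `
import { cube } from './math.js';
import { join, chunk as split } from './utils.js';
const plugin = require('./plugins/' + name + '.js');
`;

// Collect the imported names of every static "import { ... } from" declaration.
function findStaticImports(code) {
  const importPattern = /import\s*\{([^}]*)\}\s*from\s*['"]([^'"]+)['"]/g;
  const result = {};
  for (const match of code.matchAll(importPattern)) {
    const names = match[1]
      .split(",")
      .map((part) => part.trim().split(/\s+as\s+/)[0]) // keep the exported name, drop the alias
      .filter(Boolean);
    result[match[2]] = names;
  }
  return result;
}

const found = findStaticImports(source);
console.log(found);
```

The two import declarations are fully resolved from the text. The `require` call is not, because its argument depends on a value that only exists at run time.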

Rollup
Tree shaking is enabled by default in Rollup: the treeshake option defaults to true. It also accepts an object with further options that can be configured as needed.

// src/rollup/types.d.ts
export interface NormalizedInputOptions {
acorn: Object;
acornInjectPlugins: Function[];
cache: false | undefined | RollupCache;
context: string;
experimentalCacheExpiry: number;
external: IsExternal;
/** @deprecated Use the "inlineDynamicImports" output option instead. */
inlineDynamicImports: boolean | undefined;
input: string[] | { [entryAlias: string]: string };
/** @deprecated Use the "manualChunks" output option instead. */
manualChunks: ManualChunksOption | undefined;
moduleContext: (id: string) => string;
onwarn: WarningHandler;
perf: boolean;
plugins: Plugin[];
preserveEntrySignatures: PreserveEntrySignaturesOption;
/** @deprecated Use the "preserveModules" output option instead. */
preserveModules: boolean | undefined;
preserveSymlinks: boolean;
shimMissingExports: boolean;
strictDeprecations: boolean;
treeshake: false | NormalizedTreeshakingOptions;
}

export interface NormalizedTreeshakingOptions {
annotations: boolean;
moduleSideEffects: HasModuleSideEffects;
propertyReadSideEffects: boolean;
tryCatchDeoptimization: boolean;
unknownGlobalSideEffects: boolean;
}
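As a sketch of how these options might be set in a config file: the option names are the ones from NormalizedTreeshakingOptions above, while the CommonJS form, the input and output paths, and the chosen values are assumptions made for illustration.

```javascript
// rollup.config.js in CommonJS form; the paths below are hypothetical.
const config = {
  input: "src/index.js",
  output: { file: "dist/bundle.js", format: "es" },
  treeshake: {
    annotations: true, // respect /*#__PURE__*/ style annotations
    moduleSideEffects: false, // treat modules nothing is imported from as side-effect free
    propertyReadSideEffects: true, // assume reading a property may have side effects
    tryCatchDeoptimization: true, // be conservative inside try blocks
    unknownGlobalSideEffects: true, // assume touching an unknown global may throw
  },
};

module.exports = config;
```

Setting `treeshake: false` instead turns the feature off entirely, which matches the `false | NormalizedTreeshakingOptions` type shown above.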
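The treeshake option mainly affects two places:

During startup, when Graph runs its build method, the corresponding modules are filtered out and an AST context is created for the remaining ones
During compilation, the modules returned by Module's getDependenciesToBeIncluded method are used for the subsequent chunks
Rollup's src/Graph.ts contains an includeStatements method.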

src/Graph.ts

export default class Graph {
...
async build(): Promise<void> {

timeStart('generate module graph', 2);
await this.generateModuleGraph();
timeEnd('generate module graph', 2);

timeStart('sort modules', 2);
this.phase = BuildPhase.ANALYSE;
this.sortModules();
timeEnd('sort modules', 2);

timeStart('mark included statements', 2);
this.includeStatements();
timeEnd('mark included statements', 2);

this.phase = BuildPhase.GENERATE;

}
...
private includeStatements() {

for (const module of [...this.entryModules, ...this.implicitEntryModules]) {
  if (module.preserveSignature !== false) {
    module.includeAllExports();
  } else {
    markModuleAndImpureDependenciesAsExecuted(module);
  }
}
if (this.options.treeshake) {
  let treeshakingPass = 1;
  do {
    timeStart(`treeshaking pass ${treeshakingPass}`, 3);
    this.needsTreeshakingPass = false;
    for (const module of this.modules) {
      if (module.isExecuted) module.include();
    }
    timeEnd(`treeshaking pass ${treeshakingPass++}`, 3);
  } while (this.needsTreeshakingPass);
} else {
  for (const module of this.modules) module.includeAllInBundle();
}
for (const externalModule of this.externalModules) externalModule.warnUnusedImports();
for (const module of this.implicitEntryModules) {
  for (const dependant of module.implicitlyLoadedAfter) {
    if (!(dependant.isEntryPoint || dependant.isIncluded())) {
      error(errImplicitDependantIsNotIncluded(dependant));
    }
  }
}

}
...
}
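Inside the if/else block, module.include() and module.includeAllInBundle() do something simple: include() adds the module's AST only when shouldBeIncluded returns true, while includeAllInBundle() includes the whole AST unconditionally: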

export default class Module {
...
includeAllInBundle() {
this.ast.include(createInclusionContext(), true);
}
...
include(): void {
const context = createInclusionContext();
if (this.ast.shouldBeIncluded(context)) this.ast.include(context, false);
}
}
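The repeated passes in includeStatements (the loop on needsTreeshakingPass) amount to a fixed-point iteration: including one statement can make others necessary, so the loop runs until a pass adds nothing new. Below is a toy model of that loop, not Rollup's implementation; the `statements` array and its `declares`, `uses`, and `hasEffects` fields are hypothetical.

```javascript
// Toy module: each "statement" declares one name and references others.
// hasEffects marks statements that must run regardless, such as a top-level call.
const statements = [
  { declares: "cube", uses: ["multiply"], hasEffects: false, included: false },
  { declares: "square", uses: ["multiply"], hasEffects: false, included: false },
  { declares: "multiply", uses: [], hasEffects: false, included: false },
  { declares: null, uses: ["cube"], hasEffects: true, included: false }, // e.g. console.log(cube(3))
];

function includeStatements(stmts) {
  const needed = new Set();
  let needsAnotherPass;
  do {
    needsAnotherPass = false;
    for (const stmt of stmts) {
      if (stmt.included) continue;
      if (stmt.hasEffects || needed.has(stmt.declares)) {
        stmt.included = true;
        for (const name of stmt.uses) needed.add(name);
        needsAnotherPass = true; // newly needed names may unlock earlier statements
      }
    }
  } while (needsAnotherPass);
}

includeStatements(statements);
const keptNames = statements.filter((s) => s.included).map((s) => s.declares);
console.log(keptNames);
```

`square` is never included because nothing with an effect depends on it, directly or indirectly.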
Rollup's src/Module.ts has a getDependenciesToBeIncluded method, which returns the modules that are needed later during code splitting.

src/Module.ts

export default class Module {
...
getDependenciesToBeIncluded(): Set<Module | ExternalModule> {

if (this.relevantDependencies) return this.relevantDependencies;
const relevantDependencies = new Set<Module | ExternalModule>();
const additionalSideEffectModules = new Set<Module>();
const possibleDependencies = new Set(this.dependencies);
let dependencyVariables = this.imports;
if (this.isEntryPoint || this.includedDynamicImporters.length > 0 || this.namespace.included) {
  dependencyVariables = new Set(dependencyVariables);
  for (const exportName of [...this.getReexports(), ...this.getExports()]) {
    dependencyVariables.add(this.getVariableForExportName(exportName));
  }
}
for (let variable of dependencyVariables) {
  if (variable instanceof SyntheticNamedExportVariable) {
    variable = variable.getBaseVariable();
  } else if (variable instanceof ExportDefaultVariable) {
    const { modules, original } = variable.getOriginalVariableAndDeclarationModules();
    variable = original;
    for (const module of modules) {
      additionalSideEffectModules.add(module);
      possibleDependencies.add(module);
    }
  }
  relevantDependencies.add(variable.module!);
}
if (this.options.treeshake) {
  for (const dependency of possibleDependencies) {
    if (
      !(
        dependency.moduleSideEffects || additionalSideEffectModules.has(dependency as Module)
      ) ||
      relevantDependencies.has(dependency)
    ) {
      continue;
    }
    if (dependency instanceof ExternalModule || dependency.hasEffects()) {
      relevantDependencies.add(dependency);
    } else {
      for (const transitiveDependency of dependency.dependencies) {
        possibleDependencies.add(transitiveDependency);
      }
    }
  }
} else {
  for (const dependency of this.dependencies) {
    relevantDependencies.add(dependency);
  }
}
return (this.relevantDependencies = relevantDependencies);

}
}
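The tree-shaking branch of this method can be reduced to a simplified model: dependencies whose imports are actually used are always kept, and any other dependency is kept only if it is flagged as possibly having side effects and analysis really finds effects in it. The sketch below follows that reading and leaves out external modules and default-export handling; the `modules` registry, the `entry` record, and the file names are hypothetical.

```javascript
// Hypothetical module records.
const modules = {
  "polyfill.js": { dependencies: [], moduleSideEffects: true, hasEffects: true },
  "math.js": { dependencies: [], moduleSideEffects: true, hasEffects: false },
  "constants.js": { dependencies: ["polyfill.js"], moduleSideEffects: true, hasEffects: false },
  "pure-lib.js": { dependencies: [], moduleSideEffects: false, hasEffects: true },
};

const entry = {
  dependencies: ["math.js", "constants.js", "pure-lib.js"],
  usedImportsFrom: ["math.js"], // modules whose bindings the entry actually uses
};

function dependenciesToInclude(mod, registry) {
  const relevant = new Set(mod.usedImportsFrom);
  const possible = new Set(mod.dependencies);
  // A Set can grow while it is iterated, so transitive dependencies added below are visited too.
  for (const id of possible) {
    const dep = registry[id];
    if (!dep.moduleSideEffects || relevant.has(id)) continue;
    if (dep.hasEffects) {
      relevant.add(id);
    } else {
      for (const transitive of dep.dependencies) possible.add(transitive);
    }
  }
  return relevant;
}

const included = dependenciesToInclude(entry, modules);
console.log([...included].sort());
```

Here `constants.js` itself is dropped, but the `polyfill.js` it pulls in survives because it has real side effects. `pure-lib.js` is skipped outright because it is flagged as side-effect free.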
Webpack
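To tree-shake code with Webpack, the following conditions must be met:

You must be in production mode. Webpack only tree-shakes when it minifies the code
The optimization option usedExports must be set to true. This tells Webpack to identify the code it considers unused and mark it during the initial bundling step.
Finally, use a minifier that supports dead code removal. Such a minifier recognizes how Webpack marked the code it considers unused and strips it out. TerserPlugin supports this
Below is a basic Webpack configuration that enables tree shaking: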

// Base Webpack Config for Tree Shaking
const TerserPlugin = require('terser-webpack-plugin');

const config = {
  mode: 'production',
  optimization: {
    usedExports: true,
    minimizer: [
      new TerserPlugin({...})
    ]
  }
};
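Take the demo from the official Webpack documentation as an example: with usedExports enabled, the bundle output (before minification) contains unused harmony export comments, which mark the code that is not used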

src/index.js

import { cube } from './math.js';

function component() {
  const element = document.createElement('pre');

  element.innerHTML = [
    'Hello webpack!',
    '5 cubed is equal to ' + cube(5)
  ].join('\n\n');

  return element;
}

document.body.appendChild(component());
dist/bundle.js

/* 1 */
/***/ (function(module, __webpack_exports__, __webpack_require__) {
  'use strict';
  /* unused harmony export square */
  /* harmony export (immutable) */ __webpack_exports__['a'] = cube;
  function square(x) {
    return x * x;
  }

  function cube(x) {
    return x * x * x;
  }
});
The code is then minified with terser-webpack-plugin, which removes the code marked as unused. Let's look at how these two steps are implemented in the source.
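The mark-then-strip flow can be pictured with a toy model. This is a sketch under assumptions, not Webpack's or Terser's code: `declarations`, `markExports`, and `stripUnreferenced` are hypothetical names, and the stand-in minifier assumes that a declaration can be dropped once the export wiring no longer references it. It ignores the marker comments themselves and any references between declarations.

```javascript
// Hypothetical module: its declarations and the exports that other modules use.
const declarations = {
  square: "function square(x) { return x * x; }",
  cube: "function cube(x) { return x * x * x; }",
};
const usedExports = new Set(["cube"]);

// Phase 1 (bundler): wire up used exports, leave only a comment for unused ones.
function markExports(decls, used) {
  return Object.keys(decls)
    .map((name) =>
      used.has(name)
        ? `/* harmony export */ exports[${JSON.stringify(name)}] = ${name};`
        : `/* unused harmony export ${name} */`
    )
    .join("\n");
}

// Phase 2 (stand-in for a minifier): ignore comments, then drop every
// declaration whose name is no longer referenced by the export wiring.
function stripUnreferenced(decls, wiring) {
  const code = wiring.replace(/\/\*[\s\S]*?\*\//g, "");
  return Object.keys(decls)
    .filter((name) => new RegExp(`\\b${name}\\b`).test(code))
    .map((name) => decls[name]);
}

const wiring = markExports(declarations, usedExports);
const survivors = stripUnreferenced(declarations, wiring);
console.log(wiring);
console.log(survivors);
```

In this model `square` disappears because no non-comment code mentions it any more, not because the second phase reads the marker comment.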

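In lib/optimize/ConcatenatedModule.js in the Webpack source, a Set is first defined to hold all exposed exports that are unused

// Set with all root exposed unused exports
/** @type {Set<string>} */
const unusedExports = new Set();
Next, the code iterates over the dependencies of rootModule, handling HarmonyExportSpecifierDependency and the related export dependency types, and adds the names of the unused ones to unusedExports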

for (const dep of this.rootModule.dependencies) {
if (dep instanceof HarmonyExportSpecifierDependency) {

const used = /** @type {string | false } */ (this.rootModule.getUsedName(
  moduleGraph,
  dep.name
));
if (used) {
  const info = moduleToInfoMap.get(this.rootModule);
  if (!exportsMap.has(used)) {
    exportsMap.set(
      used,
      () => `/* binding */ ${info.internalNames.get(dep.id)}`
    );
  }
} else {
  unusedExports.add(dep.name || "namespace");
}

} else if (dep instanceof HarmonyExportExpressionDependency) {

const used = /** @type {string | false } */ (this.rootModule.getUsedName(
  moduleGraph,
  "default"
));
if (used) {
  const info = moduleToInfoMap.get(this.rootModule);
  if (!exportsMap.has(used)) {
    exportsMap.set(
      used,
      () =>
        `/* default */ ${info.internalNames.get(
          typeof dep.declarationId === "string"
            ? dep.declarationId
            : "__WEBPACK_MODULE_DEFAULT_EXPORT__"
        )}`
    );
  }
} else {
  unusedExports.add("default");
}

} else if (dep instanceof HarmonyExportImportedSpecifierDependency) {

const exportDefs = getHarmonyExportImportedSpecifierDependencyExports(
  dep,
  moduleGraph
);
for (const def of exportDefs) {
  const importedModule = moduleGraph.getModule(dep);
  const info = moduleToInfoMap.get(importedModule);
  const used = /** @type {string | false } */ (this.rootModule.getUsedName(
    moduleGraph,
    def.name
  ));
  if (used) {
    if (!exportsMap.has(used)) {
      exportsMap.set(used, requestShortener => {
        const finalName = getFinalName(
          moduleGraph,
          info,
          def.ids,
          moduleToInfoMap,
          requestShortener,
          runtimeTemplate,
          false,
          false,
          this.rootModule.buildMeta.strictHarmonyModule,
          true
        );
        return `/* reexport */ ${finalName}`;
      });
    }
  } else {
    unusedExports.add(def.name);
  }
}

}
}
Both lib/dependencies/HarmonyExportInitFragment.js and lib/dependencies/HarmonyExportExpressionDependency.js contain code that writes these markers

lib/dependencies/HarmonyExportInitFragment.js

/**
 * @param {GenerateContext} generateContext context for generate
 * @returns {string|Source} the source code that will be included as initialization code
 */
getContent({ runtimeTemplate, runtimeRequirements }) {
runtimeRequirements.add(RuntimeGlobals.exports);
runtimeRequirements.add(RuntimeGlobals.definePropertyGetters);

const unusedPart =

this.unusedExports.size > 1
  ? `/* unused harmony exports ${joinIterableWithComma(
    this.unusedExports
  )} */\n`
  : this.unusedExports.size > 0
    ? `/* unused harmony export ${
    this.unusedExports.values().next().value
    } */\n`
    : "";

const definitions = [];
for (const [key, value] of this.exportMap) {

definitions.push(
  `\n/* harmony export */   ${JSON.stringify(
    key
  )}: ${runtimeTemplate.returningFunction(value)}`
);

}
const definePart =

this.exportMap.size > 0
  ? `/* harmony export */ ${RuntimeGlobals.definePropertyGetters}(${
  this.exportsArgument
  }, {${definitions.join(",")}\n/* harmony export */ });\n`
  : "";

return `${definePart}${unusedPart}`;
}
lib/dependencies/HarmonyExportExpressionDependency.js

HarmonyExportExpressionDependency.Template = class HarmonyExportDependencyTemplate extends NullDependency.Template {
/**
 * @param {Dependency} dependency the dependency for which the template should be applied
 * @param {ReplaceSource} source the current replace source which can be modified
 * @param {DependencyTemplateContext} templateContext the context object
 * @returns {void}
 */
apply(

dependency,
source,
{ module, moduleGraph, runtimeTemplate, runtimeRequirements, initFragments }

) {

const dep = /** @type {HarmonyExportExpressionDependency} */ (dependency);
const used = module.getUsedName(moduleGraph, "default");
const { declarationId } = dep;
const exportsName = module.exportsArgument;
if (declarationId) {
  let name;
  if (typeof declarationId === "string") {
    name = declarationId;
  } else {
    name = "__WEBPACK_DEFAULT_EXPORT__";
    source.replace(
      declarationId.range[0],
      declarationId.range[1] - 1,
      `${declarationId.prefix}${name}${declarationId.suffix}`
    );
  }

  if (used) {
    const map = new Map();
    map.set(used, `/* export default binding */ ${name}`);
    initFragments.push(new HarmonyExportInitFragment(exportsName, map));
  }

  source.replace(
    dep.rangeStatement[0],
    dep.range[0] - 1,
    `/* harmony default export */ ${dep.prefix}`
  );
} else {
  let content;
  if (used) {
    runtimeRequirements.add(RuntimeGlobals.exports);
    if (runtimeTemplate.supportsConst()) {
      const name = "__WEBPACK_DEFAULT_EXPORT__";
      content = `/* harmony default export */ const ${name} = `;
      const map = new Map();
      map.set(used, name);
      initFragments.push(new HarmonyExportInitFragment(exportsName, map));
    } else {
      // This is a little bit incorrect as TDZ is not correct, but we can't use const.
      content = `/* harmony default export */ ${exportsName}[${JSON.stringify(
        used
      )}] = `;
    }
  } else {
    content =
      "/* unused harmony default export */ var _unused_webpack_default_export = ";
  }

  if (dep.range) {
    source.replace(
      dep.rangeStatement[0],
      dep.range[0] - 1,
      content + "(" + dep.prefix
    );
    source.replace(dep.range[1], dep.rangeStatement[1] - 0.5, ");");
    return;
  }

  source.replace(dep.rangeStatement[0], dep.rangeStatement[1] - 1, content);
}

}
};
I have not yet found the code in terser-webpack-plugin or Terser that looks for these markers when removing code

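Summary
Although I have not fully worked out the code-level logic of tree shaking in Webpack, the comparison shows a clear difference. Rollup analyzes first, finds the code that is needed, and then bundles it. Webpack marks first and removes at the end, which is closer to standard DCE.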