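2024-04-28 Writing a Node.js package, starting from a simple byte unit conversion

Reference links:

better-bytes: https://github.com/CaoMeiYouRen/better-bytes

What is the difference between KB / KiB, MB / MiB, GB / GiB, ...? (Zhihu question)

Prefixes for binary multiples

Preface

I have recently been looking into byte unit conversion and found that this seemingly simple problem hides some deep pitfalls. I dug into it and ended up writing a dedicated Node.js package (better-bytes) to solve it.

Here I share how better-bytes came about, as an illustration of how much work goes into a robust Node.js package.

Implementing the format function

How it started

It all started with a code review.

A long time ago, I wrote a byte formatting function that turned out to be wrong.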

/**
 * Format traffic data
 *
 * @author CaoMeiYouRen
 * @date 2019-07-25
 * @export
 * @param {number} data size in bytes (B)
 * @returns {string}
 */
export function formatData(data: number): string {
    const arr = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    for (let i = 0; i < arr.length; i++) {
        if (data < 1024) {
            return `${data.toFixed(2)} ${arr[i]}`
        }
        data /= 1024
    }
    return `${data.toFixed(2)} PB`
}

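Why is it wrong? Because the last line uses the wrong unit: by the time the loop has finished, the value has been divided by 1024 six times, so the unit should be EB.

It is a simple mistake, but because that branch was never actually reached, it went unnoticed for years.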
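The off-by-one unit is easy to reproduce. The sketch below copies the original function unchanged (renamed to formatDataBuggy for illustration only) and feeds it exactly 1 PB and 1 EB in the 1024-based sense used by that function.

```typescript
// Copy of the original function, kept buggy on purpose.
function formatDataBuggy(data: number): string {
    const arr = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    for (let i = 0; i < arr.length; i++) {
        if (data < 1024) {
            return `${data.toFixed(2)} ${arr[i]}`
        }
        data /= 1024
    }
    // Reached only after six divisions by 1024, so the value is in EB here.
    return `${data.toFixed(2)} PB`
}

const onePB = 1024 ** 5
const oneEB = 1024 ** 6

console.log(formatDataBuggy(onePB)) // "1.00 PB" (correct)
console.log(formatDataBuggy(oneEB)) // "1.00 PB" (wrong, should be "1.00 EB")
```

Both inputs print the same string even though they differ by a factor of 1024, which is why the bug only shows up with exabyte-sized values.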

Since ZB and YB units also exist now, the fixed function looks like this:

function formatData(data: number) {
    const units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
    let i = 0
    while (data >= 1024 && i < units.length - 1) {
        data /= 1024
        i++
    }
    return `${data.toFixed(2)} ${units[i]}`
}

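It looks perfect, but this is where the problems begin.

Validating the input

The first and simplest problem is that the argument is never validated. If the caller passes anything other than a number, the function may throw an obscure TypeError (a string has no toFixed method) or silently coerce the value, so this case should be handled explicitly.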

if (typeof data !== 'number') {
    throw new Error('Data must be a number')
}

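Here I chose to throw an exception, so that callers sort out their input before passing it in.

Next, a value that is not finite (NaN, Infinity or -Infinity) also produces a wrong result, so it clearly has to be handled as well.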

if (!Number.isFinite(data)) {
    throw new Error('Data must be finite')
}

Furthermore, if the number is less than 0, the function's logic is simply wrong, so that case needs handling too.

if (data < 0) {
    throw new Error('Data must be greater than or equal to 0')
}
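To see why these checks matter, it helps to run the unchecked version on the problematic values. The sketch below reuses the fixed function from earlier without any validation (renamed to formatDataUnchecked for illustration); it does not crash on these inputs, it just returns meaningless strings.

```typescript
// The fixed function from earlier, without any input validation.
function formatDataUnchecked(data: number): string {
    const units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
    let i = 0
    while (data >= 1024 && i < units.length - 1) {
        data /= 1024
        i++
    }
    return `${data.toFixed(2)} ${units[i]}`
}

console.log(formatDataUnchecked(NaN))      // "NaN B"
console.log(formatDataUnchecked(Infinity)) // "Infinity YB"
console.log(formatDataUnchecked(-2048))    // "-2048.00 B"
```

Silent garbage like this is arguably worse than an exception, because it can travel a long way through a program before anyone notices.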

Everything seems fine now, but one small problem remains: a byte count has to be an integer, because fractional bytes are meaningless.

if (!Number.isInteger(data)) {
    throw new Error('Data must be integer')
}

Of course, a more lenient implementation could round down with Math.floor, but to stay strict an exception is thrown here.
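For comparison, the lenient alternative could look roughly like the sketch below. The helper name toByteCount is hypothetical and is not part of better-bytes; it only illustrates rounding down instead of rejecting fractional input.

```typescript
// Hypothetical helper, not part of better-bytes: accept fractional input
// and round it down instead of throwing.
function toByteCount(data: number): number {
    if (typeof data !== 'number' || !Number.isFinite(data)) {
        throw new Error('Data must be a finite number')
    }
    if (data < 0) {
        throw new Error('Data must be greater than or equal to 0')
    }
    return Math.floor(data)
}

console.log(toByteCount(1023.9)) // 1023
console.log(toByteCount(2048))   // 2048
```

The trade-off is that a caller who accidentally passes a fractional value never finds out, which is the reason the strict variant was preferred.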

Putting it all together, the function now looks like this:

function formatData(data: number) {
    if (typeof data !== 'number') {
        throw new Error('Data must be a number')
    }
    if (!Number.isFinite(data)) {
        throw new Error('Data must be finite')
    }
    if (!Number.isInteger(data)) {
        throw new Error('Data must be integer')
    }
    if (data < 0) {
        throw new Error('Data must be greater than or equal to 0')
    }
    const units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
    let i = 0
    while (data >= 1024 && i < units.length - 1) {
        data /= 1024
        i++
    }
    return `${data.toFixed(2)} ${units[i]}`
}

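Perfect! The input can now only be a non-negative integer (a natural number), so nothing should go wrong any more, right?

...

Really?

1000 or 1024?

What if I told you that this function has been wrong from the very beginning?

Look at the following conversions and decide which of them are correct.

a. 1 KB = 1024 B
b. 1 KB = 1000 B
c. 1 KiB = 1024 B
d. 1 KiB = 1000 B

The answer is simple: b and c are correct, while a, the one most people assume, is strictly speaking wrong.

Claims need evidence, so here is the normative reference: Prefixes for binary multiples

See also this Zhihu question: What is the difference between KB / KiB, MB / MiB, GB / GiB, ...?

I will not go through the difference between KB and KiB in detail here and will just state the conclusion.

According to the standard, the following two conversions are correct:

1 KB = 1000 B
1 KiB = 1024 B

The conversion logic in this article therefore follows this standard.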
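The practical effect of the two bases becomes visible with larger units. As a small worked example (the 500 GB figure is just an illustrative input), the same byte count is expressed once in GB and once in GiB:

```typescript
const KILO = 1000 // decimal prefix: 1 KB = 1000 B
const KIBI = 1024 // binary prefix: 1 KiB = 1024 B

const bytes = 500 * KILO ** 3 // a size labelled "500 GB"

console.log((bytes / KILO ** 3).toFixed(2)) // "500.00" (GB)
console.log((bytes / KIBI ** 3).toFixed(2)) // "465.66" (GiB)
```

The gap grows with every prefix step: about 2.4% at the kilo level and about 7.4% at the giga level.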

Clearly, an option is needed at this point, so let's add one.

The base option

First, list the bases and their unit names.

const KILO_BINARY_BYTE_UNITS = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']
const KILOBYTE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
const KILO_BINARY_BYTE_BASE = 1024
const KILOBYTE_BASE = 1000

An option is needed to choose between kilobinary (powers of 1024) and kilo (powers of 1000) conversion.

/**
 * kilobinary = 2^10 ; kilo = 10^3
 */
type StandardType = 'kilobinary' | 'kilo'

type FormatOptions = {
    standard?: StandardType
}

The function can then be implemented as follows:

function formatData(data: number, options: FormatOptions = {}) {
    if (typeof data !== 'number') {
        throw new Error('Data must be a number')
    }
    if (!Number.isFinite(data)) {
        throw new Error('Data must be finite')
    }
    if (!Number.isInteger(data)) {
        throw new Error('Data must be integer')
    }
    if (data < 0) {
        throw new Error('Data must be greater than or equal to 0')
    }
    const { standard = 'kilobinary' } = options
    const units = standard === 'kilobinary' ? KILO_BINARY_BYTE_UNITS : KILOBYTE_UNITS
    const base = standard === 'kilobinary' ? KILO_BINARY_BYTE_BASE : KILOBYTE_BASE
    let i = 0
    while (data >= base && i < units.length - 1) {
        data /= base
        i++
    }
    return `${data.toFixed(2)} ${units[i]}`
}

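Perfect! Both kilobinary (base 1024) and kilo (base 1000) conversions are now supported.

Polishing the details

Take a closer look at the function.

If the value passed in is smaller than the base (1024 or 1000), no division happens and the result is an exact byte count, so no decimal places should be shown.

// Other code omitted; value, decimal and unitSeparator are defined in the final implementation further down.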
if (i === 0) {
    return `${value}${unitSeparator}${units[i]}`
}
return `${value.toFixed(decimal)}${unitSeparator}${units[i]}`

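Supporting very large numbers

There is one more detail. Now that units up to YiB are supported, have you ever thought about how large 1 YiB actually is?

1 YiB = 1208925819614629174706176 B

1208925819614629174706176 is far larger than Number.MAX_SAFE_INTEGER (2^53 − 1, that is 9007199254740991), so it is worth trying BigInt for arithmetic on very large numbers.

However, division inherently loses precision (BigInt division simply drops the fractional part), so using BigInt does not actually improve precision; it merely adds support for BigInt input.

Byte unit conversion is not currency arithmetic, after all, and does not need that much precision.

If very high precision really is required, consider using big.js.

Also, since BigInt has no toFixed method, no decimal places are kept for BigInt results.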
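The limitations mentioned here can be checked directly. The snippet below is a small illustration and uses only standard BigInt and Number behaviour; the 1536-byte input is an arbitrary example value.

```typescript
const bytes = 1536n

// BigInt division truncates: the fractional .5 is lost.
console.log(bytes / 1024n) // 1n

// Number division keeps the fraction, so toFixed can be applied.
console.log((Number(bytes) / 1024).toFixed(2)) // "1.50"

// BigInt has no toFixed method.
console.log('toFixed' in BigInt.prototype) // false

// 1 YiB is exact as a BigInt but far outside the safe range of Number.
console.log(2n ** 80n)                     // 1208925819614629174706176n
console.log(Number.isSafeInteger(2 ** 80)) // false
```

This is why a formatter that wants decimal places has to fall back to Number as soon as the value is small enough to be represented safely.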

Putting it all together, the final implementation is as follows:

/**
 * base: 2^10 = 1024
 */
export const KILO_BINARY_BYTE_UNITS = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']

/**
 * base: 10^3 = 1000
 */
export const KILOBYTE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']

export const KILO_BINARY_BYTE_BASE = 1024
export const KILOBYTE_BASE = 1000

/**
 * kilobinary = 2^10 ; kilo = 10^3
 */
export type StandardType = 'kilobinary' | 'kilo'

export type FormatOptions = {
    /**
     * base. kilobinary = 2^10 ; kilo = 10^3. Default: kilobinary
     */
    standard?: StandardType
    /**
     * Maximum number of decimal places to include in output. Default: 2.
     */
    decimal?: number
    /**
     * Separator to use between number and unit. Default: ' '.
    */
    unitSeparator?: string
}

/**
 * Format the given value in bytes into a string.
 *
 * @author CaoMeiYouRen
 * @date 2024-04-27
 * @export
 * @param data integer/bigint
 * @param [options={}]
 */
export function format(data: number | bigint, options: FormatOptions = {}): string {
    if (typeof data !== 'number' && typeof data !== 'bigint') {
        throw new Error('Data must be a number or bigint')
    }
    if (typeof data === 'number') {
        if (!Number.isFinite(data)) { // +Infinity/-Infinity/NaN
            throw new Error('Data must be finite')
        }
        if (!Number.isInteger(data)) {
            throw new Error('Data must be integer')
        }
    }
    if (data < 0) {
        throw new Error('Data must be greater than or equal to 0')
    }

    const { standard = 'kilobinary', decimal = 2, unitSeparator = ' ' } = options
    const units = standard === 'kilobinary' ? KILO_BINARY_BYTE_UNITS : KILOBYTE_UNITS
    const base = standard === 'kilobinary' ? KILO_BINARY_BYTE_BASE : KILOBYTE_BASE
    let i = 0
    let value: number | bigint = data

    while (value >= base && i < units.length - 1) {
        if (typeof value === 'number' && value > Number.MAX_SAFE_INTEGER) {
            value = BigInt(Math.floor(value))
        } else if (typeof value === 'bigint' && value <= BigInt(Number.MAX_SAFE_INTEGER)) {
            value = Number(value)
        }
        if (typeof value === 'bigint') {
            value /= BigInt(base)
        } else {
            value /= base
        }
        i++
    }

    if (i === 0 || typeof value === 'bigint') {
        return `${value}${unitSeparator}${units[i]}`
    }
    return `${value.toFixed(decimal)}${unitSeparator}${units[i]}`
}

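Number.MAX_SAFE_INTEGER is the boundary for switching between Number and BigInt: a value above Number.MAX_SAFE_INTEGER is converted to BigInt so that large numbers are handled correctly, and a value at or below it is converted back to Number so that decimal places can be computed.

Implementing the parse function

Why parsing is needed

The format function is done, so is byte unit conversion finished at this point?

Not at all!

Where there is a format function, there has to be a parse function as well.

For a user, typing 1073741824 in bytes is far more tedious than typing 1GiB.

So the burden falls on the programmer, who has to write code that parses the correct byte count.

A search on npm shows that most byte conversion packages do not implement a parse function at all, and bytes.js, which does, does not support kilobinary units. That is why I wrote better-bytes.

Implementing the parser

Compared with formatting, parsing is a real minefield.

Converting a string to number/bigint is much harder than converting number/bigint to a string.

The implementation borrows from bytes.js and also uses a regular expression for parsing.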

const PARSE_REG_EXP = /^((\+)?(\d+(?:\.\d+)?)) *((k|m|g|t|p|e|z|y)?i?b)$/i

function parse(data: string): number | null {
    if (typeof data !== 'string') {
        throw new Error('Data must be a string')
    }   
    const results = PARSE_REG_EXP.exec(data)
    const unit = results?.[4]?.toLowerCase() || 'b'
    const floatValue = Number.parseFloat(results?.[1] || data)
    if (!Number.isFinite(floatValue) || floatValue < 0) {
        return null
    }  
    let i = 0
    i = KILO_BINARY_BYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
    if (i >= 0) {
        return floatValue * KILO_BINARY_BYTE_BASE ** i
    } else {
        i = KILOBYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
        if (i >= 0) {
            return floatValue * KILOBYTE_BASE ** i
        }
    }
    return null
}
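To make the role of the capture groups clearer, the snippet below runs the same regular expression on a few sample strings. The sample inputs are chosen for illustration only.

```typescript
const PARSE_REG_EXP = /^((\+)?(\d+(?:\.\d+)?)) *((k|m|g|t|p|e|z|y)?i?b)$/i

const match = PARSE_REG_EXP.exec('1.5 GiB')
console.log(match?.[1]) // "1.5"  numeric part, including an optional "+"
console.log(match?.[4]) // "GiB"  full unit
console.log(match?.[5]) // "G"    prefix letter only

console.log(PARSE_REG_EXP.exec('-1KiB')) // null: a minus sign is not accepted
console.log(PARSE_REG_EXP.exec('1024'))  // null: the pattern requires a unit
```

Group 1 feeds Number.parseFloat and group 4 is the unit looked up in the unit arrays; inputs that do not match at all fall back to parsing the raw string as a byte count.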

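Parsing can fail in many more ways than formatting. In general, a function can either throw an exception or return null to tell the programmer that something went wrong.

Here I chose to return null to signal a parse failure. My reasoning is that validation of the argument type can be strict and throw, but a string that fails to parse is not necessarily a logic error; it may simply be bad user input, so the caller is left to handle the null case explicitly.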
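On the calling side, this split between throwing and returning null might be used roughly as follows. This is only a sketch: parseTiny is a deliberately simplified stand-in that understands just B and KiB, and readLimit and DEFAULT_LIMIT are hypothetical names introduced for the example.

```typescript
// Simplified stand-in for parse(): understands only "<digits>B" and "<digits>KiB".
function parseTiny(data: string): number | null {
    if (typeof data !== 'string') {
        throw new Error('Data must be a string') // programming error: throw
    }
    const m = /^(\d+) *(KiB|B)$/i.exec(data)
    if (!m) {
        return null // bad user input: let the caller decide
    }
    const value = Number.parseInt(m[1], 10)
    return m[2].toLowerCase() === 'kib' ? value * 1024 : value
}

// The caller decides what a failed parse means, here: fall back to a default.
const DEFAULT_LIMIT = 1024 * 1024
function readLimit(userInput: string): number {
    return parseTiny(userInput) ?? DEFAULT_LIMIT
}

console.log(readLimit('512KiB')) // 524288
console.log(readLimit('lots'))   // 1048576
```

With a null return the fallback is a single ?? expression; with exceptions the same caller would need a try/catch around every piece of user input.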

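Because the units are listed in ascending order, the index of a unit in the array is exactly the exponent of the base, so the result can be computed directly as 1024 ** i.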
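The index-as-exponent idea can be isolated in a few lines. The helper multiplierOf below is hypothetical and exists only to illustrate the lookup for the kilobinary units.

```typescript
const KILO_BINARY_BYTE_UNITS = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']

// Hypothetical helper: multiplier for a kilobinary unit, or null if unknown.
function multiplierOf(unit: string): number | null {
    const i = KILO_BINARY_BYTE_UNITS.findIndex((e) => e.toLowerCase() === unit.toLowerCase())
    return i >= 0 ? 1024 ** i : null
}

console.log(multiplierOf('B'))   // 1
console.log(multiplierOf('MiB')) // 1048576
console.log(multiplierOf('gib')) // 1073741824
console.log(multiplierOf('XiB')) // null
```

Because the position in the array carries the exponent, adding a new unit only means appending it to the array in the right order.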

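Likewise, if the parsed value is not finite or is less than 0, null is returned.

Keeping only the integer part

The implementation above has one problem. As mentioned earlier for the format function, fractional bytes are meaningless, so only the integer part is kept.

result = Math.floor(floatValue * 1024 ** i)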
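A quick check with illustrative values shows why the rounding step is needed: a fractional amount of a larger unit does not always correspond to a whole number of bytes.

```typescript
console.log(0.1 * 1024)             // 102.4, not a whole number of bytes
console.log(Math.floor(0.1 * 1024)) // 102
console.log(Math.floor(1.5 * 1024)) // 1536, already an integer, unchanged
```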

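Supporting input without a unit

bytes.js treats a number without a unit as a value in bytes, and the same behaviour is kept here. There are some changes, though: for example, an invalid unit now returns null.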

const PARSE_REG_EXP = /^((\+)?(\d+(?:\.\d+)?)) *((k|m|g|t|p|e|z|y)?i?b)$/i
const NUMBER_REG = /^(\+)?\d+(\.\d+)?$/
export function parse(data: string): number | null {
    if (typeof data !== 'string') {
        throw new Error('Data must be a string')
    }

    const results = PARSE_REG_EXP.exec(data)
    const unit = results?.[4]?.toLowerCase()
    const floatValue = !unit && NUMBER_REG.test(data) ? Number.parseFloat(data) : Number.parseFloat(results?.[1])
    if (!Number.isFinite(floatValue) || floatValue < 0) {
        return null
    }
    if (!unit) {
        return Math.floor(floatValue)
    }
    // other code omitted
}

Inputs to a byte parser usually carry a unit, so the check that the input is a plain numeric string is only performed when unit is empty.
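The extra NUMBER_REG check matters because Number.parseFloat on its own is quite forgiving. The comparison below uses a few illustrative inputs to show what the regular expression filters out.

```typescript
const NUMBER_REG = /^(\+)?\d+(\.\d+)?$/

// Number.parseFloat is lenient: it stops at the first character it cannot use.
console.log(Number.parseFloat('12abc')) // 12
console.log(Number.parseFloat('1e3'))   // 1000

// NUMBER_REG only accepts plain decimal notation with an optional "+".
console.log(NUMBER_REG.test('1024'))    // true
console.log(NUMBER_REG.test('+1000.5')) // true
console.log(NUMBER_REG.test('12abc'))   // false
console.log(NUMBER_REG.test('1e3'))     // false
console.log(NUMBER_REG.test('-1'))      // false
```

When the test fails and there is no unit, the parser ends up with NaN as the numeric value and returns null instead of quietly accepting a partially numeric string.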

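Forcing base-1024 parsing

Although the standard says 1 KB = 1000 B and 1 KiB = 1024 B, many users, including plenty of programmers, still think of 1 KB as 1024 B.

An option is therefore needed that makes 1 KB count as 1024 B.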

type ParseOptions = {
    /**
     * If true, consider kilo as kilobinary, i.e. using 2^10 base
     */
    forceKilobinary?: boolean
}

function parse(data: string, options: ParseOptions = {}): number | null {
    if (typeof data !== 'string') {
        throw new Error('Data must be a string')
    }

    const { forceKilobinary = false } = options
    const results = PARSE_REG_EXP.exec(data)
    const unit = results?.[4]?.toLowerCase()
    const floatValue = !unit && NUMBER_REG.test(data) ? Number.parseFloat(data) : Number.parseFloat(results?.[1])
    if (!Number.isFinite(floatValue) || floatValue < 0) {
        return null
    }
    if (!unit) {
        return Math.floor(floatValue)
    }
    let i = 0
    let standard: StandardType
    i = KILO_BINARY_BYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
    if (i >= 0) {
        standard = 'kilobinary'
    } else {
        i = KILOBYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
        if (i >= 0) {
            standard = 'kilo'
        }
    }
    if (!standard) {
        return null
    }
    if (forceKilobinary) {
        standard = 'kilobinary'
    }
    const base = standard === 'kilobinary' ? KILO_BINARY_BYTE_BASE : KILOBYTE_BASE
    const result: number = floatValue * base ** i
    return Math.floor(result)
}

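Now parse('1KB', { forceKilobinary: true }) also yields 1024.

All set!

Supporting very large numbers

As with the format function, the next step is to try to support BigInt.

8 PiB = 9007199254740992 B already exceeds Number.MAX_SAFE_INTEGER (9007199254740991), so when the value being parsed is very large it should be converted to BigInt to keep the multiplication precise.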
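The precision problem can be demonstrated with plain arithmetic. The snippet below is an illustration only; it multiplies Number.MAX_SAFE_INTEGER by 1000 once as a Number and once as a BigInt.

```typescript
// 8 PiB is exactly one past the largest safe integer.
console.log(8 * 1024 ** 5 === Number.MAX_SAFE_INTEGER + 1) // true

// Beyond that point Number arithmetic rounds silently.
const big = Number.MAX_SAFE_INTEGER   // 9007199254740991
const viaNumber = BigInt(big * 1000)  // rounded before the conversion
const viaBigInt = BigInt(big) * 1000n // exact

console.log(viaBigInt)               // 9007199254740991000n
console.log(viaNumber === viaBigInt) // false
```

Multiplying by 1024 happens to stay exact for longer because it only changes the binary exponent, but the base-1000 path loses digits as soon as the product leaves the safe range, so the switch to BigInt has to happen before the multiplication, not after it.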

Putting it all together, the final implementation is as follows:

export const PARSE_REG_EXP = /^((\+)?(\d+(?:\.\d+)?)) *((k|m|g|t|p|e|z|y)?i?b)$/i
export const NUMBER_REG = /^(\+)?\d+(\.\d+)?$/

export type ParseOptions = {
    /**
     * If true, consider kilo as kilobinary, i.e. using 2 ^ 10 base
     */
    forceKilobinary?: boolean
}

/**
 * Parse the string value into an integer in bytes.
 * If no unit is given, it is assumed the value is in bytes.
 * @author CaoMeiYouRen
 * @date 2024-04-27
 * @export
 * @param data
 */
export function parse(data: string, options: ParseOptions = {}): number | bigint | null {
    if (typeof data !== 'string') {
        throw new Error('Data must be a string')
    }

    const { forceKilobinary = false } = options
    const results = PARSE_REG_EXP.exec(data)
    const unit = results?.[4]?.toLowerCase()
    const floatValue = !unit && NUMBER_REG.test(data) ? Number.parseFloat(data) : Number.parseFloat(results?.[1])
    if (!Number.isFinite(floatValue) || floatValue < 0) {
        return null
    }
    if (!unit) {
        return Math.floor(floatValue)
    }
    let i = 0
    let standard: StandardType
    i = KILO_BINARY_BYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
    if (i >= 0) {
        standard = 'kilobinary'
    } else {
        i = KILOBYTE_UNITS.findIndex((e) => e.toLowerCase() === unit)
        if (i >= 0) {
            standard = 'kilo'
        }
    }
    if (!standard) {
        return null
    }
    if (forceKilobinary) {
        standard = 'kilobinary'
    }
    const base = standard === 'kilobinary' ? KILO_BINARY_BYTE_BASE : KILOBYTE_BASE
    let result: number | bigint = floatValue
    for (let j = 0; j < i; j++) {
        const nextResult: number | bigint = typeof result === 'bigint' ? result * BigInt(base) : result * base
        if (typeof result === 'number' && nextResult > Number.MAX_SAFE_INTEGER) {
            result = BigInt(Math.floor(result)) * BigInt(base)
        } else {
            result = nextResult
        }
    }
    if (typeof result === 'number') {
        result = Math.floor(result)
    }
    return result
}

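At this point both the format and parse functions are complete.

Mission accomplished!

Adding jest tests

Writing the jest test cases

There is one more step, though: adding test cases.

Clearly, a basic utility package made of pure functions is not convincing without rigorous tests.

Writing test cases is tedious, however, so it is worth considering AI to generate them.

The complete test cases are shown below (mostly generated by Chat LangChain, with modifications):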

/* eslint-disable no-loss-of-precision, @typescript-eslint/no-loss-of-precision */
import { format, parse } from '../src'

describe('format', () => {
    it('should format bytes correctly for numbers', () => {
        expect(format(0)).toBe('0 B')
        expect(format(1023)).toBe('1023 B')
        expect(format(1024)).toBe('1.00 KiB')
        expect(format(1048576)).toBe('1.00 MiB')
        expect(format(1073741824)).toBe('1.00 GiB')
        expect(format(1099511627776)).toBe('1.00 TiB')
        expect(format(1125899906842624)).toBe('1.00 PiB')
        expect(format(1152921504606846976)).toBe('1.00 EiB')
        expect(format(1180591620717411303424)).toBe('1.00 ZiB')
        expect(format(1208925819614629174706176)).toBe('1.00 YiB')
    })

    it('should format bytes correctly for bigints', () => {
        expect(format(0n)).toBe('0 B')
        expect(format(1023n)).toBe('1023 B')
        expect(format(1024n)).toBe('1.00 KiB')
        expect(format(1048576n)).toBe('1.00 MiB')
        expect(format(1073741824n)).toBe('1.00 GiB')
        expect(format(1099511627776n)).toBe('1.00 TiB')
        expect(format(1125899906842624n)).toBe('1.00 PiB')
        expect(format(1152921504606846976n)).toBe('1.00 EiB')
        expect(format(1180591620717411303424n)).toBe('1.00 ZiB')
        expect(format(1208925819614629174706176n)).toBe('1.00 YiB')
    })

    it('should handle decimal values for numbers', () => {
        expect(format(512)).toBe('512 B')
        expect(format(2048)).toBe('2.00 KiB')
        expect(format(3072)).toBe('3.00 KiB')
        expect(format(1572864)).toBe('1.50 MiB')
    })

    it('should handle large values', () => {
        expect(format(9223372036854775807)).toBe('8.00 EiB')
        expect(format(18446744073709551615)).toBe('16.00 EiB')
        expect(format(18446744073709551616n)).toBe('16.00 EiB')
        expect(format(36893488147419103232n)).toBe('32.00 EiB')

        expect(format(1152921504606846976.1)).toBe('1.00 EiB')
        expect(format(1180591620717411303424.1)).toBe('1.00 ZiB')
        expect(format(1208925819614629174706176.1)).toBe('1.00 YiB')

        expect(format(10889035741470028412976348208558233354240n)).toBe('9007199254740990 YiB')
        expect(format(9007199254740990000000000000000000000000n, { standard: 'kilo' })).toBe('9007199254740990 YB')

        expect(format(10889035741470029621902167823187408060416n)).toBe('9007199254740991 YiB')
        expect(format(9007199254740991000000000000000000000000n, { standard: 'kilo' })).toBe('9007199254740991 YB')

        expect(format(10889035741470030830827987437816582766592n)).toBe('9007199254740992 YiB')
        expect(format(9007199254740992000000000000000000000000n, { standard: 'kilo' })).toBe('9007199254740992 YB')

        expect(format(10633823966279325802638835764831453184n)).toBe('8796093022208.00 YiB')
        expect(format(9007199254740991000000000000000000000n, { standard: 'kilo' })).toBe('9007199254740.99 YB')

    })

    it('should format bytes correctly with kilobinary standard', () => {
        expect(format(1024, { standard: 'kilobinary' })).toBe('1.00 KiB')
        expect(format(1572864, { standard: 'kilobinary' })).toBe('1.50 MiB')
        expect(format(1073741824, { standard: 'kilobinary' })).toBe('1.00 GiB')
    })

    it('should format bytes correctly with kilo standard', () => {
        expect(format(1000, { standard: 'kilo' })).toBe('1.00 KB')
        expect(format(1500000, { standard: 'kilo' })).toBe('1.50 MB')
        expect(format(1000000000, { standard: 'kilo' })).toBe('1.00 GB')
    })

    it('should format bytes correctly with custom decimal places', () => {
        expect(format(1234, { decimal: 3 })).toBe('1.205 KiB')
        expect(format(1234567, { decimal: 0 })).toBe('1 MiB')
    })

    it('should format bytes correctly with custom decimal places for kilobinary standard', () => {
        expect(format(1234, { decimal: 3, standard: 'kilobinary' })).toBe('1.205 KiB')
        expect(format(1234567, { decimal: 0, standard: 'kilobinary' })).toBe('1 MiB')
        expect(format(1234567890, { decimal: 4, standard: 'kilobinary' })).toBe('1.1498 GiB')
    })

    it('should format bytes correctly with custom decimal places for kilo standard', () => {
        expect(format(1234, { decimal: 3, standard: 'kilo' })).toBe('1.234 KB')
        expect(format(1234567, { decimal: 0, standard: 'kilo' })).toBe('1 MB')
        expect(format(1234567890, { decimal: 4, standard: 'kilo' })).toBe('1.2346 GB')
    })

    it('should not exceed the maximum number of decimal places', () => {
        expect(format(1234, { decimal: 10, standard: 'kilobinary' })).toBe('1.2050781250 KiB')
        expect(format(1234, { decimal: 10, standard: 'kilo' })).toBe('1.2340000000 KB')
    })

    it('should use the default space separator', () => {
        expect(format(1024)).toBe('1.00 KiB')
        expect(format(1000, { standard: 'kilo' })).toBe('1.00 KB')
    })

    it('should use the provided separator', () => {
        expect(format(1024, { unitSeparator: '_' })).toBe('1.00_KiB')
        expect(format(1000, { standard: 'kilo', unitSeparator: '-' })).toBe('1.00-KB')
    })

    it('should handle empty separator', () => {
        expect(format(1024, { unitSeparator: '' })).toBe('1.00KiB')
        expect(format(1000, { standard: 'kilo', unitSeparator: '' })).toBe('1.00KB')
    })

    it('should throw an error when data is negative', () => {
        expect(() => format(-1)).toThrow('Data must be greater than or equal to 0')
        expect(() => format(-100)).toThrow('Data must be greater than or equal to 0')
        expect(() => format(-1024)).toThrow('Data must be greater than or equal to 0')
        expect(() => format(-1n)).toThrow('Data must be greater than or equal to 0')
        expect(() => format(-100n)).toThrow('Data must be greater than or equal to 0')
        expect(() => format(-1024n)).toThrow('Data must be greater than or equal to 0')
    })

    it('should throw an error if data is not a number or bigint', () => {
        expect(() => format('invalid' as any)).toThrow('Data must be a number or bigint')
        expect(() => format(true as any)).toThrow('Data must be a number or bigint')
        expect(() => format({} as any)).toThrow('Data must be a number or bigint')
    })

    it('throws error for non-finite numbers', () => {
        expect(() => format(Infinity)).toThrow('Data must be finite')
        expect(() => format(-Infinity)).toThrow('Data must be finite')
        expect(() => format(NaN)).toThrow('Data must be finite')
    })

    it('should throw an error on non integers', () => {
        expect(() => format(1.1)).toThrow('Data must be integer')
        expect(() => format(-1.1)).toThrow('Data must be integer')
    })
})

describe('parse', () => {
    it('should parse valid binary units', () => {
        expect(parse('1B')).toBe(1)
        expect(parse('1KiB')).toBe(1024)
        expect(parse('1MiB')).toBe(1048576)
        expect(parse('1GiB')).toBe(1073741824)
        expect(parse('1TiB')).toBe(1099511627776)
        expect(parse('1PiB')).toBe(1125899906842624)
        expect(parse('1EiB')).toBe(1152921504606846976n)
        expect(parse('1ZiB')).toBe(1180591620717411303424n)
        expect(parse('1YiB')).toBe(1208925819614629174706176n)
    })

    it('should parse valid decimal units', () => {
        expect(parse('1B')).toBe(1)
        expect(parse('1KB')).toBe(1000)
        expect(parse('1MB')).toBe(1000000)
        expect(parse('1GB')).toBe(1000000000)
        expect(parse('1TB')).toBe(1000000000000)
        expect(parse('1PB')).toBe(1000000000000000)
        expect(parse('1EB')).toBe(1000000000000000000n)
        expect(parse('1ZB')).toBe(1000000000000000000000n)
        expect(parse('1YB')).toBe(1000000000000000000000000n)
    })

    it('should parse with forceKilobinary option', () => {
        expect(parse('1KB', { forceKilobinary: true })).toBe(1024)
        expect(parse('1MB', { forceKilobinary: true })).toBe(1048576)
        expect(parse('1GB', { forceKilobinary: true })).toBe(1073741824)
    })

    it('should parse decimal values', () => {
        expect(parse('1.5KiB')).toBe(1536)
        expect(parse('1.5KB')).toBe(1500)
    })

    it('should handle case insensitive units', () => {
        expect(parse('1kib')).toBe(1024)
        expect(parse('1KB')).toBe(1000)
    })

    it('should handle no units', () => {
        expect(parse('1024')).toBe(1024)
        expect(parse('1000')).toBe(1000)
        expect(parse('1000.5')).toBe(1000)
    })

    it('should return an integer', () => {
        expect(parse('1.5B')).toBe(1)
        expect(parse('1023.5B')).toBe(1023)
    })

    it('should handle positive values', () => {
        expect(parse('+1KiB')).toBe(1024)
    })

    it('should return null for invalid input', () => {
        expect(parse('invalid')).toBeNull()
        expect(parse('1XB')).toBeNull()
        expect(parse('-1KiB')).toBeNull()
        expect(parse('Infinity')).toBeNull()
        expect(parse('-Infinity')).toBeNull()
        expect(parse('-1')).toBeNull()
    })

    it('should throw error for non-string input', () => {
        expect(() => parse(123 as unknown as string)).toThrow('Data must be a string')
    })

    it('should return bigint for values greater than MAX_SAFE_INTEGER', () => {
        expect(parse('9007199254740992 YiB')).toBe(10889035741470030830827987437816582766592n)
        expect(parse('9007199254740992 YB')).toBe(9007199254740992000000000000000000000000n)
    })

    it('should return number for values within MAX_SAFE_INTEGER', () => {
        expect(parse('9007199254740991 ZiB')).toBe(10633823966279325802638835764831453184n)
        expect(parse('9007199254740991 ZB')).toBe(9007199254740991000000000000000000000n)
    })

    it('should handle values with many decimal places', () => {
        expect(parse('1.123456789012345 GiB')).toBe(1206302541)
        expect(parse('1.123456789012345 GB')).toBe(1123456789)

        expect(parse('1.123456789012345 PiB')).toBe(1264899894090712)
        expect(parse('1.123456789012345 PB')).toBe(1123456789012345)

        expect(parse('1.123456789012345 ZiB')).toBe(1326343671346062426112n)
        expect(parse('1.123456789012345 ZB')).toBe(1123456789012345000000n)

        expect(parse('1.123456789012345 YiB')).toBe(1358175919458367924338688n)
        expect(parse('1.123456789012345 YB')).toBe(1123456789012345000000000n)
    })

    it('should handle values with many decimal places and return bigint when necessary', () => {
        const valueWithManyDecimals = '9007199254740991.123456789012345'
        const expectedBinaryResult = 10889035741470029621902167823187408060416n
        const expectedDecimalResult = 9007199254740991000000000000000000000000n

        expect(parse(`${valueWithManyDecimals}YiB`)).toBe(expectedBinaryResult)
        expect(parse(`${valueWithManyDecimals}YB`)).toBe(expectedDecimalResult)
    })
})

describe('parse format result', () => {
    test('format and parse should be inverse operations', () => {
        const testCases = [
            0,
            1,
            1024,
            1572864,
            1073741824,
            1125899906842624,
            9007199254740992n,
            10889035741470029621902167823187408060416n,
        ]

        for (const data of testCases) {
            const formattedString = format(data)
            const parsedValue = parse(formattedString)

            expect(parsedValue).toEqual(data)
        }
    })

    test('format and parse with options', () => {
        const testCases = [
            { data: 1572864, options: { standard: 'kilobinary' } },
            { data: 1500000, options: { standard: 'kilo' } },
            { data: 123456, options: { decimal: 3, standard: 'kilobinary' } },
            { data: 123456, options: { decimal: 3, standard: 'kilo' } },
            { data: '1KB', options: { forceKilobinary: true } },
            { data: '1GB', options: { forceKilobinary: true } },
        ]

        for (const { data, options } of testCases) {
            const formattedString = format(typeof data === 'string' ? parse(data, options) as any : data as any, options as any)
            const parsedValue = parse(formattedString, options)

            expect(parsedValue).toEqual(typeof data === 'string' ? parse(data, options) : data)
        }
    })

    test('format and parse precision error should be within 0.5%', () => {
        const testCases = [
            1234567,
            12345678,
            123456789,
            1234567890,
            12345678901,
            123456789012,
            1234567890123,
            12345678901234,
            123456789012345,
            1234567890123456,
            12345678901234567n,
            123456789012345678n,
            1234567890123456789n,
        ]

        for (const data of testCases) {
            const formattedString = format(data)
            const parsedValue = parse(formattedString) as (number | bigint)
            const error = Math.abs((Number(parsedValue) - Number(data)) * 100 / Number(data))
            expect(error).toBeLessThan(0.5)
        }
    })

    test('format and parse precision should improve with more decimal places', () => {
        const testCases = [
            123456789,
            1234567890,
            12345678901,
            123456789012,
            1234567890123,
            12345678901234,
            123456789012345,
            1234567890123456,
            12345678901234567n,
            123456789012345678n,
            1234567890123456789n,
            12345678901234567890n,
            123456789012345678901n,
            1234567890123456789012n,
            12345678901234567890123n,
            123456789012345678901234n,
            1234567890123456789012345n,
        ]

        for (const data of testCases) {
            const formattedString = format(data, { decimal: 10 })
            const parsedValue = parse(formattedString)
            const error = Math.abs((Number(parsedValue) - Number(data)) * 100 / Number(data))
            expect(error).toBeLessThan(1e-8)
        }
    })
})

The test results are shown below; all test cases pass:

PASS  test/index.test.ts                                                                                                                         
  format
    √ should format bytes correctly for numbers (4 ms)                                                                                            
    √ should format bytes correctly for bigints (1 ms)                                                                                            
    √ should handle decimal values for numbers                                                                                                    
    √ should handle large values (1 ms)                                                                                                           
    √ should format bytes correctly with kilobinary standard                                                                                      
    √ should format bytes correctly with kilo standard (1 ms)                                                                                     
    √ should format bytes correctly with custom decimal places (1 ms)                                                                             
    √ should format bytes correctly with custom decimal places for kilobinary standard                                                            
    √ should format bytes correctly with custom decimal places for kilo standard
    √ should not exceed the maximum number of decimal places                                                                                      
    √ should use the default space separator                                                                                                      
    √ should use the provided separator (1 ms)                                                                                                    
    √ should handle empty separator                                                                                                               
    √ should throw an error when data is negative (17 ms)                                                                                         
    √ should throw an error if data is not a number or bigint (1 ms)                                                                              
    √ throws error for non-finite numbers (2 ms)                                                                                                  
    √ should throw an error on non integers (1 ms)                                                                                                
  parse                                                                                                                                           
    √ should parse valid binary units (1 ms)                                                                                                      
    √ should parse valid decimal units (1 ms)                                                                                                     
    √ should parse with forceKilobinary option                                                                                                    
    √ should parse decimal values                                                                                                                 
    √ should handle case insensitive units (1 ms)                                                                                                 
    √ should handle no units (1 ms)                                                                                                               
    √ should return an integer                                                                                                                    
    √ should handle positive values (1 ms)                                                                                                        
    √ should return null for invalid input (1 ms)                                                                                                 
    √ should throw error for non-string input                                                                                                     
    √ should return bigint for values greater than MAX_SAFE_INTEGER                                                                               
    √ should return number for values within MAX_SAFE_INTEGER (1 ms)                                                                              
    √ should handle values with many decimal places (3 ms)                                                                                        
    √ should handle values with many decimal places and return bigint when necessary                                                              
  parse format result                                                                                                                             
    √ format and parse should be inverse operations (1 ms)                                                                                        
    √ format and parse with options (1 ms)                                                                                                        
    √ format and parse precision error should be within 0.5% (1 ms)                                                                               
    √ format and parse precision should improve with more decimal places (2 ms)

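Generating the coverage report

Test cases should cover as many situations as possible, and coverage is a way to measure that.

Once jest is configured, the following command generates the coverage report.

jest --coverage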

File	% Stmts	% Branch	% Funcs	% Lines	Uncovered Line #s
All files	98.57	97.05	100	98.5
index.ts	98.57	97.05	100	98.5	105

Statement coverage reaches 98.57%, which is very thorough.
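The jest configuration itself is not shown in this article. A minimal sketch of what such a configuration could look like is given below; it assumes ts-jest is installed and that the sources live under src/, and the 90% threshold is a hypothetical quality gate rather than the project's actual setting.

```typescript
// jest.config.ts: a minimal sketch, not the project's actual configuration.
// Assumes ts-jest is installed so that TypeScript tests can run directly.
const config = {
    preset: 'ts-jest',
    testEnvironment: 'node',
    // Collect coverage only from the library source, not from the tests.
    collectCoverageFrom: ['src/**/*.ts'],
    coverageDirectory: 'coverage',
    // Hypothetical quality gate: fail the run if coverage drops below 90%.
    coverageThreshold: {
        global: { statements: 90, branches: 90, functions: 90, lines: 90 },
    },
}

export default config
```

A threshold like this turns the coverage number from something to admire into something the test run enforces.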

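Publishing the project

With the code written, the next step is publishing.

For a Node.js project, the source code usually goes to GitHub and the package goes to npm.

Publishing to GitHub

Publishing to GitHub is simple. After creating the repository, run the following commands

git remote add origin git@github.com:xxx/yyy.git
git push origin master

to upload the code to GitHub.

Publishing to npm

To publish to npm, use the following command

npm publish

Remember to bump the version number first.

If you have changed your npm registry, remember to switch it back to the official one, https://registry.npmjs.org

Automating releases with semantic-release

See: 2020-02-18 Customizing the changelog and automating github-release

2020-12-08 Using GitHub Actions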

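Uploading the coverage report

The jest coverage report generated earlier should of course be uploaded so that others can see it.

See: Measuring Typescript Code Coverage with Jest and GitHub Actions

Following that guide, the jest coverage report can be uploaded to Codecov.

The Codecov coverage report for better-bytes (shown as an image in the original post) looks quite good.

That is the whole process of writing and publishing a Node.js package.

Summary

This article walks through writing a robust Node.js package, better-bytes, for byte unit conversion, from scratch.

The process touches every stage needed for a robust, high-quality open-source Node.js package and shares practices such as input validation, test coverage and automated releases, which are useful for improving code quality and development efficiency.

[Summary generated by Chat LangChain]

Author: 草梅友仁 (CaoMeiYouRen)
Article URL: https://blog.cmyr.ltd/archives/1d4ed065.html
Copyright: please credit the source when reposting.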
