A powerful configuration conversion tool based on protobuf.


License
MIT
Install
go get github.com/Wenchy/tableau

Documentation

💡 Future development will be at https://github.com/tableauio/tableau.

Tableau

A powerful configuration conversion tool based on Protobuf(proto3).

Features

  • Convert xlsx to JSON, JSON is the first-class citizen of exporting targets.
  • Use protobuf as the IDL(Interface Description Language) to define the structure of xlsx.
  • Use golang to develop the conversion engine.
  • Support multiple programming languages, thanks to protobuf.

Concepts

  • Importer: xlsx/xml importer.
  • IR: Intermediate Representation.
  • Filter: filter the IR.
  • Exporter: JSON, protowire, prototext.
  • Protoconf: a configuration metadata format based on protobuf.

Workflow

xlsx -> Importer -> Protoconf -> Exporter -> JSON/protowire/prototext

Types

  • Scalar
  • Message(struct)
  • List
  • Map(unordered)
  • Timestamp
  • Duration

TODO

protoc plugins

  • Golang
  • C++
  • C#/.NET
  • Python
  • Lua
  • Javascript/Typescript/Node
  • Java

Metadata

  • metatable: a message to describe the worksheet's metadata.
  • metafield: a message to describe the caption's metadata.
  • captrow: caption row, the exact row number of captions at worksheet. Newline in caption is allowed for more readability, and will be trimmed in conversion.
  • descrow: description row, the exact row number of descriptions at worksheet.
  • datarow: data row, the start row number of data.

Newline(line break) in major operating systems:

OS Abbreviation Escape sequence
Unix (linux, OS X) LF \n
Microsoft Windows CRLF \r\n
classic Mac OS/OS X CR \r

LF: Line Feed, CR: Carriage Return.

Mac OS X

Generator

  • generate protoconf by xlsx(header): xlsx -> protoconf
  • generate xlsx(header) by protoconf: protoconf -> xlsx

Conversion

  • xlsx -> JSON(default format and human readable)
  • xlsx -> protowire(small size)
  • xlsx -> prototext(human debugging)
  • JSON -> xlsx
  • protowire -> xlsx
  • prototext -> xlsx

Pretty Print

  • Multiline: every textual element on a new line
  • Indent: 4 space characters
  • JSON support
  • prototext support

EmitUnpopulated

  • JSON: EmitUnpopulated specifies whether to emit unpopulated fields.

Scalar Types

  • interger: int32, uint32, int64 and uint64
  • float: float and double
  • bool
  • string
  • bytes
  • datetime, date, time, duration

Enumerations

  • enum: The Parser accepts three enum value forms:
    • enum value number
    • enum value name
    • enum value alias name (with EnumValueOptions specified)
  • enum: validate the enum value.

Composite Types

  • message: horizontal(row direction) layout, fields located in cells.
  • message: simple in-cell message, each field must be scalar type. It is a comma-separated list of fields. E.g.: 1,test,3.0. List's size need not to be equal to fields' size, as fields will be filled in order. Fields not configured will be filled default values due to its scalar type.
  • list: horizontal(row direction) layout, which is list's default layout, and each item can be message or scalar.
  • list: vertical(column direction) layout. and each item should be message.
  • list: simple in-cell list, element must be scalar type. It is a comma-separated list of elements. E.g.: 1,2,3.
  • list: keyed list, auto aggregate the struct with the same key field.
  • list: scalable or dynamic list size.
  • list: smart recognition of empty element at any position.
  • map: horizontal(row direction) layout.
  • map: vertical(column direction) layout, and is map's default layout.
  • map: unordered map or hash map.
  • map: ordered map.
  • map: simple in-cell map, both key and value must be scalar type. It is a comma-separated list of key:value pairs. E.g.: 1:10,2:20,3:30.
  • map: scalable or dynamic map size.
  • map: smart recognition of empty value at any position.
  • nesting: unlimited nesting of message, list, and map.
  • nesting: the composite type's first element can be composite type.

Default Values

Each scalar type's default value is same as protobuf.

  • interger: 0
  • float: 0.0
  • bool: false
  • string: ""
  • bytes: ""
  • in-cell message: each field's default value is same as protobuf
  • in-cell list: element's default value is same as protobuf
  • in-cell map: both key and value's default value are same as protobuf
  • message: all fields have default values

Empty

  • scalar: default value same as protobuf.
  • message: empty message will not be spawned if all fields are empty.
  • list: empty list will not be spawned if list's size is 0.
  • list: empty message will not be appended if list's element(message type) is empty.
  • map: empty map will not be spawned if map's size is 0.
  • map: empty message will not be inserted if map's value(message type) is empty.
  • nesting: recursively empty.

Merge

  • merge multiple workbooks
  • merge multiple worksheets

Workbook meta

workbook meta sheet @TABLEAU:

  • specify which sheets to be parsed
  • specify parser options for each sheet
Sheet Alias Nameline Typeline
Sheet1 ExchangeInfo 2 2

Datetime

Understanding about RFC 3339 for Datetime and Timezone Formatting in Software Engineering

# This is acceptable in ISO 8601 and RFC 3339 (with T)
2019-10-12T07:20:50.52Z
# This is only accepted in RFC 3339 (without T)
2019-10-12 07:20:50.52Z
  • "Z" stands for Zero timezone or Zulu timezone UTC+0, and equal to +00:00 in the RFC 3339.
  • RFC 3339 follows the ISO 8601 DateTime format. The only difference is RFC allows us to replace "T" with "space".

Use RFC 3339 , which is following ISO 8601.

  • Timestamp: based on google.protobuf.Timestamp, see JSON mapping
  • Timezone: see ParseInLocation
  • DST: Daylight Savings Time. There is no plan to handle this boring stuff.
  • Datetime: excel format: yyyy-MM-dd HH:mm:ss, e.g.: 2020-01-01 05:10:00
  • Date: excel format: yyyy-MM-dd or yyMMdd, e.g.: 2020-01-01 or 20200101
  • Time: excel format: HH:mm:ss or HHmmss, e.g.: 05:10:00 or 051000
  • Duration: based ongoogle.protobuf.Duration , see JSON mapping
  • Duration: excel format: form "72h3m0.5s", see golang duration string form

Transpose

  • Interchange the rows and columns of a worksheet.

Validation

  • Min
  • Max
  • Range
  • Options: e.g.: enum type
  • Foreign key

Error Message

  • Report clear and precise error messages when converter failed, please refer to the programming language compiler
  • Use golang template to define error message template
  • Multiple languages support, focused on English and Simplified Chinese

Performace

  • Stress test
  • Each goroutine process one worksheet
  • Mutiple process model

Optimization

Contribution

Requirements

Protobuf

Goto Protocol Buffers v3.17.3, choose and download the correct platform of protoc, then install by README.

protoc-gen-go

Install: go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.27.1