Documentation
Translated from the Chinese documentation (中文文档) by GPT 3.5
A tool that binds dataclass classes to ArgumentParser so that command-line parameters can be defined quickly. The binding supports most ArgumentParser features, such as help text, required parameters, and multiple values, as detailed below. It also supports Python's type hints as far as possible.
Quick Start
Python version: >=3.7
pip install parser-binding
demo.py:
from dataclasses import dataclass, field
from enum import Enum

from parser_binding import BindingParser, Field


class LogLevel(Enum):
    DEBUG = 'debug'
    INFO = 'info'
    WARNING = 'warning'
    ERROR = 'error'


@dataclass
class TestOptions:
    input_file: str = Field(
        default=None, aliases=['i'], help='The input file to read.'
    )
    workers: int = 1
    logging_level: LogLevel = LogLevel.WARNING
    verbose: bool = False


if __name__ == '__main__':
    parser = BindingParser(TestOptions)
    options = parser.parse_into_dataclasses((TestOptions, ))
    print(options)
Run python demo.py -h to get the following output:
usage: demo.py [-h] [-i INPUT_FILE] [--workers WORKERS] [--logging-level {debug,info,warning,error}] [--with-verbose]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE, --input_file INPUT_FILE
                        The input file to read. Optional. Default `None`.
  --workers WORKERS     Optional. Default `1`.
  --logging-level {debug,info,warning,error}, --logging_level {debug,info,warning,error}
                        Optional. Default `warning`.
  --with-verbose, --with_verbose
                        Optional. Default `False` as verbose disabled.
Run python demo.py -i ./test.txt --workers 2 --logging-level debug --with-verbose and you automatically get an instance of the dataclass populated with the parameters read from the command line:
_MergedDataClass@1704877855657(input_file='./test.txt', workers=2, logging_level=<LogLevel.DEBUG: 'debug'>, verbose=True)
Formatting Guide
Command-line parameter names are formatted in three ways:
- --xxx-yyy: when the class property name contains _, it is converted to a command-line parameter name that starts with -- and uses - as the separator.
- --xxx_yyy: when the class property name contains _, a parameter name that starts with -- and keeps _ as the separator is also registered as an alternative.
- -x: shortcut option. When the property name is a single character, or one of the specified aliases is a single character, the short form starting with - is used as one of the parameter names.
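The three naming rules above can be sketched as a small helper. This is only an illustration of the rules as described, not parser-binding's actual code; the function name option_names is hypothetical.

```python
from typing import List, Sequence


def option_names(name: str, aliases: Sequence[str] = ()) -> List[str]:
    """Return the command-line flags for a property name and its aliases.

    Hypothetical helper that mirrors the rules described above.
    """
    flags: List[str] = []
    for candidate in (name, *aliases):
        if len(candidate) == 1:
            # Single characters become shortcut options such as -i.
            flags.append(f'-{candidate}')
            continue
        dashed = '--' + candidate.replace('_', '-')
        underscored = '--' + candidate
        for flag in (dashed, underscored):
            # Names without an underscore yield the same flag twice.
            if flag not in flags:
                flags.append(flag)
    return flags


print(option_names('input_file', aliases=['i']))
print(option_names('workers'))
```

For input_file with the alias i, this yields the same three names that appear in the help output of the Quick Start example.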
Currently, it supports most types in Python:
- Basic Types
  - int: Integer type; command-line values are automatically converted to integers.
  - float: Floating-point type; command-line values are automatically converted to floating-point numbers.
  - str: String type; command-line values are automatically converted to strings.
  - bytes: Byte type; strings are converted directly using UTF-8 encoding. Other, more complex data is not supported; if another conversion is needed, implement it by specifying the type.
  - bool: Boolean type; a switch option is automatically added to the command line, and the switch name depends on the default value. When the default value is True, the switch name is formatted as --without-xxx-yyy and --without_xxx_yyy; when the default value is False, the switch name starts with --with. Note: Boolean properties are only ever formatted as switches and take no value on the command line. Keep in mind that changing the default value changes the parameter name.
- Types with Generic Annotations: When using generic annotations, explicitly specify the type inside the generic. Complex types such as Union[str, int] are not directly supported.
  - Optional: Used to define optional properties; provide a default value when using this annotation. Every property with a default value is treated as an optional command-line parameter that falls back to that default.
  - List / list: When using the List annotation, specify the element type explicitly, such as List[int]. The corresponding command-line parameter becomes a multi-value parameter, with spaces as the default separator between values, such as --multiple 1 2 3. Each value passed in is converted to the explicit element type, so every element of a List[int] property is an int rather than a str. With the plain list annotation, elements are not converted by default; if conversion is needed, specify it with the type parameter (as shown in the example later).
  - Tuple / tuple: Same as List or list; the only difference is the final collection class.
  - Set / set: Same as above.
  - Deque / deque: Same as above.
  - Queue: Same as above.
  These collection classes use a space as the default separator between multiple values. If a custom separator is needed, specify it using sep (as shown in the subsequent example).
  - Dict / dict: When using Dict as the property type, explicitly specify the key and value types. With the plain dict annotation, keys and values are not converted. By default, the command-line value is parsed as a JSON string. If that fails, it is treated as the path to a JSON file and the file is parsed instead. If both attempts fail, parsing fails.
  - Enum: When using an enumeration class, the command line defines a parameter with choices, and the input is limited to a fixed list of values. The command line passes in a string, which is mapped to the corresponding enumeration member before the dataclass instance is initialized.
  - Literal: Requires python >= 3.8; equivalent to Enum, but does not produce an enumeration.
- Complex Types
  - File type: To make it easy to read input from the command line or to set an output device, files can be opened directly through this mechanism, which saves calling open in the code.
  - Nested or unknown types: To cover more complex scenarios, custom conversion of command-line values is supported through Field and its type parameter.
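The naming rule for bool switches described in the list above can be imitated with plain argparse. The sketch below does not use parser-binding; the helper add_bool_switch and the property names are made up for illustration.

```python
import argparse


def add_bool_switch(parser: argparse.ArgumentParser, name: str, default: bool) -> None:
    """Register a value-less switch whose name depends on the default.

    Hypothetical helper; it only imitates the naming rule described above.
    """
    prefix = 'without' if default else 'with'
    dashed = f'--{prefix}-' + name.replace('_', '-')
    underscored = f'--{prefix}_' + name
    parser.add_argument(
        dashed,
        underscored,
        dest=name,
        # Passing the switch flips the default value.
        action='store_false' if default else 'store_true',
        default=default,
    )


parser = argparse.ArgumentParser()
add_bool_switch(parser, 'verbose', default=False)
add_bool_switch(parser, 'use_cache', default=True)

print(parser.parse_args([]))
print(parser.parse_args(['--with-verbose', '--without-use-cache']))
```

Because the switch name is derived from the default, changing a default from False to True in such a scheme renames the option from --with-... to --without-..., which is the pitfall the note on bool warns about.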
Usage Examples
Basic Type Scenario
from dataclasses import dataclass

from parser_binding import BindingParser


@dataclass
class TestOption():
    required_float: float
    a: int = 0
    string_type: str = None
    switch: bool = False
    bytes_data: bytes = None


parser = BindingParser(TestOption)
options = parser.parse_into_dataclasses((TestOption, ))
print(options)
python3 basic-demo.py --required-float 0.1 -a 10 --string-type 'Hello Parser!' --with-switch --bytes-data 'Hello Parser!'
Result:
_MergedDataClass@1704880154551(required_float=0.1, a=10, string_type='Hello Parser!', switch=True, bytes_data=b'Hello Parser!')
Note that --required-float is a required parameter; parsing fails without it.
parser-binding treats every parameter without a default value as required and every parameter with a default value as optional.
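The required/optional rule can be sketched with the standard library alone: a dataclass field whose default is missing becomes a required option. The helper build_parser below is hypothetical and handles only plain defaults; it is not how parser-binding is implemented.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class TestOption:
    required_float: float
    a: int = 0


def build_parser(cls) -> argparse.ArgumentParser:
    """Hypothetical helper: create one option per dataclass field."""
    parser = argparse.ArgumentParser()
    for item in fields(cls):
        flag = '--' + item.name.replace('_', '-')
        if item.default is MISSING:
            # No default value: the option must be given.
            parser.add_argument(flag, dest=item.name, type=item.type, required=True)
        else:
            parser.add_argument(flag, dest=item.name, type=item.type, default=item.default)
    return parser


args = build_parser(TestOption).parse_args(['--required-float', '0.1'])
options = TestOption(**vars(args))
print(options)
```

Calling parse_args([]) on this parser exits with an error, because --required-float has no default to fall back on.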
typing Annotation Types
from dataclasses import dataclass
from typing import List

from parser_binding import BindingParser


@dataclass
class TestOption:
    data: List[int] = None
    items: list = None


parser = BindingParser(TestOption)
opt = parser.parse_into_dataclasses((TestOption, ))
print(opt)
Run python list-demo.py --data 1 2 3 --items 1 2 3 to get the result _MergedDataClass@1704880514613(data=[1, 2, 3], items=['1', '2', '3']).
As the difference shows, with the List[int] annotation each element of the list is converted to an integer; with the plain list annotation no conversion is performed.
If conversion is required, specify the type, for example by changing the items annotation to:
items: list = Field(default=None, type=int)
Run python list-demo.py --data 1 2 3 --items 1 2 3 again, and the parsed result is _MergedDataClass@1704881428343(data=[1, 2, 3], items=[1, 2, 3]), where the elements are now converted correctly.
The same applies to set, tuple, deque, queue, and other collection types.
Note: If the property is annotated with list, tuple, set, queue, etc., the corresponding command-line parameter automatically becomes a multi-value parameter separated by spaces. If you then specify type using Field, type must be the class of the elements, not the collection type of the property. For example, in Field(default=None, type=int) above, type specifies the element type int, so the property is effectively List[int].
Command-line multi-value parameters are separated by spaces by default. Other separators are supported by specifying sep through Field.
For example, items: tuple = Field(sep=',', default=None, type=int) produces a tuple of int values, and the input values are separated by a comma. In this case, run python list-demo.py --data 1 2 3 --items 1,2,3, and the parsing result is _MergedDataClass@1704893904876(data=(1, 2, 3), items=(1, 2, 3)).
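How a custom separator might work can be sketched with plain argparse: the whole value arrives as one string, is split on the separator, and each piece is converted to the element type. The helper separated is hypothetical and not part of parser-binding.

```python
import argparse
from typing import Callable, Tuple


def separated(element_type: Callable[[str], object], sep: str = ',') -> Callable[[str], Tuple]:
    """Build an argparse converter that splits one argument on `sep`."""

    def convert(raw: str) -> Tuple:
        return tuple(element_type(piece) for piece in raw.split(sep))

    return convert


parser = argparse.ArgumentParser()
# Space-separated multi-value option: argparse collects the values itself.
parser.add_argument('--data', type=int, nargs='*', default=None)
# Comma-separated option: a single argument that the converter splits.
parser.add_argument('--items', type=separated(int, sep=','), default=None)

args = parser.parse_args(['--data', '1', '2', '3', '--items', '1,2,3'])
print(args.data, args.items)
```

In this sketch the element type is applied per piece, which matches the note above that type describes the elements rather than the collection.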
parser-binding also supports dictionary types for JSON-like input. When using Dict as the type annotation, specify the key and value types so that they can be converted correctly. If the JSON string contains keys or values of several types, annotate the property with plain dict instead, as shown in the following example:
from dataclasses import dataclass
from typing import Dict

from parser_binding import BindingParser, Field


@dataclass
class TestOption:
    data: Dict[str, int] = None
    items: dict = None


parser = BindingParser(TestOption)
opt = parser.parse_into_dataclasses((TestOption, ), )
print(opt)
Run python dict-demo.py --data '{"1": "1", "2": "2"}' --items '{"1": "1", "2": "2"}', and the parsing result is _MergedDataClass@1704944849798(data={'1': 1, '2': 2}, items={'1': '1', '2': '2'}).
As the result shows, with the plain dict annotation the key and value types stay exactly as they are in the original JSON, whereas with Dict and explicit types the keys and values are converted to the specified types.
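The fallback order described for dictionary parameters (JSON string first, then a path to a JSON file) and the key/value conversion can be sketched as follows. The helper name load_mapping and its signature are assumptions for illustration only.

```python
import json
from typing import Callable, Optional


def load_mapping(
    raw: str,
    key_type: Optional[Callable] = None,
    value_type: Optional[Callable] = None,
) -> dict:
    """Parse `raw` as a JSON string, falling back to a JSON file path."""
    try:
        loaded = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON text, so treat the argument as a file path.
        with open(raw, encoding='utf-8') as handle:
            loaded = json.load(handle)
    if not isinstance(loaded, dict):
        raise ValueError(f'expected a JSON object, got {type(loaded).__name__}')
    # Without explicit types, keys and values stay as JSON produced them.
    convert_key = key_type or (lambda key: key)
    convert_value = value_type or (lambda value: value)
    return {convert_key(k): convert_value(v) for k, v in loaded.items()}


text = '{"1": "1", "2": "2"}'
print(load_mapping(text, key_type=str, value_type=int))
print(load_mapping(text))
```

The first call corresponds to the Dict[str, int] annotation and the second to plain dict.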
Enum Types
An Enum type makes the command line define a parameter with choices, so the input is restricted to a fixed list of values.
The example below shows this:
from dataclasses import dataclass
from enum import Enum

from parser_binding import parse_args


class Mode(Enum):
    train: str = 'train'
    eval: str = 'eval'


@dataclass
class TestOption:
    mode: Mode = Mode.train


args = parse_args((TestOption, ))
print(args.mode)
The usage line now reads usage: enum-demo.py [-h] [--mode {train,eval}], indicating that only the values train and eval are accepted.
Enum classes have the same effect as Literal annotations.
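For comparison, the same restriction can be written by hand with argparse: the enum values become the choices, and the chosen string is mapped back to an enum member after parsing. This sketch does not use parser-binding and only imitates the behavior described above.

```python
import argparse
from enum import Enum


class Mode(Enum):
    train = 'train'
    eval = 'eval'


parser = argparse.ArgumentParser(prog='enum-demo.py')
parser.add_argument(
    '--mode',
    choices=[member.value for member in Mode],
    default=Mode.train.value,
)

args = parser.parse_args(['--mode', 'eval'])
# Look the member up by value to turn the string back into an enum.
mode = Mode(args.mode)
print(mode)
```

Any value outside the choices makes argparse exit with a usage error before the enum lookup is reached.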
Complex Type
To support more scenarios, such as nested collections, multi-type inference, and file reading and writing, parser-binding relies on Field combined with type.
Optional or Union[x, None] can be used to declare optional properties, but multi-type inference such as Union[str, int] and nested types such as List[List[int]] are not yet supported.
These types are considered complex types and require additional type configuration!
By default, a complex type without a type configuration is not processed at all, so the value read is a str. This may lead to unexpected results, so use caution.
Recommended: Provide the help argument for complex types so that command-line users understand how to pass values.
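As an example of such a type configuration, a property meant to hold either an int or a str could use a converter that tries int first and falls back to the raw string. The function int_or_str is hypothetical; going by the description above, a callable like this would be supplied through Field and its type parameter.

```python
from typing import Union


def int_or_str(raw: str) -> Union[int, str]:
    """Return an int when the text is an integer, otherwise the text itself."""
    try:
        return int(raw)
    except ValueError:
        return raw


print(int_or_str('42'), int_or_str('fast'))
```

The order of the attempts matters: trying str first would always succeed and never produce an int.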
Nested Types
Taking List[List[int]] as an example of a complex type, consider the following code:
from dataclasses import dataclass
from typing import List

from parser_binding import parse_args


@dataclass
class TestOption:
    data: List[List[int]] = None


args = parse_args((TestOption, ))
print(args)
Here the data property has no type argument and is treated as a complex type, which produces the following warning:
UserWarning: The filed "data" is complex but there is no type specified, this could make an error.
Passing a value directly from the command line at this point gives _MergedDataClass@1704955154188(data='1,2,3'), which shows that data does not match the expected type List[List[int]].
Therefore, a reasonable practice should be:
from dataclasses import dataclass
from typing import List

from parser_binding import Field, parse_args


@dataclass
class TestOption:
    # Commas separate the inner lists; colons separate their elements.
    data: List[List[int]] = Field(
        default=None,
        type=lambda raw:
        list(map(lambda group: list(map(int, group.split(':'))), raw.split(',')))
    )


args = parse_args((TestOption, ))
print(args)
Running python complex-demo.py --data 1:4:5,2,3 gives _MergedDataClass@1704955285471(data=[[1, 4, 5], [2], [3]]), which matches the type annotation.
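The converter can also be written as a named function, which is easier to test on its own and to describe in a help string. The name parse_nested_ints is hypothetical; the parsing logic is the same comma-then-colon split used in the example.

```python
from typing import List


def parse_nested_ints(raw: str) -> List[List[int]]:
    """Parse '1:4:5,2,3' into [[1, 4, 5], [2], [3]].

    Commas separate the inner lists and colons separate their elements.
    """
    return [[int(item) for item in group.split(':')] for group in raw.split(',')]


print(parse_nested_ints('1:4:5,2,3'))
```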
File Type
To make it easy to pass files on the command line, parser-binding supports simple text reading and writing, so that open file objects can be obtained directly from the dataclass without calling open. Here is an example:
from dataclasses import dataclass
from typing import IO, Iterable, TextIO

from parser_binding import Field, parse_args


@dataclass
class TestOption:
    in_file: Iterable[str] = Field(
        default=None, file=True, file_mode='r', file_encoding='utf-8'
    )
    # in_file: IO[str] = Field(default=None, file='r', file_encoding='utf-8')
    out_file: IO[str] = Field(
        default='-', file_mode='w', file_encoding='utf-8'
    )


args = parse_args((TestOption, ))
for l in args.in_file:
    print('o', l.strip(), sep='\t', file=args.out_file)
For file handling, there are two annotation methods, as the in_file property in the example above shows. The difference lies in whether the file argument has to be set to True: with the IO annotation it can be omitted, while with any other annotation (such as Iterable[str]) it must be set to True, or else the property is treated as a complex type.
Suppose test-file.txt contains:
a
b
c
Running python file-demo.py --in-file test-file.txt then produces the output:
o a
o b
o c
For file types, a default value of '-' is automatically mapped to stdin or stdout, depending on whether the file is opened for reading or writing.
File types support plain text files. If the file name ends with .gz and the annotation is a text (str) generic such as IO[str] or TextIO, the file is opened with gzip.open() in rt mode.
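The two rules above ('-' maps to a standard stream, .gz goes through gzip in text mode) can be sketched with the standard library. The helper open_text is hypothetical and only approximates the described behavior; extending the gzip rule to write mode is an assumption of this sketch.

```python
import gzip
import os
import sys
import tempfile
from typing import IO


def open_text(path: str, mode: str = 'r', encoding: str = 'utf-8') -> IO[str]:
    """Open a text file, mapping '-' to stdin/stdout and .gz to gzip."""
    if path == '-':
        # Reading uses stdin; any writing mode uses stdout.
        return sys.stdin if 'r' in mode else sys.stdout
    if path.endswith('.gz'):
        # gzip needs an explicit text flag, e.g. 'rt' or 'wt'.
        return gzip.open(path, mode + 't', encoding=encoding)
    return open(path, mode, encoding=encoding)


# Round trip through a temporary gzip file.
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, 'sample.txt.gz')
    with open_text(target, 'w') as handle:
        handle.write('a\nb\nc\n')
    with open_text(target) as handle:
        lines = [line.strip() for line in handle]

print(lines)
```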