python-mid

MID module


License
MIT
Install
pip install python-mid==1.0.0

Documentation

Efficient generation of lexicographically sortable IDs that embed their creation time.

Structure

  • 4 bytes for the timestamp
  • 8 bytes of randomness
  • encoded as urlsafe base64
  • 16 characters
  • Example: ZEDpWWfGaXNR_0rs
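
The layout above can be reproduced in a few lines of pure Python. This is a minimal sketch, not the package's C implementation: the helper names make_mid_sketch and timestamp_of are hypothetical, and the sketch assumes an unsigned big-endian timestamp and the standard URL-safe alphabet from Python's base64 module.

```python
import base64
import os
import struct
import time


def make_mid_sketch(now=None):
    """Build a 16-character ID: 4-byte timestamp + 8 random bytes (hypothetical helper)."""
    seconds = int(time.time()) if now is None else int(now)
    # ">I" packs an unsigned 32-bit integer in big-endian order.
    raw = struct.pack(">I", seconds) + os.urandom(8)
    # 12 bytes encode to exactly 16 base64 characters, with no "=" padding.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def timestamp_of(mid_str):
    """Recover the embedded timestamp from an ID made by make_mid_sketch."""
    raw = base64.urlsafe_b64decode(mid_str)
    return struct.unpack(">I", raw[:4])[0]
```

Because 12 bytes is a multiple of 3, the encoded form needs no padding, which is why the ID is always 16 characters long.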

Usage

>>> import mid
>>> mid.generate_mid()
'ZEDpWWfGaXNR_0rs'
>>> mid.get_date_time("ZEDpWWfGaXNR_0rs")
datetime.datetime(2017, 12, 28, 20, 22, 24)
>>> mid.get_time("ZEDpWWfGaXNR_0rs")
1514474944

Performance

Bench #1: Generate 1,000,000 IDs

                 UUID1       UUID4       ULID        ObjectID    MID
Execution Time   2.96s       3.55s       5.88s       5.43s       0.28s
Executions/s     337612.24   281647.68   170118.74   184101.31   3556948.61
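
The harness behind these numbers is not shown here. A comparable measurement could be sketched with the standard timeit module, as below. The function names bench and run_benchmarks are hypothetical, and only the standard-library UUID generators are timed, so the results will differ from the table depending on machine and Python version.

```python
import timeit
import uuid


def bench(func, n=1_000_000):
    """Return (elapsed seconds, calls per second) for n calls of func."""
    elapsed = timeit.timeit(func, number=n)
    return elapsed, n / elapsed


def run_benchmarks(n=1_000_000):
    # Only standard-library generators are timed here. To include MID,
    # add an entry such as "MID": mid.generate_mid after `import mid`.
    candidates = {"UUID1": uuid.uuid1, "UUID4": uuid.uuid4}
    return {name: bench(func, n) for name, func in candidates.items()}
```

Calling run_benchmarks() returns a dict that maps each generator name to its execution time and executions per second, matching the two rows of the table.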

FAQ

Are the resulting IDs always lexicographically sortable?

Yes, the MIDs generated by the generate_mid function in the C extension are designed to be lexicographically sortable by creation time.

The first four bytes of each MID hold the current time in seconds since the epoch. These bytes are stored in big-endian order, so a MID generated at a later time has a numerically larger value in its first four bytes, and the raw 12 bytes compare in chronological order. The encoded string keeps that order only if the encoding alphabet is itself in ascending ASCII order. The standard URL-safe base64 alphabet from RFC 4648 is not: the digits, '-' and '_' carry the highest values yet sort before the lowercase letters in ASCII. The guarantee at the string level therefore depends on the alphabet the extension uses.

The remaining eight bytes of each MID are filled with random data. They do not affect the ordering of MIDs created in different seconds, but MIDs created within the same second sort in arbitrary order relative to one another.

Therefore, if you generate multiple MIDs using the generate_mid function and sort them lexicographically, they will be sorted in chronological order of creation, to a resolution of one second.
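
The ordering argument can be checked directly. The sketch below is an illustration under stated assumptions, not the extension's code: pack_be, encode_std and sorts_chronologically are hypothetical helpers, the random bytes are replaced by zeros so that only the timestamp matters, and encode_std assumes Python's standard URL-safe base64 alphabet, which may differ from the one the extension uses.

```python
import base64
import struct


def pack_be(seconds):
    """Pack a timestamp as 4 big-endian bytes."""
    return struct.pack(">I", seconds)


def encode_std(seconds):
    """4-byte timestamp + 8 zero bytes, standard URL-safe alphabet (an assumption)."""
    return base64.urlsafe_b64encode(pack_be(seconds) + bytes(8)).decode("ascii")


def sorts_chronologically(timestamps, key):
    """True if sorting by `key` gives the same order as sorting by numeric value."""
    return sorted(timestamps, key=key) == sorted(timestamps)


samples = [255, 256, 1_500_000_000, 0x64403300, 0x64403400]

# Big-endian bytes compare in the same order as the integers they encode.
raw_ok = sorts_chronologically(samples, key=pack_be)  # True

# Little-endian bytes do not: 256 packs to 00 01 00 00, which sorts before ff 00 00 00.
little_ok = sorts_chronologically(samples, key=lambda s: struct.pack("<I", s))  # False

# With the standard alphabet, 0x64403300 encodes to "ZEAz..." and 0x64403400
# to "ZEA0...", and "z" sorts after "0" in ASCII, so the text order flips.
text_ok = sorts_chronologically(samples, key=encode_std)  # False
```

An alphabet whose characters are in ascending ASCII order would make the text-level check pass as well.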

What about January 19, 2038?

The generate_mid function stores the current time in seconds since the epoch as a 4-byte integer in big-endian order. Treated as an unsigned value, the maximum these 4 bytes can represent is 2^32 - 1, or 4294967295, so the signed 32-bit limit of 2^31 - 1 seconds (03:14:07 UTC on January 19, 2038) does not apply.

Since the epoch is defined as January 1, 1970, the latest time that can be represented by these 4 bytes is 4294967295 seconds after the epoch, or 06:28:15 UTC on February 7, 2106.

After this time, the number of seconds since the epoch will no longer fit into 4 bytes and will overflow. The time value stored in the MIDs will wrap around and start counting from zero again, so MIDs generated after the wrap will sort before older ones.
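
The dates involved can be verified with a short calculation. This is a sketch: as_utc and stored_value are hypothetical helpers, and the masking in stored_value assumes that the extension simply truncates the timestamp to 4 bytes.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SIGNED_MAX = 2**31 - 1    # 2147483647, the limit behind the year 2038 problem
UNSIGNED_MAX = 2**32 - 1  # 4294967295, the largest 4-byte unsigned value


def as_utc(seconds):
    """Convert seconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def stored_value(seconds):
    """Value left after truncating to 4 bytes (assumed wrap-around behaviour)."""
    return seconds & 0xFFFFFFFF


print(as_utc(SIGNED_MAX))              # 2038-01-19 03:14:07+00:00
print(as_utc(UNSIGNED_MAX))            # 2106-02-07 06:28:15+00:00
print(stored_value(UNSIGNED_MAX + 1))  # 0
```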