pyhdfs-client: Powerful HDFS Client for Python
Why is it fast and powerful?
The native HDFS client performs much better than WebHDFS clients. However, invoking the native client for each Hadoop operation incurs the overhead of starting a JVM. pyhdfs-client delivers the performance of the native HDFS client without starting a JVM on every command execution.
- HDFS client for Python
- Easy to integrate with Python applications
- Better performance than WebHDFS clients
- Provides native Hadoop client performance without per-command JVM startup overhead
- Supports both UNIX and Windows
What's new in 0.1.3?
- Multiple instances of the HDFS client can now be created.
- [fix] Temporary folder deletion
- [fix] Java process shutdown issues on UNIX
Installation
pip install pyhdfs-client
Requirements: Hadoop binaries and py4j must be installed.
Usage
>>> from pyhdfs_client.pyhdfs_client import HDFSClient
>>> hdfs_client = HDFSClient()
>>> ret, out, err = hdfs_client.run(['-ls', '/'])
>>> print(out)
Found 1 items
drwxr-xr-x - gp supergroup 0 2021-03-21 01:10 /f1
>>> hdfs_client.stop()  # terminate the HDFS client when finished
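Because one client can serve many commands, an application may want to reuse a single instance and make sure stop() is always called, even when a command fails. The following is a minimal sketch of that pattern, not part of the package: hdfs_session, run_checked and list_paths are hypothetical helper names, and treating a return code of 0 as success is an assumption based on the usual hadoop CLI convention. Only HDFSClient(), run() and stop() come from the usage shown above.

```python
from contextlib import contextmanager


@contextmanager
def hdfs_session(client_factory):
    """Create a client, yield it, and always call stop() afterwards."""
    client = client_factory()
    try:
        yield client
    finally:
        # Runs on normal exit and on exceptions alike.
        client.stop()


def run_checked(client, args):
    """Run one command and return its output; raise on a non-zero code.

    Assumption: a return code of 0 means success, as with the hadoop CLI.
    """
    ret, out, err = client.run(args)
    if ret != 0:
        raise RuntimeError(
            "hdfs %s failed (ret=%s): %s" % (" ".join(args), ret, err)
        )
    return out


def list_paths(paths):
    """Hypothetical usage: list several paths with a single client."""
    # Imported lazily so the helpers above work without the package installed.
    from pyhdfs_client.pyhdfs_client import HDFSClient

    with hdfs_session(HDFSClient) as client:
        return [run_checked(client, ["-ls", path]) for path in paths]
```

Calling list_paths(['/', '/f1']) would run both listings through the same client and stop it afterwards. Passing a factory rather than a ready-made instance keeps construction and cleanup in one place.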
Contribution
- Contributions of enhancements and bug fixes are welcome.
- This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.