Reliquery
Science's Artifact Antiformat
An anti-format storage tool built to support scientists. It lets them store data how they want and where they want, and it simplifies the storage of research materials so they are easy to find and easy to share.
Table of Contents
- Production
- Development
- Example
- HTML
- Images
- JSON
- Pandas DataFrame
- Files
- Jupyter Notebooks
- Query Relics
- Config
- File Storage
- S3 Storage
- License
For production
latest version 0.2.6
pip install reliquery
For development
Local Install
cd reliquery
pip install -e .
Quick Example Usage
from reliquery import Relic
import numpy as np
from IPython.display import HTML, Image
r = Relic(name="quick", relic_type="tutorial")
ones_array = np.ones((10, 10))
r.add_array("ones", ones_array)
np.testing.assert_array_equal(r.get_array("ones"), ones_array)
r.add_text("long_form", "Some long form text. This is something we can do NLP on later")
r.add_tag({"pass": "yes"})
r.add_json("json", {"One":1, "Two": 2, "Three": 3})
print(r.describe())
HTML supported
Add HTML as a string:
# Example
r.add_html_string("welcome", "<div><p>Hello, World</p></div>")
Add HTML from a file path:
# Example
r.add_html_from_path("figures", <path to html file>)
Get and display HTML using Reliquery:
# Read only S3 demo
r_demo = Relic(name="intro", relic_type="tutorial", storage_name="demo")
print(r_demo.list_html())
display(HTML(r_demo.get_html('nnmf2 resnet34.html')))
Images supported
Add images by passing images as bytes:
# Example
with open("image.png", "rb") as f:
r.add_image("image-0.png", f.read())
Get and display images:
print(r_demo.list_images())
display(Image(r_demo.get_image("reliquery").read()))
JSON supported
Add json by passing it in as a dictionary:
# Example
r.add_json("json", {"First": 1, "Second": 2, "Third":3})
List json
r.list_json()
Get JSON by name; the stored data is returned as a dictionary
r.get_json("json")
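Because the data is stored as JSON, a dictionary may not come back exactly as it went in. The sketch below uses only the standard library `json` module, not Reliquery itself, and assumes the stored form is standard JSON text. The dictionary contents are hypothetical, chosen to show two common changes: non-string keys become strings and tuples become lists.

```python
import json

# Hypothetical metadata; the int key and the tuple are chosen to show what JSON changes.
original = {"name": "run", 1: "int key", "shape": (10, 10)}

# Round trip through standard JSON text (an assumption about how the data is stored).
restored = json.loads(json.dumps(original))

print(restored)  # {'name': 'run', '1': 'int key', 'shape': [10, 10]}
```

Sticking to string keys and lists in dictionaries passed to add_json avoids this kind of surprise.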
Pandas DataFrame
Note that DataFrames are serialized as JSON, which comes with caveats that are described here: https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.to_json.html
# Example
import pandas as pd
d = {
"one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
"two": pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(d)
r.add_pandasdf("pandasdf", df)
List pandasdf
r.list_pandasdf()
Get a pandas DataFrame by name
r.get_pandasdf("pandasdf")
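To get a feel for the JSON caveats mentioned above, the following sketch round-trips the same DataFrame through pandas directly. It assumes the serialization behaves like `DataFrame.to_json` followed by `pandas.read_json` with default arguments, which may not match Reliquery's internals exactly. Column dtypes are inferred again when the JSON is parsed, so they may differ from the original.

```python
import io
import pandas as pd

d = {
    "one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
    "two": pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(d)

# Serialize to JSON text and parse it back, using pandas defaults on both sides.
json_text = df.to_json()
restored = pd.read_json(io.StringIO(json_text))

# Compare dtypes before and after; the parser infers them again from the JSON text.
print(df.dtypes)
print(restored.dtypes)

# The missing value at row "d" of column "one" is written as null and read back as NaN.
print(restored.loc["d", "one"])
```

If exact dtypes matter, checking them after get_pandasdf and casting explicitly is a reasonable precaution.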
Files
#Example
r.add_files_from_path("TestFileName", test_file_path)
List files
r.list_files()
Get file
r.get_file("TestFileName")
Save file
r.save_files_to_path("TestFile", path_to_save)
Jupyter Notebooks
# Example
import os
test_notebook = os.path.join(os.path.dirname(__file__), "notebook_test.ipynb")
r.add_notebook_from_path("TestNotebook", test_notebook)
List Notebooks
notebook_list = r.list_notebooks()
Get Notebook
r.get_notebook("TestNotebook")
Save Notebook to path
path_to_save = os.path.join(tmp_path, "testnotebook.ipynb")
r.save_notebook_to_path("TestNotebook", path_to_save)
View Notebook via HTML
r.get_notebook_html("TestNotebook")
Query Relics
from reliquery import Reliquery
rel = Reliquery()
relics = rel.get_relics_by_tag("pass", "yes")
relics[0].describe()
Config
The configuration is a JSON text file named config, located in ~/reliquery.
The default looks like:
{
"default": {
"storage": {
"type": "File",
"args": {
"root": "/home/user/reliquery"
}
}
},
"demo": {
"storage": {
"type": "S3",
"args": {
"s3_signed": false,
"s3_bucket": "reliquery",
"prefix": "relics"
}
}
}
}
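The structure above can be inspected with the standard library. This is a minimal sketch, not Reliquery's own config loader: the config text is inlined so the example is self-contained, and `storage_args` is a hypothetical helper. The top-level keys appear to be the names passed as storage_name when creating a Relic, as with "demo" in the HTML example.

```python
import json

# The default config from above, inlined as text so the sketch is self-contained.
CONFIG_TEXT = """
{
  "default": {
    "storage": {"type": "File", "args": {"root": "/home/user/reliquery"}}
  },
  "demo": {
    "storage": {
      "type": "S3",
      "args": {"s3_signed": false, "s3_bucket": "reliquery", "prefix": "relics"}
    }
  }
}
"""

config = json.loads(CONFIG_TEXT)

# Map each storage name to its backend type.
storage_types = {name: entry["storage"]["type"] for name, entry in config.items()}
print(storage_types)  # {'default': 'File', 'demo': 'S3'}


def storage_args(config, storage_name="default"):
    """Return the args for one named storage entry (hypothetical helper)."""
    return config[storage_name]["storage"]["args"]


print(storage_args(config, "demo")["s3_bucket"])  # reliquery
```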
File Storage
With this configuration, the relic will be persisted to:
/home/user/reliquery/relic_type/relic_name/data_type/data_name
In the quick example (relic_type "tutorial", name "quick"), following this pattern, that will be:
/home/user/reliquery/tutorial/quick/arrays/ones
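The root/relic_type/relic_name/data_type/data_name layout can be sketched as a small path builder. `relic_data_path` is a hypothetical helper written for illustration, not part of Reliquery, and the relic type, relic name, and data name below are made up.

```python
import posixpath


def relic_data_path(root, relic_type, relic_name, data_type, data_name):
    """Join the pieces in the order root/relic_type/relic_name/data_type/data_name."""
    return posixpath.join(root, relic_type, relic_name, data_type, data_name)


# Hypothetical relic: type "experiment", name "run1", storing an array called "weights".
path = relic_data_path("/home/user/reliquery", "experiment", "run1", "arrays", "weights")
print(path)  # /home/user/reliquery/experiment/run1/arrays/weights
```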
S3 Storage
s3_signed
- true = uses current aws_cli configuration
- false = uses the anonymous IAM role
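As an illustration, a signed entry could mirror the "demo" entry from the default config with s3_signed switched on. The sketch below builds such an entry as a Python dictionary and prints it as JSON for pasting into the config file. The storage name "private" and the bucket "my-lab-bucket" are hypothetical placeholders; only the key names come from the default config.

```python
import json

# Hypothetical signed entry; the name "private" and the bucket "my-lab-bucket" are placeholders.
private_storage = {
    "private": {
        "storage": {
            "type": "S3",
            "args": {
                "s3_signed": True,  # use the current AWS CLI configuration
                "s3_bucket": "my-lab-bucket",
                "prefix": "relics",
            },
        }
    }
}

# Python's True is written as JSON true, matching the config file syntax.
print(json.dumps(private_storage, indent=2))
```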
License
Reliquery is free and open source! All code in this repository is dual-licensed under either:
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
at your option. This means you can select the license you prefer.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.