active_stash

This gem wraps stash.rb so that you can use CipherStash to add searchable encryption to your models


Keywords
activerecord, cipherstash, data-security, encryption, rails, ruby, searchable-encryption
License
MIT
Install
gem install active_stash -v 0.4.2

Documentation

ActiveStash

ActiveStash is the Rails-specific gem for using CipherStash.

ActiveStash gives you encrypted search on ActiveRecord models that use application-level encryption (via libraries like Lockbox and ActiveRecord Encryption).

When records are created or updated, they are indexed into a CipherStash collection which can be queried via an ActiveStash::Relation.

TL;DR - here's a video demo

Searchable Encrypted Rails models with ActiveStash!


What is CipherStash?

Field-level encryption is a powerful tool to protect sensitive data in your Active Record models. However, when a field is encrypted, it can't be queried! Simple lookups are impossible, let alone free-text search or range queries.

This is where CipherStash comes in. CipherStash is an encrypted search index, and ActiveStash lets you perform exact, free-text and range queries on your encrypted ActiveRecord models. Queries use ActiveStash::Relation, which wraps ActiveRecord::Relation, so most of the queries you can do in ActiveRecord can also be done using ActiveStash.

How does it work?

ActiveStash uses the "look-aside" pattern to create an external, fully encrypted index for your ActiveRecord models. Every time you create or update a record, the data is indexed to CipherStash via ActiveRecord callbacks. Queries are delegated to CipherStash but return ActiveRecord models so things just work.

If you've used Elasticsearch with gems like Searchkick, this pattern will be familiar to you.

Active Stash Lookaside Pattern
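
As a rough illustration of the look-aside pattern described above, here is a minimal in-memory sketch. ToyIndex and ToyStore are hypothetical stand-ins invented for this example (they are not part of ActiveStash), and nothing here is encrypted; the point is only the flow: a save also writes to a separate index keyed by stash_id, and a query asks the index for ids and then loads the records from the primary store.

```ruby
require "securerandom"

# Hypothetical stand-in for the external index (no real encryption here).
class ToyIndex
  def initialize
    @entries = {} # stash_id => indexed attributes
  end

  def upsert(stash_id, attrs)
    @entries[stash_id] = attrs
  end

  # Returns the stash_ids whose indexed attributes match every constraint.
  def lookup(constraints)
    @entries.select { |_, attrs| constraints.all? { |k, v| attrs[k] == v } }.keys
  end
end

# Hypothetical stand-in for a database table plus its save callback.
class ToyStore
  def initialize(index)
    @index = index
    @rows = {} # stash_id => record
  end

  # Plays the role of an after-save callback: write the row, then index it.
  def save(record)
    record[:stash_id] ||= SecureRandom.uuid
    @rows[record[:stash_id]] = record
    @index.upsert(record[:stash_id], record.slice(:name, :email))
    record
  end

  # The query goes to the index; the records come from the primary store.
  def query(constraints)
    @index.lookup(constraints).map { |id| @rows.fetch(id) }
  end
end

store = ToyStore.new(ToyIndex.new)
store.save(name: "Grace", email: "grace@example.com")
store.save(name: "Dan", email: "dan@example.com")
puts store.query(email: "grace@example.com").map { |r| r[:name] }.inspect
```

The stash_id plays the same linking role here as the stash_id column that ActiveStash asks you to add to your tables.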

Getting started

  1. Add ActiveStash to your Gemfile:
gem "active_stash"
  2. Install the new dependencies:
➜  bundle install
  3. Create a CipherStash account (which will provision a workspace for you) and then log in:
➜  rake active_stash:login[YOURWORKSPACEID]

Note: If you are using zsh, you may need to escape the brackets:

rake active_stash:login\['WorkspaceId'\]
  4. Any model you use with ActiveStash::Search needs a stash_id column to link search results back to database records.

For example, to add a stash_id column to the database table for the User model, generate and run the following migration:

$ rails g migration AddStashIdToUser stash_id:string:index
$ rails db:migrate
  5. Add the ActiveStash::Search mixin to your User model and declare which fields are searchable:
# app/models/user.rb
class User < ApplicationRecord
  include ActiveStash::Search

  # Previously added application-level encryption, by either ActiveRecord Encryption or Lockbox
  encrypts :name, :email
  encrypts :dob, type: :date

  # Fields that will be indexed into CipherStash
  stash_index :name, :email, :dob

  self.ignored_columns = %w[name email dob]
end
  6. Reindex your existing data into CipherStash with ActiveStash:
➜  rails active_stash:reindexall
  7. Query a user record:
$ rails c
 >> User.where(email: "grace@example.com").count
 => 0 # no records, because the database isn't searchable
 >> User.query(email: "grace@example.com").count
 => 1 # a record, because CipherStash makes your encrypted database searchable

Installation

Add this line to your application's Gemfile:

gem 'active_stash'

And then execute:

$ bundle install

To use, include ActiveStash::Search in a model and define which fields you want to make searchable:

class User < ActiveRecord::Base
  include ActiveStash::Search

  stash_index :name, :email, :dob

  # fields encrypted with EncryptedRecord
  encrypts :name
  encrypts :email
  encrypts :dob

  # ...the rest of your code
end

Any model that includes ActiveStash::Search needs a stash_id column of type string. For example, to add this column to the table underlying your User model:

$ rails g migration AddStashIdToUser stash_id:string:index
$ rails db:migrate

The above command also ensures that an index is created on stash_id.

Configuration

ActiveStash supports all CipherStash configuration described in the docs.

In addition to configuration via JSON files and environment variables, ActiveStash supports Rails config and credentials.

For example, to use a specific profile in development, you could include the following in config/environments/development.rb:

Rails.application.configure do
  config.active_stash.profile_name = "dev-profile"

  # Other config...
end

For secrets, you can add ActiveStash config to your credentials (rails credentials:edit --environment <env>):

active_stash:
  aws_secret_access_key: your_secret

You can also use an initializer (e.g. config/initializers/active_stash.rb):

ActiveStash.configure do |config|
  config.aws_secret_access_key = Rails.application.credentials.aws.secret_access_key
end

Index Types

CipherStash supports three main types of indexes: exact, range (which allows queries like < and >), and match (which supports free-text search).

ActiveStash will automatically determine what kinds of indexes to create based on the underlying data-type. These are as follows:

String and Text

:string and :text types automatically create the following indexes. Range indexes on strings typically only work for ordering.

Indexes Created   Allowed Operators    Example
match             =~                   User.query { |q| q.name =~ "foo" }
exact             ==                   User.query(email: "foo@example.com")
range             <, <=, ==, >=, >     User.query.order(:email)

Numeric Types

:timestamp, :date, :datetime, :float, :decimal, and :integer types all have range indexes created.

Indexes Created   Allowed Operators                Example
range             <, <=, ==, >=, >, between        User.query { |q| q.dob > 20.years.ago }
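
The defaults described above can be summarised as a small lookup table. This is only a sketch of the documented behaviour: DEFAULT_INDEXES and default_indexes_for are hypothetical names made up for this example and are not part of the ActiveStash API.

```ruby
# Hypothetical lookup table mirroring the defaults described above.
DEFAULT_INDEXES = {
  string: %i[match exact range],
  text: %i[match exact range],
  timestamp: %i[range],
  date: %i[range],
  datetime: %i[range],
  float: %i[range],
  decimal: %i[range],
  integer: %i[range]
}.freeze

# Returns the index types created by default for a column type.
def default_indexes_for(column_type)
  DEFAULT_INDEXES.fetch(column_type) do
    raise ArgumentError, "no default indexes known for #{column_type.inspect}"
  end
end

puts default_indexes_for(:string).inspect
puts default_indexes_for(:date).inspect
```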

Overriding Automatically Created Indexes

If you need finer grained control over what types of indexes are created for a field, you can pass the :except or :only options to stash_index (can be a symbol or array).

For example, to create only an :exact index for an integer field, you could do:

stash_index :my_integer, only: :exact

To exclude the :range from a string type (say if you don't need to order by string), you can do:

stash_index :my_string, except: :range
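
One way to picture how :only and :except could combine with the defaults is sketched below. The semantics are an assumption inferred from the two examples above (:only replaces the defaults, :except subtracts from them); resolve_indexes is a hypothetical helper and not ActiveStash's implementation.

```ruby
# Hypothetical helper. Assumed semantics: :only replaces the default index
# types outright, while :except removes types from the defaults.
# Both options accept a single symbol or an array of symbols.
def resolve_indexes(defaults, only: nil, except: nil)
  return Array(only) if only

  defaults - Array(except)
end

# Integer columns default to a range index; only: :exact swaps it for exact.
puts resolve_indexes(%i[range], only: :exact).inspect

# String columns default to match, exact and range; except: :range drops range.
puts resolve_indexes(%i[match exact range], except: :range).inspect
```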

Match All Indexes

ActiveStash can also create an index across multiple string fields so that you can perform free-text queries across all specified fields at once.

To do so, you can use the stash_match_all DSL method and specify the fields that you want to have indexed:

stash_match_all :first_name, :last_name, :email

Match all indexes are queryable by passing the query term directly to the query method. So to search for the term "ruby" across :first_name, :last_name and :email you would do:

User.query("ruby")
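
To make the "search across all specified fields at once" idea concrete, here is a plain-Ruby sketch over unencrypted hashes. The sample records, MATCH_ALL_FIELDS and match_all are hypothetical, and the case-insensitive substring test is an assumption chosen for simplicity; a real match index operates on encrypted terms and may tokenize text differently.

```ruby
# Hypothetical sample records; a real match index works over encrypted terms.
USERS = [
  { first_name: "Ruby", last_name: "Stone", email: "rs@example.com" },
  { first_name: "Dan", last_name: "Lee", email: "dan@rubyconf.example" },
  { first_name: "Ada", last_name: "Byron", email: "ada@example.com" }
].freeze

MATCH_ALL_FIELDS = %i[first_name last_name email].freeze

# Returns records where any listed field contains the term (case-insensitive).
def match_all(records, fields, term)
  needle = term.downcase
  records.select do |record|
    fields.any? { |field| record[field].to_s.downcase.include?(needle) }
  end
end

puts match_all(USERS, MATCH_ALL_FIELDS, "ruby").map { |u| u[:first_name] }.inspect
```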

For more information on index types and their options, see the CipherStash docs.

Create a CipherStash Collection

Before you can index your models, you need a CipherStash collection. ActiveStash will create indexes as defined in your models.

All you need to do is create the collection by running:

rails active_stash:collections:create

This command will create collections for all the models you have set up to use ActiveStash.

(Re)indexing

To index your encrypted data into CipherStash, use the reindex task:

$ rails active_stash:reindexall

If you want to just reindex one model, for example User, run:

$ rails active_stash:reindex[User]

You can also reindex in code:

User.reindex

Depending on how much data you have, reindexing may take a while, but you only need to do it once. ActiveStash will automatically index (and delete) data as records are created, updated and deleted.

Current limitations

Presently, ActiveStash provides no means to update the schema of a CipherStash collection. Therefore if you need to make any changes to the Collection schema itself (by using the stash_index or stash_match_all helpers) you must drop your collection and recreate it.

If your indexed model is called User for example, you should run the following commands:

$ rails active_stash:collections:drop[User]
$ rails active_stash:collections:create
$ rails active_stash:reindex[User]

Support for zero-downtime Collection schema changes and reindexing is being actively worked on and will be available soon.

When to Reindex Your Collection

These are the rules for when you must re-index your collection:

  1. You have imported, deleted or updated data in the table that backs your ActiveStash model via some external mechanism, OR
  2. You have added or removed a string/text column from the table that backs your ActiveStash model and you are using a dynamic_match index in your model

When to Drop, Recreate and Reindex Your Collection

This is the rule to determine when you must drop, recreate and reindex your collection:

  1. Whenever you add or modify one or more ActiveStash index definitions in your model

See Current Limitations for instructions on what commands to run to accomplish this.

NOTE: Technically, you do not need to reindex your collection if you only remove an index definition from your model. Removing the definition does not remove the index stored in CipherStash: the index can no longer be used in queries, but it still incurs CPU and network costs to keep it up to date.

Running Queries

To perform queries over your encrypted records, you can use the query method. For example, to find a user by email address:

User.query(email: "person@example.com")

This will return an ActiveStash::Relation which extends ActiveRecord::Relation so you can chain most methods as you normally would!

To constrain by multiple fields, include them in the hash:

User.query(email: "person@example.com", verified: true)

To order by dob, do:

User.query(email: "person@example.com").order(:dob)

Or to use limit and offset:

User.query(verified: true).limit(10).offset(20)

This means that ActiveStash should work with pagination libraries like Kaminari.
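
Pagination libraries ultimately translate a page number into limit and offset values like the ones shown above. A minimal sketch of that arithmetic follows; page_window is a hypothetical helper written for this example, not part of ActiveStash or Kaminari.

```ruby
# Hypothetical helper: converts a 1-based page number into limit/offset values.
def page_window(page, per_page)
  raise ArgumentError, "page must be >= 1" if page < 1
  raise ArgumentError, "per_page must be >= 1" if per_page < 1

  { limit: per_page, offset: (page - 1) * per_page }
end

# Page 3 with 10 records per page corresponds to limit(10).offset(20).
window = page_window(3, 10)
puts "limit=#{window[:limit]} offset=#{window[:offset]}"
```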

You also don't have to provide any constraints at all; you can use the encrypted indexes purely for ordering! To order all records by dob descending and then by created_at, do (note the call to query with no args first):

User.query.order(dob: :desc, created_at: :asc)

Advanced Queries

More advanced queries can be performed by passing a block to query. For example, to find all users born after 1 January 1998:

User.query { |q| q.dob > "1998-01-01".to_date }

Or, to perform a free-text search on name:

User.query { |q| q.name =~ "Dan" }

To combine multiple constraints, make multiple calls in the block:

User.query do |q|
  q.dob > "1998-01-01".to_date
  q.name =~ "Dan"
end
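
For readers curious how a block like this can turn operator calls into query constraints, here is a self-contained sketch of one possible approach. It is not ActiveStash's implementation: Constraint, FieldProxy, QueryBuilder and build_query are all hypothetical names, and the builder only records what was asked for rather than running a query.

```ruby
require "date"

# Hypothetical constraint record; this is not ActiveStash's implementation.
Constraint = Struct.new(:field, :operator, :value)

class FieldProxy
  def initialize(field, sink)
    @field = field
    @sink = sink
  end

  # Each comparison records a constraint instead of evaluating anything.
  %i[> >= < <= == =~].each do |operator|
    define_method(operator) do |value|
      @sink << Constraint.new(@field, operator, value)
      self
    end
  end
end

class QueryBuilder
  attr_reader :constraints

  def initialize(fields)
    @fields = fields
    @constraints = []
  end

  # Calls such as q.dob or q.name return a proxy for a known field.
  def method_missing(name, *args)
    return super unless @fields.include?(name) && args.empty?

    FieldProxy.new(name, @constraints)
  end

  def respond_to_missing?(name, include_private = false)
    @fields.include?(name) || super
  end
end

# Yields a builder to the block and returns the constraints it collected.
def build_query(fields)
  builder = QueryBuilder.new(fields)
  yield builder
  builder.constraints
end

constraints = build_query(%i[dob name]) do |q|
  q.dob > Date.new(1998, 1, 1)
  q.name =~ "Dan"
end
constraints.each { |c| puts "#{c.field} #{c.operator} #{c.value}" }
```

Because every call inside the block appends to the same list, multiple calls naturally combine into multiple constraints.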

Overriding the Collection Name

To set a different collection name, you can set one in your model:

class User < ActiveRecord::Base
  include ActiveStash::Search
  self.collection_name = "mycollection"
end

Setting a Default Scope

If you plan to use encrypted queries for all the data in your model, you can set a default scope:

class User < ActiveRecord::Base
  include ActiveStash::Search

  def self.default_scope
    ActiveStash::Relation.new(self)
  end
end

Now, all queries will use the CipherStash collection, even if you don't call query. For example, this will use encrypted indexes to order:

User.order(:dob)

# Without a default scope you'd need to call
User.query.order(:dob)

Managing Access Keys

Access keys are secret credentials that allow your application to authenticate to CipherStash when it is running in a non-interactive environment (such as production or CI). ActiveStash provides rake tasks to manage the access keys for your workspace.

To create a new access key:

rake active_stash:access_key:create[keyname]

To list all the access keys currently associated with your workspace:

rake active_stash:access_key:list

Finally, to delete an access key:

rake active_stash:access_key:delete[keyname]

Every access key must have a unique name, so you know what it is used for (and so you don't accidentally delete the wrong one). You can have as many access keys as you like.

Collection Management

Drop a Collection

You can drop a collection directly in Ruby:

User.collection.drop!

Or via the included Rake task. This command takes the name of the model that is attached to the collection.

rake active_stash:collections:drop[User]

List Stash Enabled Models

A rake task is provided to list all of the models in your application that have been configured to use CipherStash.

rake active_stash:collections:list

Create a Collection

You can also create a collection for a specific model in Ruby:

User.collection.create!

Or via a Rake task:

rake active_stash:collections:create[User]

Development

After checking out the repo, run bin/setup to install dependencies.

The test suite depends on a running PostgreSQL instance being available on localhost. You'll need to export a PGUSER env var before running the test suite.

Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/cipherstash/activestash. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the ActiveStash project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.