Customizable loader for castor
Create, transform, compute new fields and more ... see castor-load
You can create new fields in each document by tweaking the ad hoc JSON configuration file (the one located beside the data directory you give to castor, or the one you pass as a parameter).
documentFields
All the settings concerning custom fields go under the documentFields key of that configuration file.
If you want to create a fields.title key in your documents, add:
"documentFields": {
"title" : {
"path" : "content.json.Ti"
}
}
The path key points to a location inside the original document, using dot notation.
Many other options (keys like path) are available; see Options.
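To make the dot notation concrete, here is a minimal JavaScript sketch of how a path such as content.json.Ti could be resolved against a document. The helper name resolvePath and the sample document are hypothetical; this only illustrates the lookup and is not the loader's actual code.

```javascript
// Hypothetical helper: walk a document along a dotted path.
// Yields undefined as soon as one segment is missing.
function resolvePath(doc, path) {
  return path.split('.').reduce(
    (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
    doc
  );
}

// Sample document (made up for this illustration).
const doc = { content: { json: { Ti: 'A sample title' } } };

const title = resolvePath(doc, 'content.json.Ti');     // 'A sample title'
const missing = resolvePath(doc, 'content.json.Nope'); // undefined
console.log(title, missing);
```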
Note: the generated fields are truncated at 250 characters (if they are of string type).
Multivalued fields
Your fields may be multivalued, for example if you load csv files.
For example, a Keywords column may contain values such as:
Dashboard; Nodejs; Github
Web; Dashboard; Statistics
The direct way is to point to content.json.Keywords, but that will treat the Dashboard from the first row as different from the one in the second row.
Moreover, each keyword will stay bound to the other keywords on the same row.
The solution is to add a custom field in the JSON configuration file:
"documentFields" : {
"Keywords" : {
"path" : "content.json.Keywords",
"separator" : ";"
}
},
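The effect of the separator can be sketched in JavaScript as follows. The helper splitValues is hypothetical, and trimming the whitespace around each piece is an assumption made for this illustration; the loader's real behavior may differ.

```javascript
// Hypothetical helper: split a raw cell into individual values.
// Trimming each piece is an assumption of this sketch.
function splitValues(raw, separator) {
  return raw.split(separator).map((part) => part.trim());
}

// The two rows from the Keywords column above.
const rows = ['Dashboard; Nodejs; Github', 'Web; Dashboard; Statistics'];
const keywords = rows.map((row) => splitValues(row, ';'));

// Count how many rows mention each keyword.
const counts = {};
for (const list of keywords) {
  for (const keyword of list) {
    counts[keyword] = (counts[keyword] || 0) + 1;
  }
}

console.log(keywords);
console.log(counts.Dashboard); // 2: the keyword is now shared by both rows
```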
Options
TODO
path
Dotted notation to a field within the original document.
Required (unless compute or select is used)
Ex:
"documentFields" : {
"doi" : {
"path" : "content.json.doi"
}
}
select
CSS-like selectors. See JSONSelect
Required (unless compute or path is used)
Ex:
"documentFields" : {
"doi" : {
"select" : ".item ~ .doi"
}
}
label
Label (in UTF-8, without any constraint): the name of the field to display in pages of the application. Optional
Ex:
"documentFields" : {
"doi" : {
"label" : "Document Object Identifier"
}
}
The label value can take several forms:
- array of objects:
[{ "lang" : "XX", "$t": "The label" }]
- object:
{ "en" : "Hello", "fr": "Bonjour" }
- string
coalesce
Boolean. When path is an array and coalesce is false, the computed field is an array (one value per path).
Default value: false
Such an array can consist mostly of null values, and one may want to obtain only the first defined value instead.
Ex:
{
"documentFields" : {
"year" : {
"path" : ["foo","content.json.Py"],
"coalesce": true
}
}
}
When coalesce is set to true, the value will be a string:
{
"content": {
"json": {
"Py" : "2015"
}
},
"fields": {
"year": "2015"
}
}
Otherwise it would be an array, like this one (this is the default behavior):
{
"content": {
"json": {
"Py" : "2015"
}
},
"fields": {
"year": [ null, "2015" ]
}
}
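The two outcomes above can be reproduced with a short JavaScript sketch. The helpers resolvePath and computeField are hypothetical names used only for this illustration, which assumes that a missing path yields null; it is not the loader's implementation.

```javascript
// Hypothetical helper: walk a document along a dotted path.
function resolvePath(doc, path) {
  return path.split('.').reduce(
    (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
    doc
  );
}

// Hypothetical helper: one value per path, or the first defined value
// when coalesce is true. Missing paths become null (an assumption).
function computeField(doc, paths, coalesce) {
  const values = paths.map((path) => {
    const value = resolvePath(doc, path);
    return value === undefined ? null : value;
  });
  if (!coalesce) {
    return values;
  }
  const found = values.find((value) => value !== null);
  return found === undefined ? null : found;
}

const doc = { content: { json: { Py: '2015' } } };
const paths = ['foo', 'content.json.Py'];

const asArray = computeField(doc, paths, false); // [ null, '2015' ]
const asValue = computeField(doc, paths, true);  // '2015'
console.log(asArray, asValue);
```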
default
Default value, used when the field has no value (for example when the path is not present in the document). Optional
Ex:
"documentFields" : {
"title" : {
"default" : "No title given"
}
}
glue
When the already computed field is an array and glue is set, the values of the array are joined into a single string, with glue inserted between consecutive values. Optional
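As a rough illustration of the joining described above, here is a JavaScript sketch. The helper applyGlue is a hypothetical name, and leaving non-array values unchanged is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical helper: join an array value with the glue string.
// Non-array values are returned unchanged (an assumption).
function applyGlue(value, glue) {
  if (Array.isArray(value) && typeof glue === 'string') {
    return value.join(glue);
  }
  return value;
}

const keywords = ['Dashboard', 'Nodejs', 'Github'];

const glued = applyGlue(keywords, ', ');     // 'Dashboard, Nodejs, Github'
const untouched = applyGlue('single', ', '); // 'single'
console.log(glued, untouched);
```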
transform
Apply any stringjs chain to the field's value. Optional
Ex:
"documentFields" : {
"slug" : {
"path" : "content.json.title",
"transform" : "slugify()"
}
}
textizer
Apply any stringjs chain to the field's value, then add the result to the fulltext index of the document. Optional
Ex:
"documentFields" : {
"title" : {
"path" : "content.json.title",
"textizer" : "toString()"
}
}
Ex:
"documentFields" : {
"Keywords" : {
"path" : "content.json.Kw",
"textizer" : "ensureRight(' ')"
}
}
Note: for fulltext search to work, you have to enable it in MongoDB.
To do so, add the following to your mongodb.conf (possibly located at /etc/mongodb.conf):
setParameter=textSearchEnabled=true
separator
Split the field's value into an array, using the separator. Optional
Ex:
"documentFields" : {
"Keywords" : {
"path" : "content.json.Keywords",
"separator" : ";"
}
},
type
Convert the custom field value to another type, so that it can be used as something other than a string by compute or by a filter. Accepted values: boolean, string, text, number, date.
Optional
Ex:
"documentFields" : {
"Year" : {
"path" : "content.json.Py",
"type" : "number"
}
},
pattern
Mask (or pattern) used to validate the variable. Optional
Accepted values depend on type:
- a regular expression for text and string
- a date format for date
compute
Compute a funex expression on the already generated documentFields.
You can access documentFields.Year simply by using Year.
Ex:
"documentFields" : {
"Authors" : {
"path" : "content.json.Af",
"separator" : ";"
},
"AuthorNb" : {
"compute" : "Authors.length"
}
},
visible
Set it to true to indicate that this custom field should appear whenever the theme needs to display custom fields. Optional (default value: false)
Ex:
"documentFields" : {
"Authors" : {
"path" : "content.json.Af",
"visible" : true
},
"authors" : {
"path" : "content.json.Af",
"separator" : ";"
}
},
required
Set it to true to indicate that this custom field must exist for the document to be loaded. Optional (default value: false)
Ex:
"documentFields" : {
"check" : {
"path" : "content.json.title",
"required" : true
}
},
noindex
Set it to true to indicate that this custom field should not be indexed (nor truncated) by MongoDB. Optional (default value: false)
Ex:
"documentFields" : {
"abstract" : {
"path" : "content.json.abstract",
"noindex" : true
}
},
mapping
Maps the value of a custom field to a static value. The static values can be declared either as a hash table or as an array. Optional
Ex with array:
"documentFields" : {
"Month" : {
"path" : "content.json.month",
"type": "number",
"mapping" : [
"janvier",
"février",
"mars",
"avril",
"mai",
"juin",
"juillet",
"août",
"septembre",
"octobre",
"novembre",
"décembre"
]
}
},
Ex with hash table:
"documentFields" : {
"Month" : {
"path" : "content.json.month",
"mapping" : {
"JAN" : "janvier",
"FEV" : "février",
"MAR" : "mars",
"AVR" : "avril",
"MAI" : "mai",
"JUN" : "juin",
"JUL" : "juillet",
"AOU" : "août",
"SEP" : "septembre",
"OCT" : "octobre",
"NOV" : "novembre",
"DEC" : "décembre"
}
}
},
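Both forms of mapping amount to a lookup, which the following JavaScript sketch illustrates. The helper applyMapping is a hypothetical name. Two points are assumptions of this sketch and not documented behavior: array positions are read with a zero-based index (as plain JavaScript arrays are), and values without an entry are returned unchanged.

```javascript
// Hypothetical helper: look a value up in a mapping (array or hash table).
// Values with no entry are returned unchanged (an assumption).
function applyMapping(value, mapping) {
  if (Object.prototype.hasOwnProperty.call(mapping, value)) {
    return mapping[value];
  }
  return value;
}

// Shortened versions of the two mappings shown above.
const byIndex = ['janvier', 'février', 'mars'];
const byCode = { JAN: 'janvier', FEV: 'février', MAR: 'mars' };

const fromArray = applyMapping(2, byIndex);   // 'mars' (zero-based index, an assumption)
const fromHash = applyMapping('FEV', byCode); // 'février'
const unknown = applyMapping('XXX', byCode);  // 'XXX'
console.log(fromArray, fromHash, unknown);
```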