HOS-Discovery
Introduction
This repo describes the use of the TYPO3 extension discovery. The module extends the typo3find extension of SUB Göttingen (subugoe) and implements the Schaufenster (showcase) for the HamburgOpenScience project "HOS-discovery".
Here are some screenshots:
Search with autocompleting
Heatmap with geolocations
Interactive DDC tree
Wordcloud of subjects
Installation
Solr
Solr requires Java. Solr 7.x, which is used below, needs Java 8 or later.
Java
The installation of Java differs by platform:
UBUNTU
On Ubuntu we can use the package manager APT (Advanced Packaging Tool) to do this. To install Java, run the following command in a shell:
sudo apt-get update
sudo apt-get -y install default-jre
CENTOS
On CentOS we can use the package manager yum (Yellowdog Updater, Modified) to install Java. CentOS has no default-jre package, so install OpenJDK directly:
sudo yum install -y java-1.8.0-openjdk
Testing of Java
Once Java is installed, you can verify it by running the following command:
java -version
Expected output:
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-b15)
OpenJDK 64-Bit Server VM (build 25.111-b15, mixed mode)
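If you want to check the version in a script rather than by eye, the output can be reduced to the major version number. The following is a minimal sketch; the function name java_major_version is our own and not part of any tool. It handles both the legacy "1.8.0_111" scheme and the newer "11.0.2" scheme.

```shell
#!/usr/bin/env bash
# Extract the major Java version from a version string such as
# "1.8.0_111" (Java 8) or "11.0.2" (Java 11).
java_major_version() {
  local version="$1"
  # The legacy scheme "1.x.y" encodes the major version in the second field.
  if [[ "$version" == 1.* ]]; then
    version="${version#1.}"
  fi
  # Keep everything before the first dot, underscore, or dash.
  echo "${version%%[._-]*}"
}

# Read the version from the local JVM, if one is installed.
# `java -version` prints to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
  installed="$(java -version 2>&1 | head -n 1 | cut -d '"' -f 2)"
  echo "Java major version: $(java_major_version "$installed")"
fi
```

For the output shown above, the function yields 8.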
Downloading and Installing Apache Solr
First you will need to download the latest version of Apache Solr from the Apache website. You can easily download it using the wget command:
wget http://apache.org/dist/lucene/solr/7.3.1/solr-7.3.1.tgz
Adjust the version number as needed. The available versions are listed at http://apache.org/dist/lucene/solr/
Once the download is completed, extract the service installation file with the following command:
tar xzf solr-7.3.1.tgz solr-7.3.1/bin/install_solr_service.sh --strip-components=2
Don't forget to adjust the version number.
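Because the version number appears in the download, in the extract command, and in the install step that follows, it can help to keep it in one variable. This is only a sketch: the variable names SOLR_VERSION, SOLR_TGZ and SOLR_URL are our own, and the script merely prints the commands so that you can review them first.

```shell
#!/usr/bin/env bash
# Keep the Solr version in one place (7.3.1 is the version used in this guide).
SOLR_VERSION="${SOLR_VERSION:-7.3.1}"
SOLR_TGZ="solr-${SOLR_VERSION}.tgz"
SOLR_URL="http://apache.org/dist/lucene/solr/${SOLR_VERSION}/${SOLR_TGZ}"

# Print the commands of this section with the version filled in.
# Remove the leading "echo" to actually run them.
echo wget "$SOLR_URL"
echo tar xzf "$SOLR_TGZ" "solr-${SOLR_VERSION}/bin/install_solr_service.sh" --strip-components=2
echo sudo bash ./install_solr_service.sh "$SOLR_TGZ"
```

To target another release, call the script with the variable set, for example SOLR_VERSION=7.4.0 (the version number here is just an example).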
Install Solr as a service by running the following command:
sudo bash ./install_solr_service.sh solr-7.3.1.tgz
Expected output:
We recommend installing the 'lsof' command for more stable start/stop of Solr
Extracting solr-7.3.1.tgz to /opt
Installing symlink /opt/solr -> /opt/solr-7.3.1 ...
Installing /etc/init.d/solr script ...
Installing /etc/default/solr.in.sh ...
Service solr installed.
Customize Solr startup configuration in /etc/default/solr.in.sh
NOTE: Please install lsof as this script needs it to determine if Solr is listening on port 8983.
Started Solr server on port 8983 (pid=6426). Happy searching!
Found 1 Solr nodes:
Solr process 6426 running on port 8983
{
"solr_home":"/var/solr/data",
"version":"7.3.1 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2018-11-02 19:52:42",
"startTime":"2016-11-30T06:49:18.927Z",
"uptime":"0 days, 0 hours, 0 minutes, 18 seconds",
"memory":"85.4 MB (%17.4) of 490.7 MB"}
We will fix this warning later.
You can start|stop|restart the Solr service with the following commands:
sudo service solr start
sudo service solr stop
sudo service solr restart
Avoiding crashes
In /etc/security/limits.conf you can add these lines:
solr soft nofile 500000
solr hard nofile 500000
solr soft nproc 65000
solr hard nproc 65000
These lines raise the limits for open files and processes of the solr user and suppress the corresponding warning when Solr starts.
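If the setup is scripted, the entries should be added only once, even when the script runs repeatedly. The sketch below shows one way to do that; the helper add_limit and the variable LIMITS_FILE are our own names. By default it works on a scratch file so that it can be tried without root rights.

```shell
#!/usr/bin/env bash
# Append a limits.conf entry only if the exact line is not present yet.
# $1 = target file, $2 = line to add.
add_limit() {
  local file="$1" line="$2"
  grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Demonstration on a scratch file; for the real system, run as root
# with LIMITS_FILE=/etc/security/limits.conf.
LIMITS_FILE="${LIMITS_FILE:-$(mktemp)}"
for entry in "solr soft nofile 500000" "solr hard nofile 500000" \
             "solr soft nproc 65000" "solr hard nproc 65000"; do
  add_limit "$LIMITS_FILE" "$entry"
done
cat "$LIMITS_FILE"
```

Running the loop a second time leaves the file unchanged, because grep -qxF matches each complete line literally.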
Creating the Solr index
sudo -u solr $(grep '^SOLR_INSTALL_DIR=' /etc/init.d/solr | sed 's/\"//g' | sed 's/SOLR_INSTALL_DIR=//')/bin/solr create -c HOS
Hint:
The command substitution $(…) greps the value of SOLR_INSTALL_DIR from the init.d script (in most cases /opt/solr).
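To see what the command substitution does, you can try the pipeline on its own. The sketch below wraps it in a function (solr_install_dir is our own name) and runs it on a small sample file. The two sample lines are an assumption about what the relevant part of the init script looks like, so check your own /etc/init.d/solr.

```shell
#!/usr/bin/env bash
# Extract the value of SOLR_INSTALL_DIR from an init script.
# $1 = path to the script (normally /etc/init.d/solr).
solr_install_dir() {
  grep '^SOLR_INSTALL_DIR=' "$1" | sed 's/"//g' | sed 's/SOLR_INSTALL_DIR=//'
}

# Demonstration on a sample file that mimics the relevant lines.
sample="$(mktemp)"
printf 'SOLR_ENV="/etc/default/solr.in.sh"\nSOLR_INSTALL_DIR="/opt/solr"\n' > "$sample"
solr_install_dir "$sample"
```

On a real installation you would call solr_install_dir /etc/init.d/solr and append /bin/solr to the result.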
You can test with:
curl http://localhost:8983/solr/admin/cores
This command shows the state of the server in JSON format.
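To check specifically for the HOS core, the JSON can be searched for the core name. The following sketch assumes that the CoreAdmin STATUS response contains a "name" entry for every existing core. The helper core_exists is our own, and the sample response is shortened and hand-written, not real server output.

```shell
#!/usr/bin/env bash
# Succeeds if the CoreAdmin STATUS response on stdin reports the given core.
# $1 = core name.
core_exists() {
  grep -Eq "\"name\" *: *\"$1\""
}

# Usage against a running Solr (not executed here):
#   curl -s 'http://localhost:8983/solr/admin/cores?action=STATUS&core=HOS&wt=json' \
#     | core_exists HOS && echo "core HOS is present"

# Offline demonstration with a shortened, hand-written response.
sample='{"responseHeader":{"status":0},"status":{"HOS":{"name":"HOS","instanceDir":"/var/solr/data/HOS"}}}'
if echo "$sample" | core_exists HOS; then
  echo "core HOS is present"
fi
```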
After the successful installation of Java and Solr we install LAMP as the foundation for TYPO3.
We tested the installation on Ubuntu and CentOS. Although this is a standard task for a developer, we point to these recipes for Ubuntu and CentOS.
Criteria of completeness:
- MySQL server is running with database typo3, admin user and typo3 user
- PHP 7.2 is installed with mysql, xml, json, gd, imagemagick, png, jpeg, gif
- HTTP server is up and running
Apache
Some PHP modifications
TYPO3 is resource-hungry, so it makes sense to raise some PHP parameters. You can do this with the script below.
sudo sed -i 's/max_execution_time = 30/max_execution_time = 240/' /etc/php/7.2/apache2/php.ini
sudo sed -i 's/; max_input_vars = 1000/max_input_vars = 1500/' /etc/php/7.2/apache2/php.ini
sudo sed -i 's/upload_max_filesize = 2M/upload_max_filesize = 8M/' /etc/php/7.2/apache2/php.ini
or edit the properties in /etc/php.ini manually. Don't forget to restart the HTTP server with
sudo service apache2 restart
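The sed commands above only take effect if php.ini still contains exactly the default values. A more tolerant variant sets a key regardless of its current value. The following is a sketch under that idea; the helper set_ini is our own name, and the demonstration runs on a scratch file instead of the real php.ini.

```shell
#!/usr/bin/env bash
# Set "key = value" in a php.ini-style file, whether the key is currently
# active, commented out with ";", or missing altogether.
# $1 = file, $2 = key, $3 = value.
set_ini() {
  local file="$1" key="$2" value="$3" tmp
  if grep -Eq "^[;[:space:]]*${key}[[:space:]]*=" "$file"; then
    tmp="$(mktemp)"
    # Rewrite every matching line; write back with cat to keep the
    # owner and permissions of the original file.
    sed -E "s|^[;[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp" &&
      cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    echo "${key} = ${value}" >> "$file"
  fi
}

# Demonstration on a scratch file with the three default lines.
ini="$(mktemp)"
printf 'max_execution_time = 30\n; max_input_vars = 1000\nupload_max_filesize = 2M\n' > "$ini"
set_ini "$ini" max_execution_time 240
set_ini "$ini" max_input_vars 1500
set_ini "$ini" upload_max_filesize 8M
cat "$ini"
```

For the real file you would run the three set_ini calls as root against your php.ini path and then restart the web server.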
If this step was successful, we can prepare the Apache server to deliver our content.
.htaccess
The URLs generated by TYPO3 are very long. To get simple URLs for the document detail pages we suggest using the realurl extension by Martin Poelstra/Kasper Skårhøj/Дмитрий Дулепов. In the next version this functionality will be implemented in the core. TYPO3 generates the internal links, but the routing must be done by the web server. For this you need a .htaccess in the document root.
You have to copy this file into your document root. In our case /var/www/servers/openscience.hamburg.de/web/.
robots.txt
User-Agent: *
Disallow: /
Allow: /ID/
Copy this file to your document root.
Apache VirtualHost
For this we create a file named openscience.hamburg.de.conf inside the folder /etc/apache2/sites-available/ with this content:
<VirtualHost *:80>
DocumentRoot /var/www/servers/openscience.hamburg.de/web
ServerName hosdev.sub.uni-hamburg.de
Options -Indexes
DirectoryIndex index.php
# Basic Auth for solrAdmin:
<Location /solrAdmin>
AuthType Basic
AuthName "Restricted Files"
AuthBasicProvider file
AuthUserFile "/etc/apache2/.htpassword"
Require user solradmin
</Location>
ProxyPreserveHost On
ProxyRequests Off
# Tunneling for solrAdmin:
ProxyPass /solrAdmin http://localhost:8983/solr
ProxyPassReverse /solrAdmin http://localhost:8983/solr
</VirtualHost>
This configuration supports only HTTP (without SSL). In production it makes sense to enable SSL. You can do this in the Apache configuration or on the load balancer.
Firewall
The firewall (ufw) only allows ports 22 and 80.
Requests beginning with /solrAdmin are tunneled to the native Solr port 8983. With the htpasswd tool we can add a user to /etc/apache2/.htpassword.
Activating the VirtualHost
To activate the configuration we have to set a symlink:
sudo ln -s /etc/apache2/sites-available/openscience.hamburg.de.conf /etc/apache2/sites-enabled/openscience.hamburg.de.conf
or, on CentOS:
sudo ln -s /etc/httpd/sites-available/openscience.hamburg.de.conf /etc/httpd/sites-enabled/openscience.hamburg.de.conf
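On Debian/Ubuntu the a2ensite helper creates the same symlink. Since the VirtualHost uses ProxyPass, the proxy modules have to be enabled as well. The sketch below bundles these steps; the function name enable_hos_site is our own, and the function is only defined here, not executed.

```shell
#!/usr/bin/env bash
# Enable the proxy modules used by the VirtualHost, activate the site,
# validate the configuration, and reload Apache (Debian/Ubuntu tooling).
enable_hos_site() {
  sudo a2enmod proxy proxy_http &&
    sudo a2ensite openscience.hamburg.de.conf &&
    sudo apache2ctl configtest &&
    sudo service apache2 reload
}

# Defined but not called automatically; run `enable_hos_site` on the server.
```

The && chain stops at the first failing step, so Apache is reloaded only if the configuration test passes.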
Creating solr-admin user
This command:
sudo htpasswd -c /etc/apache2/.htpassword solradmin
creates a new file .htpassword inside the Apache config root (the file referenced in our VirtualHost section) and adds a user solradmin.
Now we can access the admin UI at a URL like http://myserver.com/solrAdmin.
On CentOS this folder is named /etc/httpd/.../.
Testing Apache (esp. PHP 7.2)
For testing purposes you can place a small file (named info.php) with this content:
<?php phpinfo(); ?>
into the folder openscience.hamburg.de. This script may be useful:
sudo mkdir -p /var/www/servers/openscience.hamburg.de;\
echo '<?php phpinfo(); ?>' | sudo tee /var/www/servers/openscience.hamburg.de/info.php > /dev/null;\
sudo chown -R www-data:www-data /var/www/servers/*
Now you can open the website in a browser and call info.php. There you can check the PHP version and other details such as the MySQL client.
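The same check can be done on the command line with php -m, which prints one loaded module per line. The sketch below reports which required modules are missing. The helper missing_php_modules is our own, and the module names mysqli, xml, json and gd are an assumption about how the requirements listed earlier map to PHP module names.

```shell
#!/usr/bin/env bash
# Report which required PHP modules are missing from a module list on stdin
# (one module name per line, as printed by `php -m`).
# Arguments: the required module names in lower case.
missing_php_modules() {
  local loaded required missing=""
  loaded="$(tr '[:upper:]' '[:lower:]')"
  for required in "$@"; do
    echo "$loaded" | grep -qx "$required" || missing="$missing $required"
  done
  echo "${missing# }"
}

# Usage on the server (not executed here):
#   php -m | missing_php_modules mysqli xml json gd

# Offline demonstration with a hand-written module list.
printf '[PHP Modules]\nCore\ngd\njson\nxml\n' | missing_php_modules mysqli xml json gd
```

An empty result means that all required modules are loaded.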
TYPO3
Installing composer
For installing TYPO3 and the extensions we use composer.
First we install curl:
sudo apt-get install curl # Ubuntu
or
sudo yum install curl # CentOS
Next, download the installer:
curl -sS https://getcomposer.org/installer | php
and move the composer.phar file:
sudo mv composer.phar /usr/local/bin/composer
Use the composer command to test the installation. If Composer is installed correctly, the server will respond with a long list of help information and commands:
user@localhost:~# composer
______
/ ____/___ ____ ___ ____ ____ ________ _____
/ / / __ \/ __ `__ \/ __ \/ __ \/ ___/ _ \/ ___/
/ /___/ /_/ / / / / / / /_/ / /_/ (__ ) __/ /
\____/\____/_/ /_/ /_/ .___/\____/____/\___/_/
/_/
Composer version 1.3.2 2017-01-27 18:23:41
Usage:
command [options] [arguments]
Options:
-h, --help Display this help message
-q, --quiet Do not output any message
Installation of TYPO3 with all needed extensions
The Apache server is listening on port 80 and expects the document root at
/var/www/servers/openscience.hamburg.de
We change to the parent of this folder and start:
cd /var/www/servers/;\
sudo rm -rf openscience.hamburg.de/;\
sudo composer create-project -vvv subhh/discovery-distribution openscience.hamburg.de dev-master;\
sudo chown -R www-data:www-data *;\
sudo touch openscience.hamburg.de/web/FIRST_INSTALL
Potential pitfall
If Apache runs under a different user, you have to modify line 4 (the chown command).
Please check that you use the right DocumentRoot inside your VirtualHost section. In our case it is:
/var/www/servers/openscience.hamburg.de/web
This web subfolder is new in Composer-managed TYPO3 installations and avoids git issues.
Don't forget to restart Apache after editing its configuration (on Ubuntu the service is named apache2):
sudo service httpd restart
Configuration of TYPO3
In a browser of your choice, open the page, e.g. http://openscience.hamburg.de
System environment check
The server redirects to typo3/sysext/install/Start/Install.php and asks for some data:
Select database
After clicking you have to enter your DB credentials into the form:
After clicking you have to select an empty database. If the database already contains data (for example because you restarted the installation), you have to drop and recreate it (see the chapter about the database).
Create user and import base data
After clicking you have to create an admin user:
Done!
After this step you can start the TYPO3 backend.
Activating of extensions
First you have to activate three extensions:
- scriptmerger
- find
- discovery
Potential pitfall during extension activation
In some cases the activation doesn't work: the orange progress bar grows very slowly to the right and stops without a message, and nothing shows up in the error logs. In this case you can open typo3conf/PackageStates.php and add this snippet to the end of the array:
'find' => [
'packagePath' => 'typo3conf/ext/find/',
],
'discovery' => [
'packagePath' => 'typo3conf/ext/discovery/',
],
'scriptmerger' => [
'packagePath' => 'typo3conf/ext/scriptmerger/',
],
After this you can check in the extension manager of the backend whether the three extensions are activated.
Adding of static templates
In the section WEB/TEMPLATE you have to add the static templates from the extensions. Click on Edit the whole template record:
Then on Includes
Now you can add them by clicking on the right side of the table (Available Items):
Don't forget to save. The save button is at the top of the section.
Adding plugin to page
In the top section WEB/Page, click on the page under the root element and choose the first tab, named General. On this tab you can enter the title of the page under Header. In our case: "Hamburg Open Science: Discovery".
In the tab Plugin you can select Find.
Adding setup to page template
plugin.tx_scriptmerger {
  javascript {
    compress.enable = 0
    minify.enable = 1
    merge.enable = 1
  }
  css {
    merge.enable = 1
    compress.enable = 0
  }
}
plugin.tx_find.settings.connections.default.options {
  host = localhost
  port = 8983
  path = /solr/HOS
}
Description of some features
The Discovery app uses an extended version of subugoe/find. Most of the facet functionality is implemented in JavaScript inside the schaufenster extension.
Searchfield
The logic is implemented in the file Resources/Public/Javascript/schaufenster.searchfield.js.
The search field consists of three parts:
- input field(s)
- input selector
- submit button
Input field(s)
Every field is configured in TypoScript (setup.txt).
Input selector
The original HTML element SELECT is difficult to style. Therefore we use a custom element following this instruction: https://www.w3schools.com/howto/howto_custom_select.asp The selector controls the visibility of the input fields. When the focus changes, the previously active field is emptied. After a reload the selector is preselected.
Submit button
Clicking the Submit button submits the form.
Heatmap with geolocation of publications
The Solr query obviously generates more hits than a common map API can process, because the API limits the number of markers on a map. There is more than one render mode to deal with this. The best known is a cluster manager. In our case we have only a few geolocations but a large number of hits per location. For this a heatmap is a good solution. The model consists of a collection of geolocations with an optional value for every location.
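The data model can be illustrated independently of the map library. The sketch below collapses one "lat,lon" line per hit into "lat,lon,count" points, where the count serves as the optional value of the location. It is not the extension's JavaScript code; the helper heatmap_points and the sample coordinates are made up for illustration.

```shell
#!/usr/bin/env bash
# Collapse one "lat,lon" line per hit into "lat,lon,count" heatmap points.
heatmap_points() {
  sort | uniq -c | awk '{ print $2 "," $1 }'
}

# Four hits on two locations (sample coordinates, not real data).
printf '53.55,9.99\n53.55,9.99\n53.55,9.99\n53.47,9.70\n' | heatmap_points
```

For the sample input this prints 53.47,9.70,1 and 53.55,9.99,3, so a single point with weight 3 replaces three identical markers.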
The UI has two parts: a "thumbnail" in the facet column and a big version in a lightbox overlay:
The project uses Leaflet as framework and API. This is an open source library for handling slippy tile maps. Most mapping providers (like Google, Mapbox, Bing, OSM) work with this technology. The world map is divided into a fixed grid of tiles (in most cases 256x256 px) for every zoom level. Another technology (WMS) renders the maps on the server in real time. The most modern approach renders on the client, so that only vector data is transferred from server to client.
Wordcloud with subjects
The script /Resources/Public/Javascript/schaufenster.wordcloud.js reads all subjects from the subject facet and replaces the old DOM part with the new one. The d3 library used here is a singleton. Therefore it is not possible to render both versions (the small and the large one) in the same namespace. The large version is rendered inside an iframe, which creates a new HTML page.
Creators as Donut
The simple logic is implemented in /Resources/Public/Javascript/schaufenster.publisher.js. Basically the old DOM is replaced with the new one.
DDC as (file-) tree
The Dewey Decimal Classification (DDC) is a library classification system that organizes resources by subject into ten main classes, each of which is subdivided into divisions and sections. The first 3 levels are licence free.
The ddc facet contains only the DDC numbers. These numbers are resolved to labels in the JavaScript layer. The server delivers only a flat list, which is transformed into a tree model.
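The transformation from a flat list into a tree can be sketched outside the browser as well. The example below assumes three-digit DDC numbers and derives the first two levels from the leading digits (first digit for the main class, first two digits for the division). It is an illustration of the idea, not the code of the extension; the helper ddc_tree is our own, and the resolution to labels is left out.

```shell
#!/usr/bin/env bash
# Turn a flat list of three-digit DDC numbers (one per line on stdin)
# into an indented three-level tree: main class, division, section.
ddc_tree() {
  sort -u | awk '
    {
      class = substr($1, 1, 1) "00"     # first level: main class
      division = substr($1, 1, 2) "0"   # second level: division
      if (class != last_class) {
        print class
        last_class = class
        last_division = ""
      }
      if (division != last_division) {
        print "  " division
        last_division = division
      }
      print "    " $1                   # third level: section
    }'
}

# Sample facet values, including a duplicate.
printf '004\n020\n025\n330\n004\n' | ddc_tree
```

Sorting first guarantees that all numbers of one class and division are adjacent, so a single pass is enough to emit each parent node exactly once.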
The script /Resources/Public/Javascript/schaufenster.ddc.js replaces the original DOM part with a graphical tree.
Adding of new facet components
Currently all new components are realized with pure jQuery. This page describes the clean, TYPO3-conform way.
Working environment
This recipe gives some details on how you can work remotely with standard UI programs via sshfs.