Inspire installation from sources
About Inspire source code repository
Inspire uses Git as its source code management system. The official
source code repository is hosted alongside the standard Invenio source
code repository and can be browsed via a git repo web interface. For
example, you can subscribe to an RSS feed there to follow recent
Inspire source code changes.
The main repository is named inspire. It is an "overlay" repository containing customizations of templates and formats that change the "look and feel" as well as the functionality of the Inspire site.
An additional repository called invenio-inspire-ops is the operational repository for the Inspire production servers. It contains modifications and additions to the Invenio core modules that do not fit into the normal Inspire overlay repository.
How to obtain Inspire sources
How to obtain Inspire-specific sources:
$ git clone http://invenio-software.org/repo/inspire
How to obtain Inspire-operations sources:
$ git clone http://invenio-software.org/repo/invenio-inspire-ops
How to install Inspire test site
INSPIRE sources are to be installed after the Invenio sources. You can customize some settings in config-local.mk. For example, on
Debian GNU/Linux the standard Apache user is www-data, so files can be installed with that group:
$ cat config-local.mk
INSTALL = install -g www-data -m 775
(Replace www-data with your own Apache user if you have configured a different one.)
Any required Python packages can be installed by running the following command inside the root inspire source folder:
$ pip install -r requirements.txt
To install a small INSPIRE demo site from scratch, you can use:
- the `inspire-recreate-demo-site' helper devscript (see the instructions at https://github.com/tiborsimko/inspire-devscripts#installation), or
- the step-by-step guide below.
Step-by-step guide
This is a step-by-step guide to install Inspire sources including demo records.
1. If you do not already have Invenio installed, build a clean (no test records) Invenio demo site from Invenio git sources,
following the InspireInvenioInstallation guide, for example:
$ git clone http://invenio-software.org/repo/invenio
$ cd invenio
$ aclocal-1.9
$ automake-1.9 -a
$ autoconf
$ ./configure
$ make
$ make install
$ inveniocfg --create-tables \
--create-demo-site \
--yes-i-know
Or, if you already have an Invenio demo site up and running, then
just clean away its demo records:
$ inveniocfg --drop-demo-site \
--create-demo-site \
--yes-i-know
2. For Inspire you need to modify the standard invenio-local.conf to activate some special options:
$ cat /opt/invenio/etc/invenio-local.conf
[Invenio]
### BEGIN-customize-me
CFG_SITE_URL = http://localhost
CFG_SITE_SECURE_URL = https://localhost
CFG_SITE_SUPPORT_EMAIL = root@localhost
CFG_SITE_ADMIN_EMAIL = root@localhost
CFG_WEBALERT_ALERT_ENGINE_EMAIL = root@localhost
CFG_WEBCOMMENT_ALERT_ENGINE_EMAIL = root@localhost
CFG_WEBCOMMENT_DEFAULT_MODERATOR = root@localhost
CFG_BIBAUTHORID_AUTHOR_TICKET_ADMIN_EMAIL = root@localhost
#CFG_DATABASE_HOST = localhost
#CFG_DATABASE_PORT = 3306
#CFG_DATABASE_NAME = cdsinvenio
#CFG_DATABASE_USER = cdsinvenio
#CFG_DATABASE_PASS = my123p$ss
#CFG_BIBDOCFILE_USE_XSENDFILE = 1
### END-customize-me
CFG_SITE_LANG = en
CFG_SITE_LANGS = bg,ca,de,el,en,es,fr,hr,it,ja,no,pl,pt,ru,sk,sv,zh_CN,zh_TW
CFG_SITE_NAME = HEP
CFG_SITE_NAME_INTL_en = HEP
CFG_SITE_NAME_INTL_fr = HEP
CFG_SITE_NAME_INTL_de = HEP
CFG_SITE_NAME_INTL_es = HEP
CFG_SITE_NAME_INTL_ca = HEP
CFG_SITE_NAME_INTL_pt = HEP
CFG_SITE_NAME_INTL_it = HEP
CFG_SITE_NAME_INTL_ru = HEP
CFG_SITE_NAME_INTL_sk = HEP
CFG_SITE_NAME_INTL_cs = HEP
CFG_SITE_NAME_INTL_no = HEP
CFG_SITE_NAME_INTL_sv = HEP
CFG_SITE_NAME_INTL_el = HEP
CFG_SITE_NAME_INTL_uk = HEP
CFG_SITE_NAME_INTL_ja = HEP
CFG_SITE_NAME_INTL_pl = HEP
CFG_SITE_NAME_INTL_bg = HEP
CFG_SITE_NAME_INTL_hr = HEP
CFG_SITE_NAME_INTL_zh_CN = HEP
CFG_SITE_NAME_INTL_zh_TW = HEP
CFG_BIBINDEX_FULLTEXT_INDEX_LOCAL_FILES_ONLY = 1
CFG_WEBSTYLE_TEMPLATE_SKIN = inspire
CFG_WEBSEARCH_INSTANT_BROWSE = 0
CFG_WEBSEARCH_SPLIT_BY_COLLECTION = 0
CFG_INSPIRE_SITE = 1
CFG_ACCESS_CONTROL_LEVEL_ACCOUNTS = 5
CFG_WEBCOMMENT_ALLOW_COMMENTS = 0
CFG_WEBCOMMENT_ALLOW_REVIEWS = 0
CFG_WEBCOMMENT_ALLOW_SHORT_REVIEWS = 0
CFG_WEBSEARCH_DEF_RECORDS_IN_GROUPS = 25
CFG_WEBSEARCH_NB_RECORDS_TO_SORT = 5000
CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = hb,hd
CFG_WEBSUBMIT_FILESYSTEM_BIBDOC_GROUP_LIMIT = 20000
CFG_BIBUPLOAD_FFT_ALLOWED_LOCAL_PATHS = /tmp,/afs/cern.ch/project/inspire
CFG_WEBSEARCH_FULLTEXT_SNIPPETS = 5
CFG_WEBSEARCH_FULLTEXT_SNIPPETS_WORDS = 10
CFG_WEBSEARCH_FIELDS_CONVERT = {'eprint':'reportnumber','bb':'reportnumber',
'bbn':'reportnumber','bull':'reportnumber',
'r':'reportnumber','rn':'reportnumber',
'cn':'collaboration','a':'author',
'au':'author','name':'author',
'ea':'exactauthor','exp':'experiment',
'expno':'experiment','sd':'experiment',
'se':'experiment','j':'journal',
'kw':'keyword', 'keywords':'keyword',
'k':'keyword', 'au':'author', 'ti':'title',
't':'title', 'irn':'970__a',
'institution':'affiliation',
'inst':'affiliation', 'affil':'affiliation',
'aff':'affiliation', 'af':'affiliation',
'topic':'695__a','tp':'695__a','dk':'695__a',
'date':'year','d':'year','date-added':'datecreated',
'da':'datecreated','dadd':'datecreated',
'date-updated':'datemodified','dupd':'datemodified',
'du':'datemodified','vol':'volume',}
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS_CLIENT_IP_DISTRIBUTION = 0
CFG_BIBRANK_SHOW_DOWNLOAD_GRAPHS = 0
CFG_BIBRANK_SHOW_DOWNLOAD_STATS = 0
CFG_BIBINDEX_AUTHOR_WORD_INDEX_EXCLUDE_FIRST_NAMES = True
CFG_WEBSEARCH_SEARCH_CACHE_SIZE = 0
CFG_WEBSTYLE_HTTP_STATUS_ALERT_LIST = 400,5*,41*
CFG_WEBSEARCH_SYNONYM_KBRS = {
'journal': ['JOURNALS', 'leading_to_comma'],
'collection': ['COLLECTION', 'exact'],
'subject': ['SUBJECT', 'exact'],
}
CFG_BIBEDIT_QUEUE_CHECK_METHOD = regexp
CFG_WEBSEARCH_SPIRES_SYNTAX = 9
CFG_PLOTEXTRACTOR_SOURCE_BASE_URL = http://export.arxiv.org/
CFG_SELFCITES_USE_BIBAUTHORID = 1
CFG_SELFCITES_PRECOMPUTE_FRIENDS = 0
CFG_WEBSEARCH_DISPLAY_NEAREST_TERMS = 0
#CFG_REFEXTRACT_TICKET_QUEUE = Inspire-References
#CFG_REFEXTRACT_KBS_OVERRIDE = {'journals': 'kb:docextract-journals'}
CFG_BIBUPLOAD_STRONG_TAGS = 999,084
CFG_BIBFORMAT_DISABLE_I18N_FOR_CACHED_FORMATS = ha,hb,hs,hx,hdref
CFG_WEBAUTHORPROFILE_USE_BIBAUTHORID = 1
CFG_BIBDOCFILE_ENABLE_BIBDOCFSINFO_CACHE = 1
CFG_WEBSEARCH_CITESUMMARY_SELFCITES_THRESHOLD = 0
CFG_BASE_URL =
CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_DOCTYPES = [('INSPIRE-PUBLIC', 'INSPIRE-PUBLIC'),
('arXiv', 'arXiv'),
('Springer','Springer'),
('JHEP','JHEP'),
('Hindawi','Hindawi'),
('APS','APS'),
('Plot','Plot'),
('PlotMisc', 'PlotMisc'),
('Supplementary Material','Supplementary Material'),
('Data','Data')]
CFG_BIBFORMAT_CACHED_FORMATS =
Remember to update the configuration:
$ inveniocfg --update-all
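Before running the update, it can help to sanity-check the file. The following is a minimal Python sketch and not part of Invenio: it assumes the file can be read by Python's standard configparser, and the names check_local_conf and EXPECTED are illustrative choices, with the three expected values copied from the listing above. Strict parsing also reports an option name that is set twice.

```python
import configparser

# Options this guide sets specifically for Inspire (values copied from the listing).
EXPECTED = {
    "CFG_INSPIRE_SITE": "1",
    "CFG_WEBSTYLE_TEMPLATE_SKIN": "inspire",
    "CFG_SITE_NAME": "HEP",
}


def check_local_conf(text):
    """Return a list of human-readable problems found in the config text."""
    # interpolation=None keeps characters such as '%' in values literal.
    parser = configparser.ConfigParser(interpolation=None, strict=True)
    parser.optionxform = str  # keep option names upper-case
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError as err:
        return ["duplicate option: %s" % err.option]
    if not parser.has_section("Invenio"):
        return ["missing [Invenio] section"]
    problems = []
    for key, wanted in EXPECTED.items():
        found = parser.get("Invenio", key, fallback=None)
        if found != wanted:
            problems.append("%s is %r, expected %r" % (key, found, wanted))
    return problems


# Hypothetical three-line sample; a real file has many more options.
sample = """[Invenio]
CFG_SITE_NAME = HEP
CFG_WEBSTYLE_TEMPLATE_SKIN = inspire
CFG_INSPIRE_SITE = 1
"""
print(check_local_conf(sample))  # prints []
```

To check a real installation, pass the contents of /opt/invenio/etc/invenio-local.conf instead of the sample string. Note that configparser only accepts multi-line values when their continuation lines are indented.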
3. Apply Inspire-specific code and templates from Inspire sources:
$ cd ..
$ git clone http://invenio-software.org/repo/inspire
$ cd inspire
$ vim config.mk ## verify that PREFIX matches your Invenio installation prefix (e.g. /opt/invenio) and that INSTALL is correct
$ make install
4. Apply INSPIRE-specific DB configurations and content:
$ make install-dbchanges
If you are prompted for a username, type admin. To avoid typing it each time, you can set the following environment variable in your .bashrc:
$ export CFG_INSPIRE_BIBTASK_USER=admin
5. Load test records from SPIRES:
$ make load-demo-records
$ bibsched ## run any waiting bibupload tasks
$ bibindex -u admin -s5m ## index records
$ bibrank -u admin -s5m ## rank records -- needed e.g. for citations
$ webcoll -u admin -s5m ## regenerate search collection cache
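The last three commands share the same shape: a task name, the admin user, and the -s5m option, which in Invenio's task framework sets a sleep time so that the task repeats every five minutes. As a small sketch under that reading, the hypothetical Python helper below only builds and prints the command lines so they can be reviewed first; it does not run anything, and it leaves out make load-demo-records and bibsched, which are run by hand as shown above.

```python
import shlex


def post_load_commands(user="admin", sleeptime="5m"):
    """Build the three periodic task commands from step 5 as argument lists."""
    tools = ["bibindex", "bibrank", "webcoll"]  # same order as in the guide
    return [[tool, "-u", user, "-s" + sleeptime] for tool in tools]


# Dry run: print each command, shell-quoted, without executing it.
for argv in post_load_commands():
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argument list could be handed to subprocess.run once the printed commands look right for your installation.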
How to run reference extraction without installing Invenio
If you want to run the reference extraction script without having to
do the full Invenio installation, you can follow these steps:
## Download Git sources of Invenio:
$ git clone http://invenio-software.org/repo/invenio
## Download a test article:
$ wget -O /tmp/z.pdf http://doc.cern.ch/archive/electronic/hep-th/0101/0101001.pdf
## You have to go to the refextract folder in order to run it in a standalone mode:
$ cd invenio/modules/bibedit/lib
## Run refextract with a fake record ID of 123 (say):
$ python ./refextract.py 123:/tmp/z.pdf
<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<controlfield tag="001">123</controlfield>
<datafield tag="999" ind1="C" ind2="5">
<subfield code="o">[1]</subfield>
</datafield>
<datafield tag="999" ind1="C" ind2="5">
<subfield code="m">T. Banks, W. Fischler, S. Shenker, L. Susskind "M Theory Ann. Sci. a Matrix Model a Conjecture,"</subfield>
<subfield code="s">Phys. Rev., D 55 (1997) 5112</subfield>
<subfield code="r">hep-th/9610043</subfield>
</datafield>
[...]
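The output is MARCXML, so the extracted references can be post-processed with any XML library. The sketch below uses Python's standard xml.etree.ElementTree to collect the subfields of each 999 C5 datafield. The function name extract_references is illustrative, and the SAMPLE string is a shortened copy of the output above with the closing tags added by hand so that it parses.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <collection> element of the refextract output.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Shortened, hand-closed copy of the output shown above.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
  <controlfield tag="001">123</controlfield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="o">[1]</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="s">Phys. Rev., D 55 (1997) 5112</subfield>
    <subfield code="r">hep-th/9610043</subfield>
  </datafield>
</record>
</collection>"""


def extract_references(marcxml):
    """Return one dict per 999 C5 datafield, mapping subfield code to text."""
    root = ET.fromstring(marcxml)
    references = []
    for field in root.iterfind(".//marc:datafield[@tag='999']", MARC_NS):
        if field.get("ind1") != "C" or field.get("ind2") != "5":
            continue  # not a reference field
        subfields = {
            sf.get("code"): sf.text
            for sf in field.iterfind("marc:subfield", MARC_NS)
        }
        references.append(subfields)
    return references


for reference in extract_references(SAMPLE):
    print(reference)
```

In the sample, the reference marker and the reference details arrive in separate datafields, so a consumer that wants one entry per citation would still need to pair them up.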
Note that running refextract in standalone mode means that the
location of required tools such as pdftotext is not checked. This
location is hardcoded at the beginning of the refextract.py script.
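Since standalone mode skips that check, it may be worth confirming by hand that the tool is available before running the script. A minimal Python sketch using the standard shutil.which follows. The helper name locate_tool is illustrative, and the sketch neither reads nor modifies the path inside refextract.py.

```python
import shutil


def locate_tool(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


path = locate_tool("pdftotext")
if path is None:
    print("pdftotext not found on PATH; install it or adjust the hardcoded path")
else:
    print("pdftotext found at", path)
```

If a path is printed, compare it with the one at the top of refextract.py and edit the script if they differ.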