Some details on how BibMatch is currently used at DESY

The code is standard.

Config file

Some settings in invenio-local.conf.txt

FUZZY_WORDLIMITS set to

N=2 for authors (not used yet); the hardcoded behaviour is to use 1st_author and one of the secondary authors.

N=2 for title, which results in the following:
use the N+1 longest words of the original title search and
combine searches for all combinations of N words (pairs for N=2). Example: N=2 -> 3 words in total: W1 W2 W3.
(title:W1 title:W2) or (title:W1 title:W3) or (title:W2 title:W3)
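
The combination of title words described above can be sketched in Python. This is only an illustration of the description, not the BibMatch implementation: the function name pairwise_title_query is hypothetical, and how ties between equally long words are broken is an assumption.

```python
from itertools import combinations


def pairwise_title_query(title, n=2):
    """Build an OR of AND-groups from the n+1 longest title words (hypothetical helper)."""
    words = title.split()
    # Take the n+1 longest words. sorted() is stable, so equally long words
    # keep their order in the title (an assumption, not documented behaviour).
    longest = sorted(words, key=len, reverse=True)[:n + 1]
    # Every combination of n words becomes one AND-group.
    groups = [
        "(" + " ".join("title:" + word for word in combo) + ")"
        for combo in combinations(longest, n)
    ]
    return " or ".join(groups)


print(pairwise_title_query("W1 W2 W3"))
```

For the three-word example this yields the same three groups as the query shown above.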

VALIDATION_RULESETS

By default the title is not in the validator; it was found to be too picky.

Validate report number, DOI and authors in lazy mode (i.e. one match yields OK)

Validator for authors is important since the search is only for lastnames.
Only the validator uses firstnames.

If there is a report number or DOI, validate only this field + authors.
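
A minimal sketch of what lazy mode could look like, assuming "lazy" means that a single value shared between the new record and a candidate is enough for the field to pass. The function name lazy_match and the case-insensitive exact comparison are assumptions for illustration; the real validator uses its own comparison functions.

```python
def lazy_match(new_values, candidate_values):
    """Return True if at least one value of the new record also occurs in the candidate.

    Hypothetical helper: values are compared case-insensitively after
    collapsing whitespace, which is an assumption made for this sketch.
    """
    def normalize(value):
        return " ".join(value.lower().split())

    candidates = {normalize(value) for value in candidate_values}
    # Lazy mode: one positive match is sufficient.
    return any(normalize(value) in candidates for value in new_values)
```

With this reading, a record with authors "Smith, John" and "Doe, Jane" passes against a candidate that lists only "smith, john".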

SEARCH_RESULT_MATCH_LIMIT

Set to 40.

I.e. if the search returns more than 40 results before the validator runs, no match is returned.
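
The effect of the limit can be sketched as follows. The function name candidates_for_validation is hypothetical, and the list of search results stands in for whatever the search actually returns.

```python
SEARCH_RESULT_MATCH_LIMIT = 40  # value used at DESY, as stated above


def candidates_for_validation(search_results, limit=SEARCH_RESULT_MATCH_LIMIT):
    """Return the results to pass to the validator (hypothetical helper).

    If the search returned more than `limit` results, it is considered too
    broad and nothing is passed on, i.e. no match is returned.
    """
    if len(search_results) > limit:
        return []
    return list(search_results)
```

Exactly 40 results are still validated; 41 or more yield no match.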

Search queries

The standard author + title query uses all authors (lastname only) and the first 3 words of the title (that are longer than 3 letters, no punctuation, no numbers, not 'with', 'from', 'erratum', 'addendum', 'publisher.s note').
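
The selection of title words can be sketched as below. This is one reading of the rules above, not the actual configuration: the function name title_search_words is hypothetical, punctuation is assumed to be stripped rather than causing a word to be skipped, and "no numbers" is assumed to mean that any word containing a digit is dropped.

```python
import re

# Excluded single words, taken from the description above.
EXCLUDED_WORDS = {"with", "from", "erratum", "addendum"}
# 'publisher.s note' is treated as a regular expression (an assumption).
EXCLUDED_PHRASE = re.compile(r"publisher.s note", re.IGNORECASE)


def title_search_words(title, max_words=3, min_length=4):
    """Return the first `max_words` usable title words (hypothetical helper)."""
    cleaned = EXCLUDED_PHRASE.sub(" ", title)
    cleaned = re.sub(r"[^\w\s]", " ", cleaned)  # strip punctuation
    selected = []
    for word in cleaned.split():
        if len(word) < min_length:  # keep only words longer than 3 letters
            continue
        if any(ch.isdigit() for ch in word):  # no numbers
            continue
        if word.lower() in EXCLUDED_WORDS:
            continue
        selected.append(word)
        if len(selected) == max_words:
            break
    return selected


print(title_search_words("Erratum: Search for new physics with 2 jets at the LHC"))
```

For the example title this selects Search, physics and jets.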

If the new record has a CNUM, use cnum.config, which searches for:
arXiv-ID
DOI
1st_author and CNUM
standard author + title query

The default is allauth.config, which searches for:
arXiv-ID
DOI
standard author + title query
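
The choice between the two configurations can be summarized in a short sketch. The function name query_sequence and the representation of the record as a dictionary with a "cnum" key are assumptions; the query labels are copied from the lists above and are kept in the same order.

```python
def query_sequence(record):
    """Return the config file name and its list of searches (hypothetical helper).

    `record` is assumed to be a dict; a non-empty "cnum" entry selects cnum.config.
    """
    if record.get("cnum"):
        return "cnum.config", [
            "arXiv-ID",
            "DOI",
            "1st_author and CNUM",
            "standard author + title query",
        ]
    # Default when the record has no CNUM.
    return "allauth.config", [
        "arXiv-ID",
        "DOI",
        "standard author + title query",
    ]
```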

Topic attachments

Attachment              Revision  Size    Date              Who           Comment
BibMatch.pdf            r1        34.8 K  2014-01-22 13:51  KirstenSachs  Slides from 21/1/2014
allauth.config          r1        0.4 K   2013-08-21 10:02  KirstenSachs
cnum.config             r2        0.5 K   2016-10-25 11:44  KirstenSachs
invenio-local.conf      r1        14.1 K  2013-08-21 09:52  KirstenSachs
invenio-local.conf.txt  r1        10.8 K  2013-08-21 10:29  KirstenSachs
Topic revision: r3 - 2016-10-25 - KirstenSachs
 