Introduction and Overview

To add publisher information for existing or new articles to INSPIRE, we need to translate this information into our MARC xml. If we are lucky and can get the information as xml as well, the most convenient way is to use an xslt for the translation. However, different publishers use different schemas, which require different xslts. On the other hand, much of the translation work is always the same (dividing the authors between 100 and 700, getting 999C5s into the right form, ...). Therefore, I designed a simple intermediate xml schema with human-readable tags, to which the publisher xml is translated. A universal xslt then translates the intermediate xml to the final (INSPIRE) xml.
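
The two-stage idea can be sketched in a few lines of Python. This is only an illustration of the architecture, not the actual xslt: the function names, the publisher tags (article-title, doi), the intermediate tags (title, doi) and the dummy values are all assumptions made for this example, and the real tag names are defined in the xsd.

```python
import xml.etree.ElementTree as ET


def oup_to_intermediate(article):
    """Stage 1, publisher-specific: map (hypothetical) publisher tags
    to (hypothetical) intermediate tags."""
    record = ET.Element("record")
    ET.SubElement(record, "title").text = article.findtext("article-title")
    ET.SubElement(record, "doi").text = article.findtext("doi")
    return record


def intermediate_to_marc(record):
    """Stage 2, shared by all publishers: map intermediate tags to MARC."""
    marc = ET.Element("record")

    def datafield(tag, ind1, subfields):
        # One MARC datafield with its subfields, e.g. 245__ $a or 0247_ $a $2.
        field = ET.SubElement(
            marc, "datafield", {"tag": tag, "ind1": ind1, "ind2": " "}
        )
        for code, value in subfields:
            ET.SubElement(field, "subfield", {"code": code}).text = value

    datafield("245", " ", [("a", record.findtext("title"))])
    datafield("024", "7", [("a", record.findtext("doi")), ("2", "DOI")])
    return marc


# Dummy publisher input; tag names and values are made up for the sketch.
publisher_xml = (
    "<article>"
    "<article-title>A dummy title</article-title>"
    "<doi>10.1234/example</doi>"
    "</article>"
)
marc = intermediate_to_marc(oup_to_intermediate(ET.fromstring(publisher_xml)))
print(ET.tostring(marc, encoding="unicode"))
```

Supporting a further publisher then only means writing another stage-1 function; stage 2 stays untouched.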

Files

  • intermediate-1.24.xsd: schema for the intermediate xml
  • intermediate-1.24_example.xml: a dummy example
  • intermediate-1.24.xslt: xslt to translate the intermediate to the final (INSPIRE) xml
  • intermediate-1.24.mfd: Altova file used to create intermediate-1.24.xslt
  • 10.1056_564564.xml: translation of the dummy example
Specific publisher example:
  • oxford-1.24.xslt: xslt to translate xml from OUP (PTEP) to the intermediate schema
  • oxford-1.24.mfd: Altova file used to create oxford-1.24.xslt

Description

Besides 1:1 mapping, intermediate-1.24.xslt does a couple of things:

  • dividing authors and editors between 100 (first author) and 700 (all others)
  • writing 041 (language) only if the language is not English
  • trying to calculate 300 (number of pages) from the page range (773__c)
  • CC licence: it is enough to give either the licence code or the URL
  • bringing 999C5s into the right form, using a look-up table for the journal names that is generated by hand from INSPIRE.Journals and copied into intermediate-1.24.xslt
  • bringing 999C5r into the right form for arXiv numbers
  • bringing 999C5h into the right form by concatenating authors
  • if there is a free-text reference, it is written to 999C5m only if no DOI, pubnote, etc. is found
  • translating field and type codes from short (SPIRES) to long (INSPIRE)
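
Three of these steps are simple enough to sketch in Python. This is not a transcription of the xslt, only a minimal illustration: the function names are hypothetical, the inputs are plain Python values instead of xml, English is assumed to be spelled "English" or "en", and the page range is assumed to have the numeric form "first-last".

```python
import re


def split_authors(authors):
    """First name goes to 100, all remaining names go to 700."""
    if not authors:
        return []
    fields = [("100", authors[0])]
    fields += [("700", name) for name in authors[1:]]
    return fields


def language_field(language):
    """Return a 041 entry only for records that are not in English."""
    if language and language.strip().lower() not in ("english", "en"):
        return [("041", language)]
    return []


def page_count(page_range):
    """Try to derive 300 from a 773__c range such as '123-130'.

    Returns None when the range is not of the assumed numeric form,
    which mirrors the 'trying to calculate' wording above.
    """
    match = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", page_range or "")
    if not match:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    return last - first + 1 if last >= first else None


print(split_authors(["Author, A.", "Author, B.", "Author, C."]))
print(language_field("German"))
print(page_count("123-130"))
```

Returning None for page ranges that cannot be parsed (article IDs, letter-prefixed pages) is a design choice of this sketch; in that case no 300 field would be written.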

oxford-1.24.xslt has two additional features, added by hand, which are not in the Altova file:

  • a look-up table from publisher keys to publisher keywords
  • references are either in <nlm-citation> or <citation>; I changed the corresponding <xsl:for-each select="nlm-citation"> by hand instead of redrawing a few dozen lines in the Altova file
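
The second point amounts to selecting both element names in one pass. A minimal Python sketch of that selection follows; the surrounding <ref-list>/<ref> structure, the <source> child and the sample values are assumptions made for the example, not taken from the OUP files.

```python
import xml.etree.ElementTree as ET

# Either element name may carry a reference.
CITATION_TAGS = {"nlm-citation", "citation"}


def iter_citations(ref_list):
    """Yield every <nlm-citation> or <citation> child of each <ref>,
    in document order."""
    for ref in ref_list.iter("ref"):
        for child in ref:
            if child.tag in CITATION_TAGS:
                yield child


# Made-up sample with one reference of each kind.
sample = ET.fromstring(
    "<ref-list>"
    "<ref><nlm-citation><source>Journal A</source></nlm-citation></ref>"
    "<ref><citation><source>Journal B</source></citation></ref>"
    "</ref-list>"
)
sources = [citation.findtext("source") for citation in iter_citations(sample)]
print(sources)
```

In XPath 1.0, and therefore in xslt, the same selection can be written as a union, for example select="nlm-citation | citation"; whether the hand edit in oxford-1.24.xslt uses exactly this form is not stated here.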

Open Questions

  • at the moment the arXiv number is written to 037; should there be an entry in 035 as well?
  • in FFT: are 'INSPIRE-HIDDEN' and 'INSPIRE-PUBLIC' right?
  • should the look-up table also be used for 773? Or should the journal name translation be done separately with a different tool?
  • which tags are missing in intermediate-1.24.xsd?
  • which steps are missing in intermediate-1.24.xslt?

Topic attachments

Attachment                     History  Size      Date              Who
10.1056_564564.xml             r1       7.2 K     2013-07-17 16:57  FlorianSchwennsen
intermediate-1.24.mfd          r1       186.2 K   2013-07-17 16:57  FlorianSchwennsen
intermediate-1.24.xsd          r1       26.7 K    2013-07-17 16:56  FlorianSchwennsen
intermediate-1.24.xslt         r1       1551.6 K  2013-07-17 16:56  FlorianSchwennsen
intermediate-1.24_example.xml  r1       4.6 K     2013-07-17 16:56  FlorianSchwennsen
oxford-1.24.mfd                r1       105.5 K   2013-07-23 13:57  FlorianSchwennsen
oxford-1.24.xslt               r2 r1    98.9 K    2013-09-03 09:38  FlorianSchwennsen

Topic revision: r5 - 2013-09-03 - FlorianSchwennsen