Record structure comparison between SPIRES and Invenio
SPIRES
Bare SPIRES record (not comprehensive)
As promised, here is a "bare" SPIRES record, selected (somewhat) randomly, just to give an idea:
IRN = 3418758;
DOC-TYPE = Preprint;
REPORT-NUM = SLAC-PUB-9865;
ASTR;
AUTHOR = Brooks, T.C.;
AUTHOR = Convery, M.E.;
AUTHOR = Davis, W.L.;
AUTHOR = DelSignore, K.W.;
AUTHOR = Jenkins, T.L.;
AUTHOR = Kangas, E.;
AUTHOR = Knepley, M.G.;
AUTHOR = Kowalski, K.L.;
AUTHOR = Taylor, C.C.;
AFFILIATION = Case Western Reserve U.;
ASTR;
AUTHOR = Oh, S.H.;
AUTHOR = Walker, W.D.;
AFFILIATION = Duke U.;
ASTR;
AUTHOR = Colestock, P.L.;
AUTHOR = Hanna, B.;
AUTHOR = Martens, M.;
AUTHOR = Steets, J.;
AFFILIATION = Fermilab;
ASTR;
AUTHOR = Ball, R.;
AUTHOR = Gustafson, H.R.;
AUTHOR = Jones, L.W.;
AUTHOR = Longo, M.J.;
AFFILIATION = Michigan U.;
ASTR;
AUTHOR = Bjorken, J.D.;
AFFILIATION = SLAC;
ASTR;
AUTHOR = Abashian, A.;
AUTHOR = Morgan, N.;
AFFILIATION = Virginia Tech.;
ASTR;
AUTHOR = Pruneau, C.A.;
AFFILIATION = Wayne State U.;
COL-NOTE = MiniMax Collaboration;
TITLE = Analysis of charged particle / photon correlations in hadronic multiparticle production;
PUB-NOTE = Phys.Rev.D55:5667-5680,1997;
SLAC-TOPICS = SLAC, there, 09/96;
DATE = Sep 1996;
JOUR-SUB = Phys.Rev.D;
PPF-SUBJECT = Experimental, S;
P = 35;
PPA = 9716;
PPF = 9639;
CITATION = PHLTA,B217,169;
CITATION = PHLTA,B266,482;
CITATION = IMPAE,A7,4189;
CITATION = APPOA,B23,561;
CITATION = PHRVA,D46,246;
CITATION = HEP-PH 9211282;
CITATION = NUPHA,B399,395;
CITATION = PHRVA,D51,2482;
CITATION = HEP-PH 9411329;
CITATION = HEP-PH 9501210;
CITATION = PRLTA,72,970;
CITATION = RPPHA,58,611;
CITATION = JTPLA,33,67;
CITATION = APNYA,66,509;
CITATION = IMPAE,A2,1447;
CITATION = IMPAE,A4,1527;
CITATION = MPLAE,A8,2747;
CITATION = JTPLA,59,585;
CITATION = PHRVA,D49,5805;
CITATION = HEP-PH 9503325;
CITATION = PHRVA,D9,3113;
CITATION = NUIMA,138,241;
CITATION = NUIMA,140,533;
CITATION = PHLTA,B206,707;
CITATION = ZEPYA,C43,75;
CITATION = HEP-PH 9309235;
CITATION = BAPSA,41,902;
CITATION = BAPSA,41,938;
CITATION = PHRVA,D50,6811;
CITATION = PRPLC,65,151;
CITATION = NUPHA,B370,365;
CITATION = PHRVA,D48,5;
CITATION = CPHCB,46,43;
EXPERIMENT = FNAL-E-0864;
PPFIN-ACCT = LIRYG;
DESY-KEYWORDS = data analysis method;
DESY-KEYWORDS = hadron hadron, interaction;
DESY-KEYWORDS = anti-p p, annihilation;
DESY-KEYWORDS = multiple production;
DESY-KEYWORDS = charged particle, hadroproduction;
DESY-KEYWORDS = photon, associated production;
DESY-KEYWORDS = pi, charged particle;
DESY-KEYWORDS = pi0;
DESY-KEYWORDS = multiplicity, moment;
DESY-KEYWORDS = symmetry, chiral;
DESY-KEYWORDS = critical phenomena;
DESY-KEYWORDS = pi, condensation;
DESY-KEYWORDS = numerical calculations, Monte Carlo;
DESY-ABS-NUM = D96-20944;
DESY-CLASS-CODE = G;
DESY-CLASS-CODE = D;
DATE-UPDATED = 07/08/2004;
ACCOUNT-UPD = LI.KAL;
DATE-ADDED = 09/18/1996;
ACCT-ADDED = CITES;
TT = Analysis of charged-particle/photon correlations in hadronic multiparticle production.;
BULL = HEPPH-9609375;
URL = ADSABS;
URLDOC = 1997PhRvD..55.5667B;
URL = PHRVA-D;
URLDOC = V55/P05667/;
URL = SLACPUB;
URLDOC = 9865;
TIME-UPD = 11:47:06;
PACS = 13.87.Ce;
PACS = 14.40.Aq;
PACS = 14.70.Bh;
NEW-DESY-CHECK = Preprint - Brooks, T.C. (rec.Sep.96) 35 p.;
CATDATE = 09/20/1996;
CATTIME = 10:44:34;
DOI = 10.1103/PhysRevD.55.5667;
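To make the flat format concrete, here is a minimal Python sketch that reads such a record into a dictionary. It assumes one element per line, a trailing ";" on every line, and that each ASTR line opens a new author block holding the AUTHOR and AFFILIATION lines that follow it. The function name `parse_spires_record` and the output layout are illustrative only and are not part of SPIRES.

```python
def parse_spires_record(text):
    """Parse 'NAME = value;' lines into a dict of lists.

    Assumption: each 'ASTR;' line starts a new author block; AUTHOR and
    AFFILIATION lines that follow belong to the most recent block.
    """
    record = {"ASTR": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(";"):
            line = line[:-1]  # drop the element terminator
        if line == "ASTR":
            record["ASTR"].append({"AUTHOR": [], "AFFILIATION": []})
            continue
        name, _, value = line.partition(" = ")
        if name in ("AUTHOR", "AFFILIATION") and record["ASTR"]:
            record["ASTR"][-1][name].append(value)
        else:
            # every other element is treated as repeatable
            record.setdefault(name, []).append(value)
    return record


sample = """\
IRN = 3418758;
ASTR;
AUTHOR = Oh, S.H.;
AUTHOR = Walker, W.D.;
AFFILIATION = Duke U.;
ASTR;
AUTHOR = Bjorken, J.D.;
AFFILIATION = SLAC;
COL-NOTE = MiniMax Collaboration;
PACS = 13.87.Ce;
PACS = 14.40.Aq;
"""
record = parse_spires_record(sample)
```

Keeping every element as a list avoids having to know in advance which elements are repeatable.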
Notes
-- TravisBrooks - 18 Jun 2007
XML SPIRES Record
<goal_record>
<irn>3418758</irn>
<doc-type>Preprint</doc-type>
<report-num>SLAC-PUB-9865</report-num>
<astr>
<astr1>
<author>Brooks, T.C.</author>
</astr1>
<astr1>
<author>Convery, M.E.</author>
</astr1>
<astr1>
<author>Davis, W.L.</author>
</astr1>
<astr1>
<author>DelSignore, K.W.</author>
</astr1>
<astr1>
<author>Jenkins, T.L.</author>
</astr1>
<astr1>
<author>Kangas, E.</author>
</astr1>
<astr1>
<author>Knepley, M.G.</author>
</astr1>
<astr1>
<author>Kowalski, K.L.</author>
</astr1>
<astr1>
<author>Taylor, C.C.</author>
</astr1>
<affiliation>Case Western Reserve U.</affiliation>
</astr>
<astr>
<astr1>
<author>Oh, S.H.</author>
</astr1>
<astr1>
<author>Walker, W.D.</author>
</astr1>
<affiliation>Duke U.</affiliation>
</astr>
<astr>
<astr1>
<author>Colestock, P.L.</author>
</astr1>
<astr1>
<author>Hanna, B.</author>
</astr1>
<astr1>
<author>Martens, M.</author>
</astr1>
<astr1>
<author>Steets, J.</author>
</astr1>
<affiliation>Fermilab</affiliation>
</astr>
<astr>
<astr1>
<author>Ball, R.</author>
</astr1>
<astr1>
<author>Gustafson, H.R.</author>
</astr1>
<astr1>
<author>Jones, L.W.</author>
</astr1>
<astr1>
<author>Longo, M.J.</author>
</astr1>
<affiliation>Michigan U.</affiliation>
</astr>
<astr>
<astr1>
<author>Bjorken, J.D.</author>
</astr1>
<affiliation>SLAC</affiliation>
</astr>
<astr>
<astr1>
<author>Abashian, A.</author>
</astr1>
<astr1>
<author>Morgan, N.</author>
</astr1>
<affiliation>Virginia Tech.</affiliation>
</astr>
<astr>
<astr1>
<author>Pruneau, C.A.</author>
</astr1>
<affiliation>Wayne State U.</affiliation>
</astr>
<col-note>MiniMax Collaboration</col-note>
<title>Analysis of charged particle / photon correlations in hadronic
multiparticle production</title>
<pub-note>Phys.Rev.D55:5667-5680,1997</pub-note>
<slac-topics>SLAC, there, 09/96</slac-topics>
<date>Sep 1996</date>
<jour-sub>Phys.Rev.D</jour-sub>
<ppf-subject>Experimental, S</ppf-subject>
<p>35</p>
<ppa>9716</ppa>
<ppf>9639</ppf>
<citation>PHLTA,B217,169</citation>
<citation>PHLTA,B266,482</citation>
<citation>IMPAE,A7,4189</citation>
<citation>APPOA,B23,561</citation>
<citation>PHRVA,D46,246</citation>
<citation>HEP-PH 9211282</citation>
<citation>NUPHA,B399,395</citation>
<citation>PHRVA,D51,2482</citation>
<citation>HEP-PH 9411329</citation>
<citation>HEP-PH 9501210</citation>
<citation>PRLTA,72,970</citation>
<citation>RPPHA,58,611</citation>
<citation>JTPLA,33,67</citation>
<citation>APNYA,66,509</citation>
<citation>IMPAE,A2,1447</citation>
<citation>IMPAE,A4,1527</citation>
<citation>MPLAE,A8,2747</citation>
<citation>JTPLA,59,585</citation>
<citation>PHRVA,D49,5805</citation>
<citation>HEP-PH 9503325</citation>
<citation>PHRVA,D9,3113</citation>
<citation>NUIMA,138,241</citation>
<citation>NUIMA,140,533</citation>
<citation>PHLTA,B206,707</citation>
<citation>ZEPYA,C43,75</citation>
<citation>HEP-PH 9309235</citation>
<citation>BAPSA,41,902</citation>
<citation>BAPSA,41,938</citation>
<citation>PHRVA,D50,6811</citation>
<citation>PRPLC,65,151</citation>
<citation>NUPHA,B370,365</citation>
<citation>PHRVA,D48,5</citation>
<citation>CPHCB,46,43</citation>
<experiment>FNAL-E-0864</experiment>
<ppfin-acct>LIRYG</ppfin-acct>
<desy-keywords>data analysis method</desy-keywords>
<desy-keywords>hadron hadron, interaction</desy-keywords>
<desy-keywords>anti-p p, annihilation</desy-keywords>
<desy-keywords>multiple production</desy-keywords>
<desy-keywords>charged particle, hadroproduction</desy-keywords>
<desy-keywords>photon, associated production</desy-keywords>
<desy-keywords>pi, charged particle</desy-keywords>
<desy-keywords>pi0</desy-keywords>
<desy-keywords>multiplicity, moment</desy-keywords>
<desy-keywords>symmetry, chiral</desy-keywords>
<desy-keywords>critical phenomena</desy-keywords>
<desy-keywords>pi, condensation</desy-keywords>
<desy-keywords>numerical calculations, Monte Carlo</desy-keywords>
<desy-abs-num>D96-20944</desy-abs-num>
<desy-class-code>G</desy-class-code>
<desy-class-code>D</desy-class-code>
<date-updated>07/08/2004</date-updated>
<account-upd>LI.KAL</account-upd>
<date-added>09/18/1996</date-added>
<acct-added>CITES</acct-added>
<tt>Analysis of charged-particle/photon correlations in hadronic multiparticle production.</tt>
<bull>HEPPH-9609375</bull>
<url-str>
<url>ADSABS</url>
<urldoc>1997PhRvD..55.5667B</urldoc>
</url-str>
<url-str>
<url>PHRVA-D</url>
<urldoc>V55/P05667/</urldoc>
</url-str>
<url-str>
<url>SLACPUB</url>
<urldoc>9865</urldoc>
</url-str>
<time-upd>11:47:06</time-upd>
<pacs>13.87.Ce</pacs>
<pacs>14.40.Aq</pacs>
<pacs>14.70.Bh</pacs>
<new-desy-check>Preprint - Brooks, T.C. (rec.Sep.96) 35 p.</new-desy-check>
<catdate>09/20/1996</catdate>
<cattime>10:44:34</cattime>
<doi>10.1103/PhysRevD.55.5667</doi>
</goal_record>
- This is just the straightforward XML translation of the above record. The only oddity is the extra "astr1" element surrounding each author element, which makes this a bit noisy. This is trivial for us to generate.
- Note to cognoscenti: I did this by issuing set format $genform.xml while displaying a hep record. This generates a format definition, which I stored in formats as LI.TCB.PREPRINT.XML.
- This does not include calculated elements or information from associated databases (abstracts/pre-url/others), both of which eventually need to be considered.
- We should use this record as a starting point for SPIRES output, devise translations to import it into Invenio, then see what else we need to include in this record or as subsidiary DBs.
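As a rough illustration of how an importer might cope with the extra "astr1" level, the following Python sketch pairs each author with the affiliations of its enclosing "astr" block. It uses only the standard library; the function name and the shortened sample record are made up for this example.

```python
import xml.etree.ElementTree as ET


def authors_with_affiliations(goal_record_xml):
    """Return [(author, [affiliations])] in record order."""
    root = ET.fromstring(goal_record_xml)
    pairs = []
    for astr in root.findall("astr"):
        affiliations = [a.text.strip() for a in astr.findall("affiliation") if a.text]
        # the astr1 wrapper carries no information of its own, so step through it
        for author in astr.findall("astr1/author"):
            pairs.append((author.text.strip(), affiliations))
    return pairs


sample_xml = """
<goal_record>
 <irn>3418758</irn>
 <astr>
  <astr1><author>Oh, S.H.</author></astr1>
  <astr1><author>Walker, W.D.</author></astr1>
  <affiliation>Duke U.</affiliation>
 </astr>
 <astr>
  <astr1><author>Bjorken, J.D.</author></astr1>
  <affiliation>SLAC</affiliation>
 </astr>
</goal_record>
"""
pairs = authors_with_affiliations(sample_xml)
```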
Explanation of Non-intuitive SPIRES fields
- SLAC-Topics - if the document involved a SLAC author, this is filled in to help us track special things about these papers. This probably should have no analog in Inspire, but instead should be moved to our local repository or a connected local DB.
- PPF-Subject - a very brief subject classification including subject info as well as "S" => Published. Now deprecated in favor of field code (FC, 16 values) and type code (TC, 8 values), but not all records have been translated yet. The values of FC and TC are available on our wiki and I can paste them in here eventually...
- P - pages (number of pages in the paper; usually we use the eprint version, which may be substantially different from the published version)
- ppf - the YYWW (i.e. last 2 digits of the year, and the week number, 1-52) when the paper was input. Here "input" means that all human-assisted checks have been done. The appearance of this number serves as a sign that the record does not need these checks unless something else happens to it, and also serves as a workflow management tool (stats on papers input/remaining, etc.). There is also an email alerting service based on sending out records after input, but this is relatively unimportant.
- ppa - the YYWW that the paper obtained published information (i.e. it was a preprint, but now it is published). This generates another alerting service, but other than that, it is unimportant.
- DESY-CLASS-CODE - This is another category coding that will eventually be rolled into FC.
- date/acct upd/add - these 4 elements track the most recent date and account that touched the record, as well as the original account and time for the record's creation
- cattime/date - these two elements record the date and time of the human checks (the "input" described above). Thus catdate and ppf carry redundant information.
- NEW-DESY-CHECK - standardized rep nr + inst + date + date received + pages. to be matched against R? to be discontinued? (Annette)
- TT - TeX title. This is the title as written by the author upon submission to arXiv.org. It often contains TeX markup. (TT also contains title variants from DESY. To be discontinued. Need to define clear common rules (Annette).) The regular title is modified by SPIRES in an attempt to standardise various words, abbreviations, and symbols to make things more readable and consistent. We actually need 3 titles:
- a title that creates search terms (including words from old titles, common spelling changes, acronyms, etc.)
- a title that represents the way the author put it on arXiv/journal/etc
- a title that displays symbols correctly
- DESY-PUB-NOTE - since 2001 book info that does not fit into PBN or CPBN - where should it go? Before 2001 all pub info from DESY - needs to be merged into PBN + CPBN. (Annette)
- DESY-CHECK - complete bibl info from DESY used till 1996 (consisting of DPBN + NDCK). Need to check whether all info has been merged correctly into PBN, CPBN, R? (Annette)
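The YYWW coding used by ppf and ppa can be unpacked with a few lines of Python. This is only a sketch: the century pivot (two-digit years of 50 and above are read as 19xx) is an assumption made here, not something SPIRES defines, and the week number is returned as stored without mapping it to calendar dates.

```python
def parse_yyww(code, pivot=50):
    """Split a four-digit YYWW code into (year, week).

    Assumption: years >= pivot belong to the 1900s, the rest to the 2000s.
    """
    if len(code) != 4 or not code.isdigit():
        raise ValueError("expected four digits, got %r" % code)
    yy, ww = int(code[:2]), int(code[2:])
    year = 1900 + yy if yy >= pivot else 2000 + yy
    return year, ww


ppf = parse_yyww("9639")  # PPF of the sample record
ppa = parse_yyww("9716")  # PPA of the sample record
```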
Comprehensive SPIRES Record
Rather than retype fields in the table below (I started...) I will include below the full set of SPIRES elements in its own table - this is from the internal record documentation, modified slightly by me to fit in the table
- Elem type - Fixed/Optional/Required/Virtual
- opt means that it may or may not occur
- fix or req means it must occur
- vir means it is calculated at display time from other values in the record or elsewhere in the database(s) - these are moved to the end in a separate table unless they are in a structure in this main table
- Occ - R (repeatable) / NR (non-repeatable)
- Element - the name of the element. Some elements are in structures that can be repeated. The structure is listed as an element, then its sub-elements are listed with ":" in front so you can see they belong to the structure listed above. In 2 cases (circ and abstracts) there are virtual links to all elements available from other databases (abstracts and circ...).
- Status - O (obsolete), D (deprecated), or S (suspicious: there is probably a better way, but it is still in current use). All others are current.
- Notes - most notes are in the mapping table below, but some fields only appear here.
Bibliographic elements
| Elem type (see above) | Occ | Element | Status | Notes |
| Fix | NR | IRN | | |
| Fix | NR | DOC-TYPE | D | All records have this, but I don't know that it is used or maintained properly... |
| Opt | R | REPORT-NUM | | |
| Opt | R | Structure: ASTR | | |
| Opt | R | Structure: :ASTR1 | | |
| Req | NR | ::AUTHOR | | |
| Opt | NR | ::DESY-AUTHOR | S | authors containing umlauts from DESY (Annette) |
| Vir | NR | ::AUTHOR-SORT | S | Calculated trivially from author; easy to do in other ways |
| Opt | R | :AFFILIATION | | These must match authority file, institutions database |
| Vir | R | :COUNTRY | | Calculated by looking up country information in inst. db |
| Vir | R | :DESYLOOKUP | | Alt way of writing inst name calc from inst. db |
| Opt | R | CORP-AUTHOR | | allows an entity to write paper, not personal name |
| Opt | R | COL-NOTE | | |
| Opt | NR | TITLE | | |
| Opt | R | PUB-NOTE | | |
| Opt | R | EXTRA-PUB-NOTE | O | |
| Opt | R | SLAC-TOPICS | S | SLAC coding for admin reasons |
| Opt | R | DATE | | |
| Opt | NR | JOUR-SUB | | |
| Opt | NR | LANGUAGE | | Language it is written in |
| Opt | NR | PPF-SUBJECT | D | ready to be translated to FC/TC |
| Opt | NR | HOLDINGS | S | This is important...do we link to local holdings catalogs? |
| Opt | NR | AV | D | Is it on Microfiche...? |
| Opt | NR | P | | pages |
| Opt | R | CITATION | | |
| Opt | R | MEETING-NOTE | S | This currently contains conf. info, but should rather be part of a conf structure looking up auth. file |
| Opt | R | NOTE | | Free form notes for display |
| Opt | R | REPORT-CANCEL | | To allow a non-displaying record |
| Opt | NR | TRANS-NOTE | O?? | ??? |
| Opt | R | SUBJECT-HEADING | O? | ?? |
| Opt | R | LIB-NEWS-CLASS | O | Library subject codes |
| Opt | R | Structure: CPN | | Conf. info should replace Meeting Note and cnum |
| Req | NR | :CONF-PUB-NOTE | | |
| Opt | NR | :CALL-NUM | D? | Call num of proceedings |
| Opt | R | :CPBNX | | |
| Opt | NR | :CONF-PUB-JOUR | | |
| Opt | R | EXPERIMENT | | Authority file: experiments database |
| Opt | R | TITLE-CHANGE-J | D (use old title instead) | Searched with title |
| Opt | R | DESY-PUB-NOTE | S | Should be combined with PBN |
| Opt | R | DESY-KEYWORDS | | |
| Opt | R | SLAC-EXPERIMENT | O? | |
| Opt | R | DESY-CLASS-CODE | O | to be mapped into FC (Annette) |
| Opt | NR | DESY-CHECK | O | bibl info from DESY till 1996 (Annette) |
| Opt | R | NEW-DESY-CHECK | S | preprint bibl info from DESY since 1997 (Annette) |
| Opt | R | CALTECH-TAG | O? | ?? |
| Opt | R | ENERGYRANGE-CODE | O? | one-digit code for energy range of reactions (Annette) |
| Opt | R | SLAC-DETECTOR | O? | ?? |
| Opt | R | TITLE-VARIANT | S | Searched with title, used for acronyms |
| Opt | R | OTHER-AUTHOR | S | Searched with author |
| Opt | R | TT | | TeX or arXiv title or DESY title variant |
| Opt | R | BULL | | arXiv number |
| Opt | R | DESYR | S | standardized rep nr from DESY - to be discontinued? (Annette) |
| Opt | R | Structure: URL-STR | | |
| Req | NR | :URL | | Key to lookup in URL list (pre-url) |
| Opt | R | :URLDOC | | Record specific piece of URL |
| Opt | R | :URLNOTE | O | ??? |
| Vir | NR | :TRUE-URL | | Calculated from URL, URLDOC, and info from pre-url file -> the actual location |
| Opt | R | PR-STATUS | O? | ??? |
| Opt | R | PACS | | |
| Opt | R | TOPCIT | S | Divides recs into broad citation categories for searching |
| Opt | R | CONF-NUMBER | S | key to conferences file, should be part of cpn... |
| Opt | R | OLD-TITLE | | Searched with title, used for title changes |
| Opt | R | CERNKEY | | |
| Opt | R | FIELD-CODE | | |
| Opt | R | TYPE-CODE | | |
| Opt | R | FREE-KEYWORDS | | Used for non-controlled vocab keywords (i.e. author supplied) |
| Opt | R | DOI | | |
| Opt | R | TEXKEY | | key of the record in LaTeX cite format; not yet used, but important for LaTeX users |
Workflow elements
| Elem type (see above) | Occ | Element | Status | Notes |
| Opt | R | DATE-RECEIVED | O | used only if DATE unknown - basically equivalent to DATE-ADDED? (Annette) |
| Opt | R | SLAC-DIST | O | SLAC-specific coding |
| Opt | NR | PPA | | YYWW of journal addition |
| Opt | R | PPF | | YYWW of checking of record |
| Opt | R | PPFIN-ACCT | | person who checked record |
| Opt | R | DESY-ABS-NUM | | DESY's unique document id (after keywording) (Annette) |
| Opt | R | PDGSC | | ??? |
| Opt | R | SLAC-DATE | D | Date of SLAC registration of record (used in SLAC file instead) |
| Opt | R | HIDDEN-NOTE | | Internal notes |
| Opt | R | CATDATE | | Date of checking |
| Opt | R | CATTIME | | Time of checking |
| Opt | R | TIME-UPD | | |
| Opt | R | DATE-ADDED | | |
| Opt | R | ACCT-ADDED | | |
| Opt | R | OLDPPA | | ??? |
| Opt | R | XDRN | | DESY mark/ref nr, removed when keyworded, useful for checking hep relevance (Annette) |
| Opt | R | DATE-UPDATED | | |
| Opt | R | ACCOUNT-UPD | | |
| Opt | R | Structure: ORDER-STR | O | Ordering materials??? |
| Req | NR | :SOURCE | | |
| Opt | R | :ORDER-DATE | | |
| Opt | R | :REQUESTER | | |
| Opt | R | :ORDER-NOTE | | |
| Opt | R | :COST | | |
| Opt | R | :ORDER-PLACED-BY | | |
Virtual (calculated) elements
| Elem type (see above) | Occ | Element | Status | Notes |
| Vir | NR | DATE-SORT | | falls through date, dateadded, etc. until it gets a date - guarantees a date and puts it in numeric sortable form |
| Vir | NR | JOURNAL-YEAR | | year of jour pub (from pbn or dpbn) |
| Vir | NR | REPORTNO-SORT | | |
| Vir | NR | SLAC-REPORTNO | D | rept nums that begin with SLAC |
| Vir | NR | PUB-CATEGORY | | ?? |
| Vir | NR | FIRST-AUTHOR | | |
| Vir | NR | FIRST-AUTHORSORT | | |
| Vir | NR | JINDEX | | jnl + vol (Annette) |
| Vir | R | LA-URL | | arXiv url (Annette) |
| Vir | R | Structure: phan CIRC-STR (Subfile CIRC) | | Access to the circulation records for this object at SLAC |
| Vir | R | SLAC-EXP | | ?? |
| Vir | R | LANL-NUMBER | | arXiv number in real form (bull is currently screwed up, not the actual form arXiv used, fixing this RSN...) |
| Vir | R | ECONF-TITLE | D | econf related stuff |
| Vir | NR | YEAR | | from date |
| Vir | R | SPICITE | D | the way it might appear in our citations, calculated from PBN; should use MYCITE instead (which relies on this, but only internally) |
| Vir | R | SPICITE2 | D | ?? |
| Vir | R | SPICITE3 | D | Same as spicite, but from dpbn |
| Vir | R | BBADDRESS | O | holdover from when we munged the arXiv ids |
| Vir | R | BBDESCRIP | O | ditto |
| Vir | R | CITECODEN | O? | ??? |
| Vir | R | GETCONF | S | Fetches meeting info from conferences using CNUM; not used as it should be |
| Vir | NR | ALL-JOUR | S | similar to spicite, but includes year and combines pbn and dpbn |
| Vir | NR | PBN_DPBN_SORT | D | |
| Vir | NR | ALL-JOURTITLE | S | |
| Vir | NR | JOURNAL-PAGE | | Page of journal, from spicite/pbn/dpbn |
| Vir | R | SPICITE4 | | ??? |
| Vir | NR | BOX | S | SLAC specific location in storage (from "Storage") |
| Vir | NR | AUTHCOUNT | | number of authors - convenient, from Authors |
| Vir | NR | CITECOUNT | | number of papers citing this one; complicated, but very important calculation, from citation of other records |
| Vir | NR | DISPJ | | preferred way to display journal information |
| Vir | NR | MYCITE | | The various objects this paper might be cited by in (SPIRES) ref. lists, from spicite, lanl, report, spicite3 |
| Vir | NR | CITEFORM | | the various ways that the above objects might appear (i.e. with or without a volume letter, etc.) |
| Vir | R | MYCITES | S | Mycite, broken up into repeating elements. |
| Vir | R | PRIMARCH | | The main arXiv category for the paper (not crosslisted) - from abstracts file |
| Vir | R | ARCH | | arXiv categories incl cross listings (Annette) |
| Vir | R | Structure: phan ABSTRACTS (Subfile ABSTRACTS) | | access to the abstracts file which includes arXiv information |
| Vir | R | BOTHREFS | | The references of this paper, listed as both eprint refs and journals (easier to check for dupes/accuracy/etc) |
Invenio
Record structure documentation
Record structure examples
1. For any record in CDS, the internal MARCXML format is available as an output format option via the detailed record page. For example, search for hep-th/0102003, click on the "proposed detailed record" link, then on the MARCXML output format in order to inspect it. This functionality is available for any record in CDS.
2. To describe MARC markup, one usually uses two notations:
a) human-friendly:
100 $a Ellis, John $e editor
b) machine-friendly (MARCXML):
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Ellis, John</subfield>
<subfield code="e">editor</subfield>
</datafield>
In the examples below I'll use the MARCXML notation.
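The two notations carry the same information, which the following Python sketch tries to show by turning the human-friendly form into a MARCXML datafield. It assumes the first token is the three-digit tag optionally followed by the two indicators (as in "65017"), and that subfield values never contain a "$"; the function name is invented for this example.

```python
import re
import xml.etree.ElementTree as ET


def marc_line_to_datafield(line):
    """Turn '100 $a Ellis, John $e editor' into a <datafield> element."""
    token, _, rest = line.strip().partition(" ")
    tag = token[:3]
    ind1 = token[3:4] or " "  # blank indicator when not given
    ind2 = token[4:5] or " "
    datafield = ET.Element("datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    # each subfield is '$' + one-character code + value up to the next '$'
    for code, value in re.findall(r"\$(\w)\s*([^$]*)", rest):
        subfield = ET.SubElement(datafield, "subfield", {"code": code})
        subfield.text = value.strip()
    return datafield


author = marc_line_to_datafield("100 $a Ellis, John $e editor")
pacs = marc_line_to_datafield("65017 $2 PACS $a 13.87.Ce")
author_xml = ET.tostring(author, encoding="unicode")
```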
3. The above SPIRES test record in MARCXML would be:
<record>
<!-- IRN = 3418758; -->
<controlfield tag="001">3418758</controlfield>
<!-- DOC-TYPE = Preprint; -->
<datafield tag="690" ind1="C" ind2=" ">
<subfield code="a">PREPRINT</subfield>
</datafield>
<!-- REPORT-NUM = SLAC-PUB-9865; -->
<datafield tag="037" ind1=" " ind2=" ">
<subfield code="a">SLAC-PUB-9865</subfield>
</datafield>
<!-- ASTR;
AUTHOR = Brooks, T.C.;
AUTHOR = Convery, M.E.;
AUTHOR = Davis, W.L.;
AUTHOR = DelSignore, K.W.;
AUTHOR = Jenkins, T.L.;
AUTHOR = Kangas, E.;
AUTHOR = Knepley, M.G.;
AUTHOR = Kowalski, K.L.;
AUTHOR = Taylor, C.C.;
AFFILIATION = Case Western Reserve U.;
ASTR;
AUTHOR = Oh, S.H.;
AUTHOR = Walker, W.D.;
AFFILIATION = Duke U.;
ASTR;
AUTHOR = Colestock, P.L.;
AUTHOR = Hanna, B.;
AUTHOR = Martens, M.;
AUTHOR = Steets, J.;
AFFILIATION = Fermilab;
ASTR;
AUTHOR = Ball, R.;
AUTHOR = Gustafson, H.R.;
AUTHOR = Jones, L.W.;
AUTHOR = Longo, M.J.;
AFFILIATION = Michigan U.;
ASTR;
AUTHOR = Bjorken, J.D.;
AFFILIATION = SLAC;
ASTR;
AUTHOR = Abashian, A.;
AUTHOR = Morgan, N.;
AFFILIATION = Virginia Tech.;
ASTR;
AUTHOR = Pruneau, C.A.;
AFFILIATION = Wayne State U.;
-->
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Brooks, T.C.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Convery, M.E.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Davis, W.L.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">DelSignore, K.W.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Jenkins, T.L.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Kangas, E.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Knepley, M.G.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Kowalski, K.L.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Taylor, C.C.</subfield>
<subfield code="u">Case Western Reserve U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Oh, S.H.</subfield>
<subfield code="u">Duke U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Walker, W.D.</subfield>
<subfield code="u">Duke U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Colestock, P.L.</subfield>
<subfield code="u">Fermilab</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Hanna, B.</subfield>
<subfield code="u">Fermilab</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Martens, M.</subfield>
<subfield code="u">Fermilab</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Steets, J.</subfield>
<subfield code="u">Fermilab</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Ball, R.</subfield>
<subfield code="u">Michigan U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Gustafson, H.R.</subfield>
<subfield code="u">Michigan U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Jones, L.W.</subfield>
<subfield code="u">Michigan U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Longo, M.J.</subfield>
<subfield code="u">Michigan U.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Bjorken, J.D.</subfield>
<subfield code="u">SLAC</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Abashian, A.</subfield>
<subfield code="u">Virginia Tech.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Morgan, N.</subfield>
<subfield code="u">Virginia Tech.</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Pruneau, C.A.</subfield>
<subfield code="u">Wayne State U.</subfield>
</datafield>
<!-- COL-NOTE = MiniMax Collaboration; -->
<datafield tag="710" ind1=" " ind2=" ">
<subfield code="g">MiniMax Collaboration</subfield>
</datafield>
<!-- TITLE = Analysis of charged particle / photon correlations in hadronic multiparticle production; -->
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Analysis of charged particle / photon correlations in hadronic multiparticle production</subfield>
</datafield>
<!-- PUB-NOTE = Phys.Rev.D55:5667-5680,1997; -->
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.1103/PhysRevD.55.5667</subfield>
<subfield code="c">5667-5680</subfield>
<subfield code="p">Phys. Rev. D</subfield>
<subfield code="v">55</subfield>
<subfield code="y">1997</subfield>
</datafield>
<!-- SLAC-TOPICS = SLAC, there, 09/96; -->
FIXME: what is this for?
<!-- DATE = Sep 1996; -->
<datafield tag="269" ind1=" " ind2=" ">
<subfield code="c">1996-09-00</subfield>
</datafield>
<!-- JOUR-SUB = Phys.Rev.D; -->
NOTE: we use 773 like for PUB-NOTE; if there is no volume/page
information, it means it was submitted to that particular journal.
<!-- PPF-SUBJECT = Experimental, S; -->
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">Experimental, S</subfield>
</datafield>
FIXME: what is S?
<!--
P = 35;
PPA = 9716;
PPF = 9639;
-->
FIXME: what is P, PPA, PPF?
<!--
CITATION = PHLTA,B217,169;
CITATION = PHLTA,B266,482;
CITATION = IMPAE,A7,4189;
CITATION = APPOA,B23,561;
CITATION = PHRVA,D46,246;
CITATION = HEP-PH 9211282;
CITATION = NUPHA,B399,395;
CITATION = PHRVA,D51,2482;
CITATION = HEP-PH 9411329;
CITATION = HEP-PH 9501210;
CITATION = PRLTA,72,970;
CITATION = RPPHA,58,611;
CITATION = JTPLA,33,67;
CITATION = APNYA,66,509;
CITATION = IMPAE,A2,1447;
CITATION = IMPAE,A4,1527;
CITATION = MPLAE,A8,2747;
CITATION = JTPLA,59,585;
CITATION = PHRVA,D49,5805;
CITATION = HEP-PH 9503325;
CITATION = PHRVA,D9,3113;
CITATION = NUIMA,138,241;
CITATION = NUIMA,140,533;
CITATION = PHLTA,B206,707;
CITATION = ZEPYA,C43,75;
CITATION = HEP-PH 9309235;
CITATION = BAPSA,41,902;
CITATION = BAPSA,41,938;
CITATION = PHRVA,D50,6811;
CITATION = PRPLC,65,151;
CITATION = NUPHA,B370,365;
CITATION = PHRVA,D48,5;
CITATION = CPHCB,46,43;
-->
<datafield tag="999" ind1="C" ind2="5">
<subfield code="m">E. Witten</subfield>
<subfield code="p">253</subfield>
<subfield code="t">Adv. Theor. Math. Phys.</subfield>
<subfield code="v">2</subfield>
<subfield code="y">1998</subfield>
</datafield>
[...]
<!-- EXPERIMENT = FNAL-E-0864; -->
<datafield tag="693" ind1=" " ind2=" ">
<subfield code="e">FNAL-E-0864</subfield>
</datafield>
<!-- PPFIN-ACCT = LIRYG; -->
<datafield tag="693" ind1=" " ind2=" ">
<subfield code="a">LIRYG</subfield>
</datafield>
<!--
DESY-KEYWORDS = data analysis method;
DESY-KEYWORDS = hadron hadron, interaction;
DESY-KEYWORDS = anti-p p, annihilation;
DESY-KEYWORDS = multiple production;
DESY-KEYWORDS = charged particle, hadroproduction;
DESY-KEYWORDS = photon, associated production;
DESY-KEYWORDS = pi, charged particle;
DESY-KEYWORDS = pi0;
DESY-KEYWORDS = multiplicity, moment;
DESY-KEYWORDS = symmetry, chiral;
DESY-KEYWORDS = critical phenomena;
DESY-KEYWORDS = pi, condensation;
DESY-KEYWORDS = numerical calculations, Monte Carlo;
-->
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">DESY</subfield>
<subfield code="a">data analysis method</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="9">DESY</subfield>
<subfield code="a">hadron hadron, interaction</subfield>
</datafield>
[...]
<!-- DESY-ABS-NUM = D96-20944; -->
<datafield tag="088" ind1=" " ind2=" ">
<subfield code="9">DESY</subfield>
<subfield code="a">D96-20944</subfield>
</datafield>
<!--
DESY-CLASS-CODE = G;
DESY-CLASS-CODE = D;
-->
FIXME: what are these codes?
<!-- DATE-UPDATED = 07/08/2004; -->
FIXME: is this the date of the metadata/author/fulltext update?
<!-- ACCOUNT-UPD = LI.KAL; -->
FIXME: username who updated record? or who has rights to do so?
<!-- DATE-ADDED = 09/18/1996; -->
<datafield tag="961" ind1=" " ind2=" ">
<subfield code="c">20070121</subfield>
<subfield code="x">19960918</subfield>
</datafield>
NOTE: creation ($x) and modification ($c) dates are also usually
stored elsewhere (bibrec), not in metadata
<!-- ACCT-ADDED = CITES; -->
FIXME: username who created record? (tag 859 in this case). Or expresses other rights?
<!-- TT = Analysis of charged-particle/photon correlations in hadronic multiparticle production.; -->
FIXME: what is TT?
<!-- BULL = HEPPH-9609375; -->
<datafield tag="037" ind1=" " ind2=" ">
<subfield code="a">hep-ph/9609375</subfield>
</datafield>
<!--
URL = ADSABS;
URLDOC = 1997PhRvD..55.5667B;
URL = PHRVA-D;
URLDOC = V55/P05667/;
URL = SLACPUB;
URLDOC = 9865;
-->
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="u">http://foo/bar</subfield>
<subfield code="y">Fulltext in ADS</subfield>
</datafield>
NOTE: Invenio can either store external links, or store just external IDs and
create links dynamically based on some rules and knowledge bases.
<!-- TIME-UPD = 11:47:06; -->
Note: see DATE-UPDATED, Invenio stores dates with granularity of seconds
<!--
PACS = 13.87.Ce;
PACS = 14.40.Aq;
PACS = 14.70.Bh;
-->
<datafield tag="650" ind1="1" ind2="7">
<subfield code="2">PACS</subfield>
<subfield code="a">13.87.Ce</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="2">PACS</subfield>
<subfield code="a">14.40.Aq</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="2">PACS</subfield>
<subfield code="a">14.70.Bh</subfield>
</datafield>
<!-- NEW-DESY-CHECK = Preprint - Brooks, T.C. (rec.Sep.96) 35 p.; -->
FIXME: what is NEW-DESY-CHECK?
<!--
CATDATE = 09/20/1996;
CATTIME = 10:44:34;
-->
FIXME: how does this differ from DATE-UPDATED and ACCOUNT-UPD?
<!-- DOI = 10.1103/PhysRevD.55.5667; -->
Note: usually stored with the publication reference in 773, see above.
</record>
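The author part of the record above follows a simple rule: the very first author goes into 100, every later author into 700, and each author inherits the affiliation(s) of its ASTR block as $u. A minimal Python sketch of that rule, assuming the ASTR blocks have already been read into (authors, affiliations) pairs, could look like this; the function name and input layout are illustrative.

```python
import xml.etree.ElementTree as ET


def author_datafields(astr_groups):
    """Build 100/700 datafields from [(authors, affiliations), ...]."""
    fields = []
    for authors, affiliations in astr_groups:
        for name in authors:
            tag = "100" if not fields else "700"  # only the first author gets 100
            field = ET.Element("datafield", {"tag": tag, "ind1": " ", "ind2": " "})
            ET.SubElement(field, "subfield", {"code": "a"}).text = name
            for affiliation in affiliations:
                ET.SubElement(field, "subfield", {"code": "u"}).text = affiliation
            fields.append(field)
    return fields


groups = [
    (["Oh, S.H.", "Walker, W.D."], ["Duke U."]),
    (["Bjorken, J.D."], ["SLAC"]),
]
fields = author_datafields(groups)
```

Note that the grouping by affiliation is lost in the output; it survives only as the repeated $u value.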
An exhaustive record example
FIXME. But see the Invenio markup documentation links.
Mapping of SPIRES and Invenio bibliographic fields
| SPIRES | Invenio | Notes | SPIRES O(bsolete)/D(eprecated) | SPIRES Example |
| IRN (NR) | 001 | record ID | | 7199236 |
| DOC-TYPE (NR) | 980 | collection indicator | D | Preprint |
| REPORT-NUM | 037 | we also use 088 to store additional report numbers | | SLAC-REPRINT-1999-091 |
| ASTR | 100 or 700 | first author into 100, additional authors into 700 (ASTR is the name of a block of authors with the same affiliations; it has no value itself) | | |
| AUTHOR | 100/700 $a | author name | | Brooks, Travis C. |
| DESY-AUTHOR (NR) | ??? | alternate author name for non-ascii chars | | AUTHOR = Stohr, J.; DESY-AUTHOR = Stoehr, J.; |
| AFFILIATION | 100/700 $u | author affiliations (repeatable) | | St. Petersburg, INP |
| CORP-AUTHOR | | allows author to be entity rather than name | D??? | IRN = 4719409; CORP-AUTHOR = NIKHEF, Amsterdam; |
| COL-NOTE | 710 | collaboration | | L3 Collaboration |
| TITLE | 245 | title | | Recalculation of proton Compton scattering in perturbative QCD |
| PUB-NOTE | 773 | publication reference | | Phys.Rev.D61:032003,2000 |
| SLAC-TOPICS | | SLAC "Keywords" for SLAC documents | D | |
| DATE | 269 | imprint | | Apr 2001 (actually stored in internal format as 20010400) |
| JOUR-SUB | 773 | we use 773 like for PUB-NOTE; if there is no volume/page information, it means it was submitted to that particular journal | | Phys.Rev.D |
| PPF-SUBJECT | 65017 | subject category | D | |
| P (NR) | | number of pages | | |
| CITATION | 999 | references | | |
| EXPERIMENT | 693 $e | experiment | | |
| DESY-KEYWORDS | 6531 $9 DESY | keywords attributed by DESY | | |
| DESY-CLASS-CODE | | ??? | D or O | |
| TT | | arXiv title/TeX title | | |
| BULL | 037 | arXiv report number | | |
| URL | 8564 | external fulltext links; note that Invenio can either store external links, or store just external IDs and create links dynamically based on some rules and knowledge bases. | | |
| TIME-UPD | 961, but usually stored elsewhere | see DATE-UPDATED, Invenio stores dates with granularity of seconds | | |
| PACS | 65017 | PACS subject categories; various subject categories can be stored in 650 | | |
| DOI | 773 $a | DOI is stored with the publication reference | | |
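Several rows of the mapping need the value reshaped, not just moved to another tag. The Python sketch below illustrates three such conversions under simplifying assumptions: PUB-NOTE always has the form journal+volume:pages,year; DATE is always "Mon YYYY"; and the BULL archive prefix is translated through a small lookup table that lists only the one archive seen in the sample record. All function names are invented here, and journal names are left in SPIRES spelling ("Phys.Rev.D").

```python
import re

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Assumption: only the archive from the sample record; extend as needed.
ARCHIVES = {"HEPPH": "hep-ph"}


def split_pub_note(pub_note):
    """'Phys.Rev.D55:5667-5680,1997' -> 773 subfields p, v, c, y."""
    match = re.match(r"^(.*?)(\d+):([\d-]+),(\d{4})$", pub_note)
    if match is None:
        raise ValueError("unrecognised PUB-NOTE: %r" % pub_note)
    journal, volume, pages, year = match.groups()
    return {"p": journal, "v": volume, "c": pages, "y": year}


def spires_date_to_marc(date):
    """'Sep 1996' -> '1996-09-00' (day unknown, so left as 00)."""
    month, year = date.split()
    return "%s-%02d-00" % (year, MONTHS[month])


def bull_to_arxiv(bull):
    """'HEPPH-9609375' -> 'hep-ph/9609375'."""
    match = re.match(r"^([A-Z]+)-(\d{7})$", bull)
    if match is None:
        raise ValueError("unrecognised BULL: %r" % bull)
    return "%s/%s" % (ARCHIVES[match.group(1)], match.group(2))


pub = split_pub_note("Phys.Rev.D55:5667-5680,1997")
date = spires_date_to_marc("Sep 1996")
eprint = bull_to_arxiv("HEPPH-9609375")
```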
SPIRES-to-Invenio Record Conversion Tools
A SPIRES2MARC.xsl stylesheet is available in the DevelopmentInspireCodeRepository repository under the bibconvert directory. See the README file located there. An example of usage:
$ ls -l spires.xml # check that the records dumped from SPIRES are in 'spires.xml'
$ bibconvert -c SPIRES2MARC.xsl < spires.xml > marc.xml # convert records to MARCXML
$ xmllint --format marc.xml # inspect nicely formatted MARCXML
$ xmllint --noout marc.xml # check compliance to XML standard
$ xmlmarclint marc.xml # check compliance to MARCXML standard
$ bibupload -ir marc.xml # upload records into Invenio in insert-or-replace mode
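If it is handy to sanity-check the converted output from a script before uploading, something like the following Python sketch can stand in for the well-formedness step. It only checks that the XML parses and counts the record elements; it does not validate MARC content, so it is no substitute for xmlmarclint. The function name and the sample text are made up for this example.

```python
import xml.etree.ElementTree as ET


def count_marc_records(xml_text):
    """Parse MARCXML text and return the number of <record> elements.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)

    def local(tag):
        # strip a '{namespace}' prefix if the file declares one
        return tag.rsplit("}", 1)[-1]

    return sum(1 for element in root.iter() if local(element.tag) == "record")


sample = """
<collection>
 <record><controlfield tag="001">3418758</controlfield></record>
 <record><controlfield tag="001">7199236</controlfield></record>
</collection>
"""
n_records = count_marc_records(sample)
```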