MARC Stoplist
The Marc_Stoplist table defines text that is ignored for filing purposes. Its entries are specific to a tag and subfield. During filing, if the rules require the stoplist to be examined, the beginning of the text is checked against the stoplist entries. If a match is found, the matching text is stripped and the remaining text is used to create the filing key. For instance, a title is examined for ‘The’, ‘A’ and other words, regardless of the setting of the indicator value; a volume field may similarly have stopwords for ‘vol.’, ‘v.’ and ‘volume’, allowing filing to concentrate on the specific part identifier.
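The stripping step described above can be sketched roughly as follows. This is an illustration only, not the system's actual implementation: the `STOPLIST` dictionary, the function names, the case-insensitive comparison and the simplified key normalisation are all assumptions made for the example.

```python
# Hypothetical in-memory stand-in for Marc_Stoplist rows, keyed by (tag, subfield).
# Each entry keeps its trailing space so that "A " does not match "Atlas".
STOPLIST = {
    ("245", "a"): ["THE ", "A ", "AN "],
    ("490", "v"): ["VOLUME ", "VOL. ", "V. "],
}


def strip_stoplist(tag, subfield, text):
    """Remove one leading stoplist entry, if present, and return the rest."""
    entries = STOPLIST.get((tag, subfield), [])
    # Longest entries first: a defensive ordering in case one entry
    # is ever a prefix of another.
    for entry in sorted(entries, key=len, reverse=True):
        # Compare only the start of the text, ignoring case (an assumption).
        if text[:len(entry)].upper() == entry:
            return text[len(entry):]
    return text


def filing_key(tag, subfield, text):
    """Build a simplified filing key: strip the stopword, then normalise."""
    remainder = strip_stoplist(tag, subfield, text)
    # Upper-case and collapse whitespace; real filing rules do far more.
    return " ".join(remainder.upper().split())


print(filing_key("245", "a", "The Hobbit"))
print(filing_key("490", "v", "vol. 12"))
```

Because entries are looked up by tag and subfield, ‘The’ is stripped from a title here but would be left alone in a field that has no stoplist entries.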
The stoplist relates primarily to headings. Some heading displays show titles, for instance, with the stoplist element removed from the beginning of the title and appended to its end.
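As a rough sketch of such a heading display, the function below moves a leading article to the end of the title. The `ARTICLES` list, the function name and the ", The" output format are assumptions for illustration; the punctuation actually used by a heading display may differ.

```python
# Hypothetical subset of stoplist entries for a title field.
ARTICLES = ["THE ", "A ", "AN "]


def heading_display(title):
    """Move a leading article to the end of the title, e.g. 'Hobbit, The'."""
    for entry in sorted(ARTICLES, key=len, reverse=True):
        head = title[:len(entry)]
        if head.upper() == entry:
            rest = title[len(entry):]
            if not rest:
                # Nothing follows the article, so leave the title alone.
                return title
            # Keep the article's original capitalisation when appending it.
            return f"{rest}, {head.strip()}"
    return title


print(heading_display("The Hobbit"))
print(heading_display("Hobbit"))
```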
Note that this stoplist does not refer to stopwords to be ignored in a text retrieval process.
- Note the table of articles and language use below: some languages have articles that also form English words, e.g. Bat, Den, Die, DOS, He, It, Ten, To, Ton. The operator can alter the non-filing indicator dynamically during cataloguing; if a language is not likely to be encountered, its parameters may be suppressed.
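One way to picture the suppression of parameters for languages a library does not hold is a simple filter over the article table. The sketch below is hypothetical: the `ARTICLE_LANGUAGES` mapping copies a few rows from the table, the trailing space stands in for the `^` marker (an assumption about its meaning), and `active_articles` is an invented helper, not a system function.

```python
# A few rows copied from the article table; all names here are illustrative.
ARTICLE_LANGUAGES = {
    "THE ": {"English"},
    "A ": {"English", "Gallegan", "Hungarian", "Portuguese",
           "Romanian", "Scots", "Yiddish"},
    "BAT ": {"Basque"},
    "DIE ": {"Afrikaans", "German", "Yiddish"},
    "DOS ": {"Yiddish"},
    "TEN ": {"Greek"},
}


def active_articles(collection_languages):
    """Return the articles used by at least one language in the collection."""
    wanted = set(collection_languages)
    return sorted(
        article
        for article, languages in ARTICLE_LANGUAGES.items()
        if languages & wanted  # keep the article if any language overlaps
    )


# With only English and German material, BAT, DOS and TEN are dropped,
# so English titles such as "Ten ..." are no longer stripped by mistake.
print(active_articles(["English", "German"]))
```

An article shared with a retained language, such as DIE for German, stays active, so an English title beginning with "Die" would still need the operator to adjust the non-filing indicator.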
Articles and language use
This table indicates the default articles as supplied by AIT (and derived from MARC21 specifications). If languages are not likely to be encountered, then the library may remove the parameter(s) for those particular articles.
Article (PSC_From) | Languages used in |
---|---|
A ^ | English, Gallegan, Hungarian, Portuguese, Romanian, Scots, Yiddish |
A’ ^ | Gaelic |
AL ^ | Romanian |
AL-^ | Arabic, Baluchi, Brahui, Panjabi (Perso-Arabic script), Persian, Turkish, Urdu** |
AM ^ | Gaelic |
AN ^ | English, Gaelic, Irish, Scots, Yiddish |
ANE ^ | Scots |
ANG ^ | Tagalog |
AS ^ | Gallegan, Portuguese |
AZ ^ | Hungarian |
BAT ^ | Basque |
BIR ^ | Turkish |
D’^ | English |
DA ^ | Shetland English |
DAS ^ | German |
DE ^ | Danish, Dutch, English, Friesian, Norwegian (Bokmal), Swedish |
DEI ^ | Norwegian (Nynorsk) |
DEM ^ | German |
DEN ^ | Danish, German, Norwegian, Swedish |
DER ^ | German, Yiddish |
DES ^ | German |
DET ^ | Danish, Norwegian, Swedish |
DI ^ | Yiddish |
DIE ^ | Afrikaans, German, Yiddish |
DOS ^ | Yiddish |
E ^ | Norwegian |
‘E ^ | Friesian |
EEN ^ | Dutch |
EENE ^ | Dutch |
EGY ^ | Hungarian |
EI ^ | Norwegian (Nynorsk) |
EIN ^ | German, Norwegian (Nynorsk), Yiddish |
EINE ^ | German |
EINEM ^ | German |
EINEN ^ | German |
EINER ^ | German |
EINES ^ | German |
EIT ^ | Norwegian (Nynorsk) |
EL ^ | Catalan, Spanish |
EL-^ | Arabic |
ELS ^ | Catalan |
EN ^ | Catalan, Danish, Norwegian (Bokmal), Swedish |
ET ^ | Danish, Norwegian (Bokmal) |
ETT ^ | Swedish |
EYN ^ | Yiddish |
EYNE ^ | Yiddish |
GL’^ | Italian |
GLI ^ | Italian |
HA-^ | Hebrew |
HAI ^ | Greek |
HE ^ | Hawaiian |
HE-^ | Greek |
HEIS ^ | Greek, Modern |
HEN ^ | Greek, Modern |
HENA ^ | Greek, Modern |
HENAS ^ | Greek, Modern |
HET ^ | Dutch |
HIN ^ | Icelandic |
HINA ^ | Icelandic |
HINAR ^ | Icelandic |
HINIR ^ | Icelandic |
HINN ^ | Icelandic |
HINNA ^ | Icelandic |
HINNAR ^ | Icelandic |
HINNI ^ | Icelandic |
HINS ^ | Icelandic |
HINU ^ | Icelandic |
HINUM ^ | Icelandic |
HO ^ | Greek |
HO-^ | Hebrew |
HOI ^ | Greek |
I ^ | Italian |
IH’^ | Provencal |
IL ^ | Italian, Provencal/Langue d’Oc |
IL-^ | Maltese |
IN ^ | Friesian |
IT ^ | Friesian |
KA ^ | Hawaiian |
KE ^ | Hawaiian |
L’^ | Catalan, French, Italian, Provencal/Langue d’Oc |
L-^ | Maltese |
LA ^ | Catalan, Esperanto, French, Italian, Provencal/Langue d’Oc, Spanish |
LAS ^ | Provencal/Langue d’Oc, Spanish |
LE ^ | French, Italian, Provencal/Langue d’Oc |
LES ^ | Catalan, French, Provencal/Langue d’Oc |
LH ^ | Provencal/Langue d’Oc |
LHI ^ | Provencal/Langue d’Oc |
LI ^ | Provencal/Langue d’Oc |
LIS ^ | Provencal/Langue d’Oc |
LO ^ | Italian, Provencal/Langue d’Oc, Spanish |
LOS ^ | Provencal/Langue d’Oc, Spanish |
LOU ^ | Provencal/Langue d’Oc |
LU ^ | Provencal/Langue d’Oc |
MGA ^ | Tagalog |
MIA ^ | Greek, Modern |
‘N ^ | Afrikaans, Dutch |
NA ^ | Gaelic, Hawaiian, Irish |
NJE ^ | Albanian |
NJI ^ | Albanian |
NY ^ | Malagasy |
O ^ | Gallegan, Hawaiian, Portuguese, Romanian |
‘O ^ | Neapolitan |
OS ^ | Portuguese |
‘R ^ | Icelandic |
‘S ^ | German |
SA ^ | Tagalog |
SI ^ | Tagalog |
SINA ^ | Tagalog |
‘T ^ | Dutch, Friesian |
TA ^ | Greek |
TAIS ^ | Greek |
TAS ^ | Greek |
TE ^ | Greek |
TEN ^ | Greek |
TES ^ | Greek |
THE ^ | English |
TO ^ | Greek |
TOIS ^ | Greek |
TON ^ | Greek |
TOU ^ | Greek |
UM ^ | Portuguese |
UMA ^ | Portuguese |
UN ^ | Catalan, French, Italian, Provencal/Langue d’Oc, Romanian, Spanish |
UN’^ | Italian |
UNA ^ | Catalan, Italian, Provencal/Langue d’Oc, Spanish |
UNE ^ | French |
UNEI ^ | Romanian |
UNHA ^ | Gallegan |
UNO ^ | Italian, Provencal/Langue d’Oc |
UNS ^ | Provencal/Langue d’Oc |
UNUI ^ | Romanian |
US ^ | Provencal/Langue d’Oc |
Y ^ | Welsh |
YE ^ | English |
YR ^ | Welsh |
** al- is meant to cover alternate romanizations of the initial article, e.g. as-