Thesaurus Construction & Metadata / LIS9202 – W2027

Next time taught: Winter 2027 Term (preliminary information as of 02/12/2026. To Be Confirmed.) Open to all MLIS students.
Location: FIMS & Nursing Building at the University of Western Ontario, London, CANADA

Course Description:

Theory and practice in indexing and in constructing subject retrieval languages in thesaurus form. Distinguishing between controlled and natural language indexing, and between subject headings and index terms. Applying facet analysis to thesaurus construction. Selected topics in the theory of subject analysis. A significant new component of the course surveys current metadata and linked data initiatives and discusses how various metadata standards support subject access.

Course Content/Readings:
Part I.

Thesaurus Construction
Introduction. Thesaurus: Definitions. Functions. Subject access & retrieval tools. ERIC Thesaurus.
Thesaurus: Types, formats & elements. Building thesauri: vocabulary collection and term extraction
Building thesauri: facets. Facet analysis. Thesaurus software.
Building thesauri: hierarchical relations. Subject headings & index terms
Building thesauri: equivalence and associative relations. Controlled & natural language indexes.
Knowledge Organization trends. Powering Web-Search Systems
Practical workshop.

Sample Readings (Part I)

Ryan, C. (2014) Thesaurus construction guidelines: An introduction to thesauri and guidelines on their construction. Dublin: Royal Irish Academy and National Library of Ireland. ISSN: 2009-6461. DOI: 10.3318/DRI.2014.1
Shiri, A. (2012). Powering Search: the Role of Thesauri in New Information Environments. Medford, NJ: Published on behalf of the American Society for Information Science and Technology by Information Today.
In addition to the seminal interpretive literature (above), we examine and practice the essential parts of a NISO* Standard that offers “guidelines and conventions for the contents, display, construction, testing, maintenance, and management of monolingual controlled vocabularies:” ANSI/NISO Z39.19-2005 (R2010). Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.
- *NISO, the National Information Standards Organization, is a non-profit association accredited by the American National Standards Institute (ANSI). “NISO is where content publishers, libraries, and software developers turn for information industry standards that allow them to work together. Through NISO, all of these communities are able to collaborate on mutually accepted standards—solutions that enable them to better serve their customers, enhancing their operations today and forming a foundation for the future” (according to their website, https://www.niso.org/what-we-do accessed on 10/28/2024).

Part II.

Metadata
Metadata: Definitions. Functions. Typologies. Dublin Core.
Metadata: Encoding standards. Practical issues.
Metadata for: Governmental Resources, Art & Architecture Works, Educational Materials, Geographic Resources. Case studies.

Sample Readings (Part II)

Baca, M. (Ed.). (2016). Introduction to Metadata. Los Angeles: The Getty Research Institute.
Pomerantz, J. (2015). Metadata. Cambridge, Massachusetts: MIT Press.

Part III. Linked Data.

What’s beyond metadata? Semantic Web. Linked Data. Web of Data.
“Library Linked Data in the Cloud” by OCLC.
Future of Metadata. Reflections on inter-connectedness.

Sample Readings (Part III)

Godby, C. J., Wang, S., & Mixter, J. K. (2015). Library Linked Data in the Cloud: OCLC’s Experiments with New Models of Resource Description. Synthesis Lectures on the Semantic Web: Theory and Technology. Morgan & Claypool.

Assignments:

Students will complete three (3) reports and present in the practical workshop. The main class Project will have 5 sub-components that need to be accomplished and submitted in the course of 5-6 weeks.

Report 1 (Analysis of ERIC indexing terms) – 15 %
Mini-Thesaurus Construction Project – 50 % (total, with 5 sub-parts)

See sub-components below:
Part 2.1. Collection of complex subject descriptions & indexable terms (5%)
Part 2.2. Categorization of subject area by facets (10%)
Part 2.3. Demonstration of hierarchies and broad/narrow term relations in subject area (15%)
Part 2.4. Demonstration of associative relations and 2 types of thesaural displays (10%)
Part 2.5. Written analytical Report (2) (10%) and presentation in the class workshop

Report 3 (Metadata analysis & Reflection) – 25%
Participation – 10 %

The course is undergoing considerable review, because new digital environments require new methods to be examined and taught. The original formal course objectives were (excluding the more innovative material in Parts II and III):

To teach students how to analyse the subject of a document, and to translate that expression into a suitable set of index terms.
To expand students’ knowledge of the structure and use of indexing, the principles of thesaurus construction and theoretical topics in subject analysis.
To provide students with an opportunity to practice indexing and thesaurus construction skills.

SAMPLE XML CODE (for Thesaurus Construction)

[code language="xml"]
<?xml version="1.0" encoding="utf-8"?>

<LIS9202 term='Winter 2017' status='completed'>
<Projects>
<Arthuruan_Thesaurus Title='Arthurian Thesaurus'>
<Author> Nicole Zvanovec </Author>
<Date> 2017 </Date>
<Comments> This XML code is shared with the class as a simple example of how you could start building up an XML file containing your thesaurus bibliographic metadata (the description of who created it, when, for what purpose, audience, etc.) as well as your thesaurus main facets (or top terms (TT)), BT, NT, USE, UF relations, and scope notes (SN). </Comments>
<Purpose> The purpose of this thesaurus is to act as a structured vocabulary... </Purpose>
<Audience> The target audience for this thesaurus is undergraduate students... </Audience>
<Main_Facets>
<Agents> This facet contains concepts that may carry out an operation... </Agents>
<Space> This facet contains terms that refer to physical locations and places. </Space>
</Main_Facets>
</Arthuruan_Thesaurus>
<!-- Do you think YOU could create a similar block of XML-encoded metadata for your own thesaurus based on this example? Feel free to start by copying and pasting the lines between bars into an editor (such as Dreamweaver, Notepad++, or FrontPage). You can delete any content between the tags, add new tags and content, or replace what's irrelevant with the right content for your project. -->
</Projects>
</LIS9202>

[/code]
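
The sample above covers the bibliographic metadata and the main facets. As a possible next step, the fragment below sketches how individual term records with SN, UF, BT, NT, RT, and USE relations might be encoded. The element names (Terms, Term, PreferredTerm, EntryTerm) and the facet attribute are hypothetical choices made for illustration, not a required schema, and the Arthurian terms are invented examples rather than content from the student project.

[code language="xml"]
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sketch: element names and terms are illustrative assumptions only. -->
<Terms>
  <!-- A preferred term with its scope note and relations -->
  <Term facet="Agents">
    <PreferredTerm>Knights of the Round Table</PreferredTerm>
    <SN>Use for knights described as members of King Arthur's court.</SN>
    <UF>Round Table knights</UF>
    <BT>Knights</BT>
    <NT>Lancelot</NT>
    <NT>Gawain</NT>
    <RT>Camelot</RT>
  </Term>
  <!-- A non-preferred entry term that points the user to the preferred term -->
  <Term facet="Agents">
    <EntryTerm>Round Table knights</EntryTerm>
    <USE>Knights of the Round Table</USE>
  </Term>
</Terms>
[/code]

Note that the UF in the first record and the USE in the second are reciprocal: each equivalence relation should appear from both directions, and the same holds for every BT/NT pair once the broader term gets its own record.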

RESOURCES TO BROWSE:
Educational Videos:
Visit a simple XML tutorial
A good reference source – W3Schools
IBM introduces XML for beginners (2010)
Real world examples of XML use by Lynda.com (2014)
Sample Metadata Schemes: METS: Metadata Encoding and Transmission Standard (LC)

Victoria Rubin