What are the best sources of definition content for your glossary? This post discusses four sources: your most popular reports, pre-existing local glossaries, data dictionaries provided by vendors, and glossaries from third-party reporting agencies such as IPEDS.
Of these, existing popular reports are the best source of definitions.
Let us take a closer look at each source:
Existing Popular Reports - This is IData’s preferred source because the relevance of the data elements within popular reports correlates with the operational goals of an organization. If a data element appears frequently in your 10 most popular reports, you can infer that it is worth defining. For example, suppose your most popular enrollment reports have these column headings: student name, student ID number, course name, grade, instructor, and term. Each of these column headings should have a definition.
Using an existing popular report as a source of definitions ensures that you define data elements that are important. The broader context of the report can also help you capture important related definitions. In the example above, you might broaden the list of things to define by adding current term, fall term, and spring term.
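As a rough illustration of the frequency idea above, the sketch below counts how many reports each column heading appears in and ranks the headings as candidate glossary terms. The report names and their column lists are hypothetical sample data, and `rank_candidate_terms` is an illustrative helper, not part of any IData product.

```python
from collections import Counter

# Hypothetical sample data: column headings from a few popular reports.
reports = {
    "Enrollment by Term": ["student name", "student ID number", "course name", "term"],
    "Grade Roster": ["student name", "student ID number", "course name",
                     "grade", "instructor", "term"],
    "Instructor Load": ["instructor", "course name", "term"],
}


def rank_candidate_terms(reports):
    """Count how many reports each column heading appears in, most frequent first."""
    counts = Counter()
    for headings in reports.values():
        # Normalize case and use a set so a heading repeated
        # within one report is counted only once.
        counts.update({h.strip().lower() for h in headings})
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


candidates = rank_candidate_terms(reports)
for term, n in candidates:
    print(f"{term}: appears in {n} of {len(reports)} reports")
```

Headings near the top of the list are the ones to define first. In practice you would export the column lists from your reporting tool instead of typing them in.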
Pre-existing Local Glossaries – Check within your organization for glossaries that were used historically. If you do not have any at your organization, search the internet for glossaries published by peer organizations. You may be able to edit their definitions to conform to your local glossary content standards.
Vendor Provided Information – Vendors may publish a glossary that relates to their application. Locate it on their website or ask for an editable copy.
Often the vendor refers to this material as a "data dictionary", and the material is technical in nature. Sample entries include last_name, first_name, middle_initial, and emplid. If business definitions are included within a data dictionary, they tend to be circular and require substantive editing.
When a data dictionary is converted into a business glossary, the number of entries may shrink by a third, because several technical fields often describe a single business concept. For example, a data dictionary might have three entries for "employment status": code name, short code description, and long code description. These three data dictionary entries could collapse into a single business glossary entry labeled "employment status". Addresses collapse in the same way. A single “employee home address” glossary entry might correspond to seven data dictionary fields: address line one, address line two, address line three, city, state, postal code, and country.
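The collapse described above can be sketched as a simple grouping step. The technical field names below (for example `empl_status_cd`) and the mapping from each field to a business concept are invented for illustration. A real vendor data dictionary will use its own names, and someone who knows the data would need to supply the mapping.

```python
from collections import defaultdict

# Hypothetical data dictionary rows: (technical field name, business concept).
data_dictionary = [
    ("empl_status_cd", "employment status"),
    ("empl_status_short_desc", "employment status"),
    ("empl_status_long_desc", "employment status"),
    ("home_addr_line1", "employee home address"),
    ("home_addr_line2", "employee home address"),
    ("home_addr_line3", "employee home address"),
    ("home_city", "employee home address"),
    ("home_state", "employee home address"),
    ("home_postal_cd", "employee home address"),
    ("home_country", "employee home address"),
]


def collapse_to_glossary(rows):
    """Group technical fields under the business concept they describe."""
    glossary = defaultdict(list)
    for field, concept in rows:
        glossary[concept].append(field)
    return dict(glossary)


glossary = collapse_to_glossary(data_dictionary)
for concept, fields in glossary.items():
    print(f"{concept}: {len(fields)} data dictionary fields")
```

Keeping the list of source fields alongside each glossary entry preserves the link between the business definition and the technical columns behind it.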
Third-Party Reporting Agency – If your organization submits data to a third-party reporting agency, that agency probably provides a glossary. Unlike the data dictionaries discussed above, the writing style for these glossaries serves business readers instead of technical staff. However, the breadth of content might be too broad for your type of organization. For example, in higher education, the IPEDS glossary covers community colleges to large research institutions. You will need to remove the glossary entries that are not relevant to your organization type or size.
These are the starting points for definitions that IData recommends. Most organizations use a mix of these sources. If you need help implementing data governance or data intelligence, remember that IData provides data governance services. A data governance solution like the Data Cookbook can help an organization implement data governance successfully and improve data quality.
(image credit HubSpot_People_with_Devices_BP #1116)