Configuration Control for Product Documentation

A Way of Integrating STEP & SGML

Page 6


Annex 1

Semantic Information Classes

Introduction

The set of information objects available to the author/editor will be well defined and semantically meaningful. It is suggested that a set of object classes (see below) be defined for establishing a set of information objects for the STEP environment. This set will have common characteristics which will allow author/editors to create information objects that can be used in different publications within one application (AP) as well as across application boundaries, i.e., between APs.

Information objects are the instances of product documentation containing the concepts, ideas, instructions, or descriptions of the specific aspects of the product, its functions, operations, etc. To aid the authors in creating technical documentation, information objects will be classified in classes. Information classes will provide a descriptive level of semantics which, by distinguishing between semantic types, will help in establishing guidelines for authors and editors. Information classes can assist in choosing the correct information object to describe the proper type of information, and help with the structuring and definition of the information content and flow in a more logical and systematic manner.

As explained above, the information object is a key concept for the integration of product structures and publishing structures. Information objects will belong to the product structures as a descriptive information view of the product. Some of the same information objects may be directly associated with publishing structures, i.e., information directly related to publications or other perceptual presentations. Thus, information objects must be able to belong directly to product structures as well as publishing structures, implying that the semantic content of their information class must be meaningful and applicable to both structures.

Information classes will provide several levels of functionality:

  1. Information classes will provide ways of working with and disseminating knowledge about the semantics of the types of chosen information objects.
  2. The ability to interchange and share information through the acceptance of common information classes. For example, the same information object could be extracted from the product data and embedded into different publishing structures, as long as the publishing structures recognized the same information class. The publishing structures would also be modelled by DTDs and the information classes would be part of the DTD sharing the information objects.
  3. The different classes of information and their application within the STEP and SGML models will be distinguishable, supporting the mapping between the models.
  4. The different levels of information can be identified (semantically) and referred to by external processes.

Identification and Semantic Content of Information Object Classes

Semantic object classes are needed for managing and re-using information by providing a way of classifying, labelling, and identifying information. They can also provide ways of creating and working with information that can enhance its use and comprehension. This is perhaps best understood with an example.

Product documentation, such as technical manuals, typically contains many instances of the terms process and procedure. Probably due to the similarities in appearance when printed (both processes and procedures often appear printed as lists), these terms are often used incorrectly. A proper identification of the semantics of these terms can help authors to identify, and thereby correctly document, instances of these concepts, and can help publishers construct manuals with the proper explanations and instructions.

This can be achieved through the use of information object classes. By defining information object classes that clearly distinguish a process from a procedure it will be easier for an author to produce the correct information and for publishers to use the information in the correct context.

A process would be an information object class whose content describes change and movement, i.e., a series of events or phases that take place over time and usually have an identifiable purpose or result. A procedure, by contrast, would be an information object class whose content contains a series of steps that a person performs in order to obtain a specified outcome. Using these definitions, authors can be made aware that the semantic content and purpose of an information object of class process must be quite different from those of an information object of class procedure. For example, an author would choose an information object of class process to describe the functioning of a computer program and an information object of class procedure to describe the instructions for running the program.
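The distinction between the two classes can be illustrated with a small sketch. The Python below is illustrative only: the names `InfoClass`, `InformationObject`, and `select`, and the example content, are assumptions introduced here and are not part of STEP or of any T14 definition.

```python
from dataclasses import dataclass
from enum import Enum


class InfoClass(Enum):
    """Hypothetical labels for the two classes discussed above."""
    PROCESS = "process"      # events or phases that take place over time
    PROCEDURE = "procedure"  # steps a person performs to obtain an outcome


@dataclass(frozen=True)
class InformationObject:
    """Hypothetical information object: one idea, tagged with its class."""
    info_class: InfoClass
    title: str
    items: tuple  # phases for a process, steps for a procedure


# Describes how the program functions: a process.
startup = InformationObject(
    InfoClass.PROCESS,
    "How the program starts",
    ("Configuration is read", "Modules are loaded", "The main window opens"),
)

# Tells a person how to run the program: a procedure.
run_program = InformationObject(
    InfoClass.PROCEDURE,
    "Running the program",
    ("Open a terminal", "Type the program name", "Press Enter"),
)


def select(objects, wanted):
    """Return only the objects of the requested information class."""
    return [obj for obj in objects if obj.info_class is wanted]


procedures = select([startup, run_program], InfoClass.PROCEDURE)
```

Because the class is carried as data rather than inferred from layout, a publisher could select all procedures for an operating manual even though both kinds of object would print as lists.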

Distinguishing between the semantic content of information objects by first semantically distinguishing the different types of information and secondly by grouping them into classes offers an improved level of information. This makes the creation, management, and re-use of the information objects much more viable.

It will be the task of T14 to research and recommend a bounded set of information object classes for product documentation.

The Role of Information Classes

The section Levels of Interoperability above identifies three levels of mapping between STEP and SGML: the languages, the models, and the data constructs. Information classes are one of the key concepts needed for the mapping at the model level. Information classes must be identifiable in both domains to enable the mapping between the standards, just as information objects must be translatable at the data level.

Information classes are also essential for the mapping between models within the STEP domain. For example, a product model and a publishing model may make use of the same information class allowing product data to be mapped directly into the publishing structure.

Information Types in Authoring and Publishing

As explained in the section Semantic Information Classes above, information objects and information classes form part of the infrastructure for capturing relevant semantic information relating to product models and product structures. Thus, the design of information classes will be optimal for the processes of authoring and publishing (publishing in this sense is the design of the information, not the physical formatting, layout, etc.). These information classes need also be conducive to configuration management of the authoring and publishing.

Within the STEP environment, common information classes provide the necessary technique to allow information objects to be mapped and managed between the product and publishing models. This must also be true for the mapping from the publishing structures onto the DTD structures of the SGML environment.

Mapping Information Classes from Publishing Structures to DTDs

In order to move data from a publishing structure in a STEP environment to an SGML environment, the data must be mapped from the publishing structure onto its related DTD structure. Within the STEP environment, information objects are the lowest level of granularity, i.e., they are leaves on the STEP product and publishing structures. Information objects relate to the semantic information needed to document the product components (including technical manuals), not to publishing. For example, publishing elements such as the paragraph are not at all useful for defining semantic information. A paragraph, similar to a sentence or a word, is used to present information, not to describe it.

The same applies to lists, tables, phrases, etc. Data stored in STEP information structures are not directly applicable to publishing, i.e., the form of the data is in computable formats such as real numbers and not in character coded formats. In order to present the data it must be "transformed" from the semantic information objects used in publishing models and structures into a presentation format.

In order to map the product documentation information (stored as STEP entities) onto the document structures of an SGML environment, the information classes, such as procedures, descriptions, and processes, must be mapped onto the corresponding element structures, i.e., their counterparts in SGML. This also requires that the internal mark-up of information objects such as phrases, paragraphs, lists, and tables be compatible with the corresponding element structures of the DTDs.

Herein lies the

It is here that the SGML_STRING can be fully realized. Each information object can be represented by an SGML_STRING that contains markup. Each information type can be defined to have a specific element structure, which in effect is the mapping from the semantic information type to the publishing (SGML) element structure needed for presentation.(7)
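As a rough sketch of this idea, the Python below builds a marked-up string for an information object from a per-type element structure. It is not a definition of SGML_STRING: a plain Python string stands in for it, and the table `ELEMENT_STRUCTURES`, the function `to_sgml_string`, and the tag names (`procedure`, `step`, `process`, `phase`, `title`) are all assumptions made for illustration.

```python
# Hypothetical element structures: information type -> (outer element, item element).
ELEMENT_STRUCTURES = {
    "procedure": ("procedure", "step"),
    "process": ("process", "phase"),
}


def to_sgml_string(info_type, title, items):
    """Build a marked-up string for one information object.

    The result stands in for an SGML_STRING. Tag names come from
    ELEMENT_STRUCTURES and are assumptions of this sketch. Content is
    inserted as-is; characters such as '<' and '&' are not escaped.
    """
    outer, inner = ELEMENT_STRUCTURES[info_type]
    body = "".join(f"<{inner}>{item}</{inner}>" for item in items)
    return f"<{outer}><title>{title}</title>{body}</{outer}>"


sgml_string = to_sgml_string(
    "procedure", "Running the program", ["Open a terminal", "Type the program name"]
)
print(sgml_string)
```

In this sketch the lookup table plays the role of the mapping from semantic information type to presentation element structure: changing the table changes the markup without touching the information objects.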

It is recommended that the element structures (markup) of the common information types be defined with the purpose of producing a mapping from the STEP publishing structures to the SGML presentation structures. One of the tasks of T14 will therefore be to initiate discussion and work directed toward defining:

  1. A set of common information types.
  2. The necessary and sufficient element structures (for the SGML_STRING definitions) of the common information types.

Identifying a Common Set of Information Types

As defined above, an information object represents ONE idea or concept, or relates to one main point of the product, function, or process that is being described. Information objects are classified into information types. To enable the interchange of information encoded in information objects, a set of common information types must be identified to reflect universal and generic "chunks" of information that express a purpose, meaning, etc.

Using Information Mapping theory

To determine a set of common information types that:

is a daunting and arduous, if not impossible, task for a STEP technical group. We have therefore chosen to start with a well-known and accepted technology as an underlying guide. The following points are adapted from Information Mapping(8).

Information Mapping (IM) is based on research into how people read and understand information. IM has developed a methodology for assisting technical writers in creating readable and structured texts for technical documentation.

Many of the principles(9) of Information Mapping can be directly related to the intentions and purposes of writing product documentation, for example:

Information Types: defined by Information Mapping

Information Mapping (ref. INFO_MAP) divides information types into seven domains. We can look at these definitions and use them as a way forward for defining generic information types for managing and sharing information based on DTDs. (Note: Each information type has corresponding key blocks, which are more specific semantic divisions of the information type.)

Procedure

A procedure is a set of steps which are performed in order to obtain a specified outcome.

Process

A process is a series of events or phases that takes place over time and usually has an identifiable purpose or result. Some processes include multiple conditions and results.

Structure

A structure is a physical object or something that can be divided into parts and has boundaries. It includes the description of whole systems such as companies and organizations and their environments.

Concept

A concept is a class or group of items which share a unique combination of critical attributes not shared by other groups, and can be referred to by the same generic name or symbol.

Principle

A principle map is one in which a given body of knowledge presents an important:

Fact

Facts are statements asserted without supporting evidence.

Classification

Classification is sorting a group of items into classes or categories by the use of one or more sorting attributes.

A sorting attribute is a quality that is used as the basis for classification.
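A bounded set of information types lends itself to a closed enumeration. The sketch below lists the seven types named above; the identifier spellings and the helper `parse_information_type` are assumptions for illustration, not names defined by Information Mapping or by STEP.

```python
from enum import Enum


class InformationType(Enum):
    """The seven information types listed above; the identifiers are assumed."""
    PROCEDURE = "procedure"
    PROCESS = "process"
    STRUCTURE = "structure"
    CONCEPT = "concept"
    PRINCIPLE = "principle"
    FACT = "fact"
    CLASSIFICATION = "classification"


def parse_information_type(label):
    """Map a label to a common information type, rejecting unknown labels."""
    try:
        return InformationType(label.strip().lower())
    except ValueError:
        raise ValueError(f"not a common information type: {label!r}") from None


process_type = parse_information_type(" Process ")
```

Rejecting labels outside the set is what makes the set "common": two applications that both accept only these labels can exchange information objects without negotiating types.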

Internal Structure of Information Types

Information types are key to the mapping of information objects onto presentation structures (DTDs). Information types will be composed of element structures, i.e., a (possibly hierarchically structured) set of SGML elements and attributes. To make clear that these element structures are primarily for presentation, we can refer to them as the presentation element structures. Thus, information types will be defined by a presentation element structure.

Presentation element structures will be defined according to the syntactic rules laid down by the SGML standard; they could take the form of partial DTDs or it could possibly be advantageous to adopt some of the architectural forms techniques used in the HyTime (Ref. ISO_HyTime) standard. However, the requirements here are not the same as the HyTime standard and fixed element types could possibly be an advantage.

The element structures that make up an information type definition should be used only inside an information object, i.e., the element structures should not be used at the modelling level either for products or product documentation.

Presentation Element Structures

Common element structures, such as paragraphs, lists, and tables, will be the key presentation element structures. For defining information types it is proposed that a fixed set of presentation element structures be established.

There are several reasons for defining a fixed set of presentation element structures:

  1. It will provide a consensus on the element structures essential to document design, i.e., DTD design. Establishing a fixed set of them will therefore make the design of DTDs easier and their maintenance less difficult.(10)
  2. It allows the modelling and interchange of (low-level) structured text between STEP models (publishing structures) and the SGML environment (DTDs). (Within STEP the presentation element structures are not at all visible, but are represented as SGML_STRINGS.)
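One practical consequence of a fixed set is that a marked-up string can be checked against it before interchange. The sketch below does this under stated assumptions: the element names in `PRESENTATION_ELEMENTS` are invented for illustration, and Python's `html.parser` is used only as a convenient tag tokenizer; it is not a validating SGML parser and knows nothing about DTDs.

```python
from html.parser import HTMLParser

# Hypothetical fixed set of presentation elements (names assumed for this sketch).
PRESENTATION_ELEMENTS = {"para", "list", "item", "table", "row", "cell"}


class ElementCollector(HTMLParser):
    """Collect the names of all start tags found in a marked-up string."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        self.names.add(tag)


def unknown_elements(sgml_string):
    """Return element names used in the string that are outside the fixed set."""
    collector = ElementCollector()
    collector.feed(sgml_string)
    collector.close()
    return collector.names - PRESENTATION_ELEMENTS


ok = "<para>Check the seal.</para><list><item>Bolt</item><item>Nut</item></list>"
bad = "<para>Check the seal.</para><sidebar>Note</sidebar>"
```

Here `unknown_elements(ok)` is empty, while `unknown_elements(bad)` reports `sidebar`, the kind of local variation that footnote 10 describes as a hindrance to interchange.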

Referencing Information Object Types

This can be achieved by using the techniques formalised in the HyTime standard (ISO 10744) called architectural forms.

An architectural form is an element structure or an attribute list construct which always contains an associated fixed attribute declaring the type and association of the construct. This attribute identifies data as belonging to a particular (fixed or standardized) declaration.
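The effect of such a fixed attribute can be sketched as follows. This is an illustration, not HyTime processing: the attribute name `infotype` and the element names `task` and `howto` are hypothetical, the attribute is written out in the instance rather than supplied by a DTD declaration, and `html.parser` serves only as a simple tag tokenizer.

```python
from html.parser import HTMLParser

# Hypothetical attribute naming the form an element conforms to.
FORM_ATTRIBUTE = "infotype"


class FormFinder(HTMLParser):
    """Record (element name, form) for each element carrying the attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        form = dict(attrs).get(FORM_ATTRIBUTE)
        if form is not None:
            self.found.append((tag, form))


def architectural_forms(markup):
    """List the elements in the markup that declare a form, in document order."""
    finder = FormFinder()
    finder.feed(markup)
    finder.close()
    return finder.found


# Two documents use different element names for the same information type.
manual_a = '<task infotype="procedure"><para>Open the valve.</para></task>'
manual_b = '<howto infotype="procedure"><para>Close the valve.</para></howto>'
```

Both `task` and `howto` resolve to the same form, `procedure`, so an external process can refer to the information type without knowing the element names chosen by each DTD.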

(SGML inherently suppresses word and sentence mark-up, i.e., spaces, periods, and capitals; however, it explicitly removes traditional paragraph mark-up, i.e., new-line, indent, etc.)


Footnotes:

(7) This would be a non-trivial task if the information object types are not designed with this in mind.

(8) Information Mapping is a methodology for writing structured documentation developed by Robert E. Horn and colleagues from 1967 through 1971 and which since then has been taught to thousands of technical writers throughout the world.

(9) Information Mapping is primarily aimed at writers and therefore many of the principles inherent in the work are aimed at presentation, i.e., formatting. These points are not relevant at this time for T14.

(10) Today, much document content (paragraphs, lists, tables, etc.) cannot be exchanged because the basic markup is different, i.e., the DTDs have slightly differing content structures. This is seen as an unnecessary hindrance to data interchange and sharing.

