UnitSegmentLayout

Fully qualified class name: DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout

Definition

Description of unit-record (“wide”) data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

Examples

A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet:

PersonId,AgeYr,HeightCm
1,22,183,
2,45,175,

Explanatory notes

This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.
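
As a minimal sketch of the row/column reading described above, the following Python reads the sample CSV and addresses one cell by its unit and variable. The sample rows are reproduced here without the trailing commas so that every row has exactly three fields. The helper names `load_wide` and `data_point`, and the use of PersonId as the unit identifier, are assumptions made for illustration and are not part of DDI-CDI.

```python
import csv
import io

# Sample file from the Examples section, without the trailing commas.
SAMPLE = "PersonId,AgeYr,HeightCm\n1,22,183\n2,45,175\n"


def load_wide(text):
    """Return (variable_names, rows) with one dict per unit (row)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows


def data_point(rows, unit_id, variable, id_column="PersonId"):
    """Return the cell at the intersection of one unit and one variable."""
    for row in rows:
        if row[id_column] == unit_id:
            return row[variable]
    raise KeyError(unit_id)


variables, rows = load_wide(SAMPLE)
print(variables)                           # the logical columns
print(data_point(rows, "2", "HeightCm"))   # one cell of the table
```

Each dict in `rows` plays the role of a unit record, and each key plays the role of an InstanceVariable; the values are left as strings because the layout itself says nothing about data types.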

Inheritance

FormatDescription::PhysicalSegmentLayout
  ᐊ── FormatDescription::UnitSegmentLayout

Attributes

Name

Inherited from

Description

Data Type

Multiplicity

Default value

allowsDuplicates

FormatDescription::PhysicalSegmentLayout

If value is False, the members are unique within the collection - if True, there may be duplicates. (Note that a mathematical “bag” permits duplicates and is unordered - a “set” does not have duplicates and may be ordered.)

Boolean

1..1

arrayBase

FormatDescription::PhysicalSegmentLayout

The starting value for the numbering of cells, rows, columns, etc. when they constitute an ordered sequence (an array). Note that in DDI, this is typically either 0 or 1. In related W3C work (Model for Tabular Data and Metadata on the Web), they appear to standardize on 1 (see https://www.w3.org/TR/tabular-data-model/ 4.3 [Columns] and 4.4 [Rows]: “number - the position of the column amongst the columns for the associated table, starting from 1.”)

Integer

0..1

catalogDetails

FormatDescription::PhysicalSegmentLayout

Bundles the information useful for a data catalog entry.

Examples would be creator, contributor, title, copyright, embargo, and license information.

A set of information useful for attribution, data discovery, and access. This is information that is tied to the identity of the object. If this information changes the version of the associated object changes.

CatalogDetails

0..1

commentPrefix

FormatDescription::PhysicalSegmentLayout

A string used to indicate that an input line is a comment, a string which precedes a comment in the data file. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect commentPrefix: ‘An atomic property that sets the comment prefix flag to the single provided value, which MUST be a string. The default is “#”.’

String

0..1

delimiter

FormatDescription::PhysicalSegmentLayout

The delimiting character in the data. Must be used if isDelimited is True. “The separator between cells, set by the delimiter property of a dialect description. The default is ,.” See the W3C Recommendation “Model for Tabular Data and Metadata on the Web” (https://www.w3.org/TR/tabular-data-model/#encoding). From the “CSV Dialect” specification (https://specs.frictionlessdata.io/csv-dialect/#specification): “delimiter: specifies a one-character string to use as the field separator. Default = ,.”

String

0..1

,

encoding

FormatDescription::PhysicalSegmentLayout

The character encoding of the represented data. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “encoding - An atomic property that sets the encoding flag to the single provided string value, which MUST be a defined in [encoding]. The default is ‘utf-8’.” From the same W3C recommendation 7.2 Encoding: “CSV files should be encoded using UTF-8, and should be in Unicode Normal Form C as defined in [UAX15]. If a CSV file is not encoded using UTF-8, the encoding should be specified through the charset parameter in the Content-Type header.”

ControlledVocabularyEntry

0..1

escapeCharacter

FormatDescription::PhysicalSegmentLayout

“The string that is used to escape the quote character within escaped cells, or null”; see https://www.w3.org/TR/tabular-data-model/#encoding. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect: “doubleQuote: A boolean atomic property that, if true, sets the escape character flag to ". If false, to \. The default is true.” From http://specs.frictionlessdata.io/csv-dialect/: “doubleQuote: controls the handling of quotes inside fields. If true, two consecutive quotes should be interpreted as one. Default = true”.

String

0..1

True

hasHeader

FormatDescription::PhysicalSegmentLayout

True if the file contains a header containing column names. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect “header: A boolean atomic property that, if true, sets the header row count flag to 1, and if false to 0, unless headerRowCount is provided, in which case the value provided for the header property is ignored. The default is true.” From http://specs.frictionlessdata.io/csv-dialect/ “header: indicates whether the file includes a header row. If true the first row in the file is a header row, not data. Default = true”.

Boolean

0..1

true

headerIsCaseSensitive

FormatDescription::PhysicalSegmentLayout

If True, the case of the labels in the header is significant. From the “CSV Dialect” specification (http://specs.frictionlessdata.io/csv-dialect/): “caseSensitiveHeader: indicates that case in the header is meaningful. For example, columns CAT and Cat should not be equated. Default = false.”

Boolean

0..1

false

headerRowCount

FormatDescription::PhysicalSegmentLayout

The number of lines in the header. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect: “headerRowCount: A numeric atomic property that sets the header row count flag to the single provided value, which MUST be a non-negative integer. The default is 1.”

Integer

0..1

1

identifier

FormatDescription::PhysicalSegmentLayout

Identifier for objects requiring short- or long-lasting referencing and management.

Identifier

0..1

isDelimited

FormatDescription::PhysicalSegmentLayout

Indicates whether the data are in a delimited format. If “true”, the format is delimited, and the isFixedWidth property must be set to “false”. If not set to “true”, the property isFixedWidth must be set to “true”.

Boolean

1..1

isFixedWidth

FormatDescription::PhysicalSegmentLayout

Set to true if the file is fixed-width. If true, isDelimited must be set to false.

Boolean

1..1

lineTerminator

FormatDescription::PhysicalSegmentLayout

The strings that can be used at the end of a row, set by the lineTerminators property of a dialect description. The default is [CRLF, LF]. See the W3C Recommendation “Model for Tabular Data and Metadata on the Web” (https://www.w3.org/TR/tabular-data-model/#encoding). From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “lineTerminators: An atomic property that sets the line terminators flag to either an array containing the single provided string value, or the provided array. The default is [‘\r\n’, ‘\n’].” Also, from the “CSV Dialect” specification (http://specs.frictionlessdata.io/csv-dialect/): “lineTerminator: specifies the character sequence which should terminate rows. Default = \r\n.”

String

0..*

[CRLF, LF]

name

FormatDescription::PhysicalSegmentLayout

A linguistic signifier. Human understandable name (word, phrase, or mnemonic) that reflects the ISO/IEC 11179-5 naming principles. If more than one name is provided, provide a context to differentiate usage.

ObjectName

0..*

nullSequence

FormatDescription::PhysicalSegmentLayout

A string indicating a null value. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 4.3: “null: the string or strings which cause the value of cells having string value matching any of these values to be null.” From the same source, Inherited 5.7: “null: An atomic property giving the string or strings used for null values within the data. If the string value of the cell is equal to any one of these values, the cell value is null. See Parsing Cells in [tabular-data-model] for more details. If not specified, the default for the null property is the empty string ‘’. The value of this property becomes the null annotation for the described column.”

String

0..1

overview

FormatDescription::PhysicalSegmentLayout

Short natural language account of the information obtained from the combination of properties and relationships associated with an object.

InternationalString

0..1

purpose

FormatDescription::PhysicalSegmentLayout

Intent or reason for the object/the description of the object.

InternationalString

0..1

quoteCharacter

FormatDescription::PhysicalSegmentLayout

“The string that is used around escaped cells, or null, set by the quoteChar property of a dialect description. The default is ".” See the W3C Recommendation “Model for Tabular Data and Metadata on the Web”, https://www.w3.org/TR/tabular-data-model/#parsing. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “quoteChar: An atomic property that sets the quote character flag to the single provided value, which MUST be a string or null. If the value is null, the escape character flag is also set to null. The default is ‘"’.” From the CSV Dialect specification (http://specs.frictionlessdata.io/csv-dialect/): “quoteChar: specifies a one-character string to use as the quoting character. Default = ".”

String

0..1

skipBlankRows

FormatDescription::PhysicalSegmentLayout

If the value is True, blank rows are ignored. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipBlankRows: A boolean atomic property that sets the skip blank rows flag to the single provided boolean value. The default is false.”

Boolean

0..1

false

skipDataColumns

FormatDescription::PhysicalSegmentLayout

The number of columns to skip at the beginning of the row. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipColumns: A numeric atomic property that sets the skip columns flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.” A value other than 0 will mean that the source numbers of columns (their positions in the source file) will be different from their numbers (their positions in the resulting table).

Integer

0..1

0

skipInitialSpace

FormatDescription::PhysicalSegmentLayout

If the value is True, skip whitespace at the beginning of a line or following a delimiter. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipInitialSpace: A boolean atomic property that, if true, sets the trim flag to ‘start’ and if false, to false. If the trim property is provided, the skipInitialSpace property is ignored. The default is false.” From the CSV Dialect specification (http://specs.frictionlessdata.io/csv-dialect/): “skipInitialSpace: specifies how to interpret whitespace which immediately follows a delimiter; if false, it means that whitespace immediately after a delimiter should be treated as part of the following field. Default = true.”

Boolean

0..1

true

skipRows

FormatDescription::PhysicalSegmentLayout

Number of input rows to skip preceding the header or data. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipRows: A numeric atomic property that sets the skip rows flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.” A value greater than 0 will mean that the source numbers of rows (their positions in the source file) will be different from their numbers (their positions in the resulting table).

Integer

0..1

0

tableDirection

FormatDescription::PhysicalSegmentLayout

Indicates the direction in which columns are arranged in each row. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.3.2: “tableDirection: An atomic property that MUST have a single string value that is one of ‘rtl’, ‘ltr’, or ‘auto’. Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. The value of this property becomes the value of the table direction annotation for all the tables in the table group. See Bidirectional Tables in [tabular-data-model] for details. The default value for this property is ‘auto’.”

TableDirectionValues

0..1

“Auto”

textDirection

FormatDescription::PhysicalSegmentLayout

Indicates the reading order of text within cells. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) Inherited 5.7: “textDirection: An atomic property that MUST have a single string value that is one of ‘ltr’, ‘rtl’, ‘auto’ or ‘inherit’ (the default). Indicates whether the text within cells should be displayed as left-to-right text (ltr), as right-to-left text (rtl), according to the content of the cell (auto) or in the direction inherited from the table direction annotation of the table. The value of this property determines the text direction annotation for the column, and the text direction annotation for the cells within that column: if the value is inherit then the value of the text direction annotation is the value of the table direction annotation on the table, otherwise it is the value of this property. See Bidirectional Tables in [tabular-data-model] for details.”

TextDirectionValues

0..1

treatConsecutiveDelimitersAsOne

FormatDescription::PhysicalSegmentLayout

If the value is True, consecutive (adjacent) delimiters are treated as a single delimiter; if the value is False consecutive (adjacent) delimiters indicate a missing value.

Boolean

0..1

trim

FormatDescription::PhysicalSegmentLayout

Specifies which spaces to remove from a data value (start, end, both, neither). From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “trim: An atomic property that, if the boolean true, sets the trim flag to true and if the boolean false to false. If the value provided is a string, sets the trim flag to the provided value, which MUST be one of ‘true’, ‘false’, ‘start’, or ‘end’. The default is true.”

TrimValues

0..1

True
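
Several of the attributes above correspond closely to options of common CSV parsers. The sketch below shows one way a reader might apply a subset of them (delimiter, quoteCharacter, skipInitialSpace, skipRows, commentPrefix, skipBlankRows, headerRowCount, nullSequence) with Python's standard `csv` module. The `Layout` dataclass and the `read_rows` function are hypothetical helpers, not part of DDI-CDI or of any library, and the order in which the flags are applied is a simplification of the W3C parsing algorithm rather than a faithful implementation of it.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layout:
    """Hypothetical holder for a few UnitSegmentLayout attributes."""
    delimiter: str = ","
    quoteCharacter: str = '"'
    skipInitialSpace: bool = True
    skipRows: int = 0
    commentPrefix: Optional[str] = None   # left unset here; an assumption of this sketch
    skipBlankRows: bool = False
    headerRowCount: int = 1
    nullSequence: Optional[str] = None


def read_rows(text, layout):
    """Return (header_rows, records); cells equal to nullSequence become None."""
    # Simplification: splitting on line breaks ignores quoted cells that span lines.
    lines = text.splitlines(keepends=True)[layout.skipRows:]
    if layout.commentPrefix:
        lines = [ln for ln in lines if not ln.startswith(layout.commentPrefix)]
    if layout.skipBlankRows:
        lines = [ln for ln in lines if ln.strip()]
    reader = csv.reader(
        io.StringIO("".join(lines)),
        delimiter=layout.delimiter,
        quotechar=layout.quoteCharacter,
        skipinitialspace=layout.skipInitialSpace,
    )
    rows = list(reader)
    header_rows = rows[:layout.headerRowCount]
    records = [
        [None if cell == layout.nullSequence else cell for cell in row]
        for row in rows[layout.headerRowCount:]
    ]
    return header_rows, records


text = "exported file\n# a comment\nPersonId;AgeYr;HeightCm\n\n1; 22;NA\n2; 45;175\n"
layout = Layout(delimiter=";", skipRows=1, commentPrefix="#",
                skipBlankRows=True, nullSequence="NA")
header_rows, records = read_rows(text, layout)
print(header_rows)
print(records)
```

Attributes without a direct counterpart in the `csv` module (for example trim, tableDirection, and treatConsecutiveDelimitersAsOne) are left out of the sketch and would need separate handling.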

Associations

| Direction | Association | Description | Multiplicity of UnitSegmentLayout | Package of Other Class | Other Class | Multiplicity of other class | Aggregation Kind | Inherited from |
|---|---|---|---|---|---|---|---|---|
| to | InstanceVariable has PhysicalSegmentLayout | | 0..* | Conceptual | InstanceVariable | 0..* | shared | FormatDescription::PhysicalSegmentLayout |
| from | PhysicalSegmentLayout isDefinedBy Concept | The conceptual basis for the collection of members. | 0..* | Conceptual | Concept | 0..* | none | FormatDescription::PhysicalSegmentLayout |
| to | PhysicalLayoutRelationStructure structures PhysicalSegmentLayout | | 0..1 | - own package - | PhysicalLayoutRelationStructure | 0..* | none | FormatDescription::PhysicalSegmentLayout |
| to | PhysicalRecordSegment has PhysicalSegmentLayout | | 0..1 | - own package - | PhysicalRecordSegment | 0..* | none | FormatDescription::PhysicalSegmentLayout |
| from | PhysicalSegmentLayout formats LogicalRecord | Logical record physically represented by the physical layout. | 0..* | - own package - | LogicalRecord | 0..1 | none | FormatDescription::PhysicalSegmentLayout |
| from | PhysicalSegmentLayout has ValueMapping | | 0..* | - own package - | ValueMapping | 0..* | shared | FormatDescription::PhysicalSegmentLayout |
| from | PhysicalSegmentLayout has ValueMappingPosition | | 1..1 | - own package - | ValueMappingPosition | 0..* | composite | FormatDescription::PhysicalSegmentLayout |

Syntax representations / encodings

All syntax representations except the Canonical XMI are provided as reference points for specific implementations, or for use as defaults if sufficient in the form presented.

Fragment for the class UnitSegmentLayout (entire model as XMI)

<packagedElement xmlns:StandardProfile="http://www.eclipse.org/uml2/5.0.0/UML/Profile/Standard"
                 xmlns:uml="http://www.eclipse.org/uml2/5.0.0/UML"
                 xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
                 xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout"
                 xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout"
                 xmi:type="uml:Class">
   <ownedComment xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout-ownedComment"
                 xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout-ownedComment"
                 xmi:type="uml:Comment">
      <annotatedElement xmi:idref="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout"/>
      <body>Definition
==========
Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

Examples
========
A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::

   PersonId,AgeYr,HeightCm
   1,22,183,
   2,45,175,

Explanatory notes
=================
This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</body>
   </ownedComment>
   <name>UnitSegmentLayout</name>
   <generalization xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout-generalization"
                   xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout-generalization"
                   xmi:type="uml:Generalization">
      <general xmi:idref="DDICDIModels-DDICDILibrary-Classes-FormatDescription-PhysicalSegmentLayout"/>
   </generalization>
</packagedElement>
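
The XMI fragment records the superclass as a `generalization` element whose `general` child points at the parent class by `xmi:idref`. The following is a minimal sketch of reading that reference with Python's standard `xml.etree.ElementTree`; it works on an abbreviated copy of the fragment (the `ownedComment` and some attributes are omitted), and the function name `superclass_ids` is a hypothetical helper chosen for illustration.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/spec/XMI/20131001"

# Abbreviated copy of the XMI fragment (ownedComment and uuid attributes omitted).
FRAGMENT = """<packagedElement xmlns:uml="http://www.eclipse.org/uml2/5.0.0/UML"
                 xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
                 xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout"
                 xmi:type="uml:Class">
   <name>UnitSegmentLayout</name>
   <generalization xmi:type="uml:Generalization">
      <general xmi:idref="DDICDIModels-DDICDILibrary-Classes-FormatDescription-PhysicalSegmentLayout"/>
   </generalization>
</packagedElement>"""


def superclass_ids(xmi_text):
    """Return the xmi:idref of every class this class generalizes from."""
    root = ET.fromstring(xmi_text)
    # Element names carry no namespace here; only the xmi:* attributes do.
    return [
        general.get(f"{{{XMI_NS}}}idref")
        for general in root.findall("generalization/general")
    ]


print(superclass_ids(FRAGMENT))
```

The returned identifier ends in `PhysicalSegmentLayout`, which matches the "Inherited from" column of the attribute and association tables.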

Fragment for the class UnitSegmentLayout (entire XML Schema)

<xs:element name="UnitSegmentLayout"
            type="UnitSegmentLayoutXsdType"
            xml:id="UnitSegmentLayout">
  <!-- based on the UML class DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout -->
  <xs:annotation>
    <xs:documentation>Definition
          ==========
          Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

          Examples
          ========
          A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

          The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::

             PersonId,AgeYr,HeightCm
             1,22,183,
             2,45,175,

          Explanatory notes
          =================
          This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</xs:documentation>
  </xs:annotation>
</xs:element>
<xs:complexType name="UnitSegmentLayoutXsdType"
                xml:id="UnitSegmentLayoutXsdType">
  <xs:annotation>
    <xs:documentation>Definition
          ==========
          Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

          Examples
          ========
          A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

          The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::

             PersonId,AgeYr,HeightCm
             1,22,183,
             2,45,175,

          Explanatory notes
          =================
          This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</xs:documentation>
  </xs:annotation>
  <xs:complexContent>
    <xs:extension base="PhysicalSegmentLayoutXsdType">

    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Fragment for the class UnitSegmentLayout (main ontology)

# class UnitSegmentLayout
# based on the UML class DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout
cdi:UnitSegmentLayout
  a rdfs:Class, owl:Class, ucmis:Class;
  rdfs:label "UnitSegmentLayout";
  rdfs:comment "Definition\n==========\nDescription of unit-record (\"wide\") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.\n\nExamples\n========\nA simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.\n\nThe following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::\n\n   PersonId,AgeYr,HeightCm\n   1,22,183,\n   2,45,175,\n\nExplanatory notes\n=================\nThis is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable."@en;
  rdfs:subClassOf cdi:PhysicalSegmentLayout;
.

Fragment for the class UnitSegmentLayout (main JSON-LD)

{
  "@context": [
    "PhysicalSegmentLayout.jsonld",
    {
      "cdi": "http://ddialliance.org/Specification/DDI-CDI/1.0/RDF/",
      "xsd": "http://www.w3.org/2001/XMLSchema#",
      "UnitSegmentLayout": "cdi:UnitSegmentLayout",

      " comment ": "tag:json-should-support-trailing-commas" 
    }
  ],
  "generatedBy": "This code was generated by the Eclipse Acceleo project UCMIS M2T on 2024-09-23 21:52:59.",
  "basedOn": "based on the UML data type DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout"
}
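
In the JSON-LD context above, the term `UnitSegmentLayout` maps to the compact IRI `cdi:UnitSegmentLayout`, and the `cdi` prefix maps to the DDI-CDI RDF namespace. The sketch below resolves the term by hand using only Python's standard `json` module. It is a simplified stand-in for real JSON-LD expansion: it handles only the inline context object and ignores the referenced `PhysicalSegmentLayout.jsonld` file. The function name `expand_term` is a hypothetical helper.

```python
import json

# Copy of the context from the JSON-LD fragment (generatedBy/basedOn omitted).
CONTEXT_DOC = """{
  "@context": [
    "PhysicalSegmentLayout.jsonld",
    {
      "cdi": "http://ddialliance.org/Specification/DDI-CDI/1.0/RDF/",
      "xsd": "http://www.w3.org/2001/XMLSchema#",
      "UnitSegmentLayout": "cdi:UnitSegmentLayout",
      " comment ": "tag:json-should-support-trailing-commas"
    }
  ]
}"""


def expand_term(term, context_doc):
    """Resolve a term to a full IRI using only the inline context objects."""
    mapping = {}
    for entry in json.loads(context_doc)["@context"]:
        if isinstance(entry, dict):   # skip references to external context files
            mapping.update(entry)
    value = mapping[term]
    prefix, sep, local = value.partition(":")
    if sep and prefix in mapping:     # compact IRI such as cdi:UnitSegmentLayout
        return mapping[prefix] + local
    return value


print(expand_term("UnitSegmentLayout", CONTEXT_DOC))
```

The result is the same IRI that the Turtle fragment writes as `cdi:UnitSegmentLayout`, assuming the `cdi` prefix is bound to the same namespace there.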