UnitSegmentLayout

Fully qualified class name: DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout

Definition

Description of unit-record (“wide”) data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

Examples

A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet:

PersonId,AgeYr,HeightCm
1,22,183,
2,45,175,

Explanatory notes

This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.
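As a rough illustration of the unit-record layout, the following Python sketch reads the sample CSV above and builds one record per unit, so that each cell can be addressed by a unit (row) and a variable (column). The function name `read_unit_records` is hypothetical and not part of DDI-CDI. The sketch assumes the first row is a header and ignores any fields beyond the header width, such as the empty field produced by a trailing comma.

```python
import csv
import io

# The sample rows from the example above, reproduced as written.
SAMPLE = "PersonId,AgeYr,HeightCm\n1,22,183,\n2,45,175,\n"


def read_unit_records(text):
    """Return one dict per row (unit), keyed by the variable names in the header."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    # zip() stops at the shorter sequence, so fields beyond the header width are dropped.
    return [dict(zip(header, row)) for row in rows if row]


records = read_unit_records(SAMPLE)
# Each data point is the intersection of a unit (row) and a variable (column).
print(records[0]["AgeYr"])  # prints 22
```

All values are kept as strings here; converting them to typed values would depend on the variable descriptions, which this sketch does not model.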

Attributes

Name	Inherited from	Description	Data Type	Multiplicity	Default value
allowsDuplicates	FormatDescription::PhysicalSegmentLayout	If value is False, the members are unique within the collection - if True, there may be duplicates. (Note that a mathematical “bag” permits duplicates and is unordered - a “set” does not have duplicates and may be ordered.)	Boolean	1..1
arrayBase	FormatDescription::PhysicalSegmentLayout	The starting value for the numbering of cells, rows, columns, etc. when they constitute an ordered sequence (an array). Note that in DDI, this is typically either 0 or 1. In related W3C work (Model for Tabular Data and Metadata on the Web), they appear to standardize on 1 (see https://www.w3.org/TR/tabular-data-model/ 4.3 [Columns] and 4.4 [Rows]: “number - the position of the column amongst the columns for the associated table, starting from 1.”)	Integer	0..1
catalogDetails	FormatDescription::PhysicalSegmentLayout	Bundles the information useful for a data catalog entry. Examples would be creator, contributor, title, copyright, embargo, and license information. A set of information useful for attribution, data discovery, and access. This is information that is tied to the identity of the object. If this information changes, the version of the associated object changes.	CatalogDetails	0..1
commentPrefix	FormatDescription::PhysicalSegmentLayout	A string used to indicate that an input line is a comment, a string which precedes a comment in the data file. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect commentPrefix: ‘An atomic property that sets the comment prefix flag to the single provided value, which MUST be a string. The default is “#”.’	String	0..1
delimiter	FormatDescription::PhysicalSegmentLayout	The delimiting character in the data. Must be used if isDelimited is True. “The separator between cells, set by the delimiter property of a dialect description. The default is ,.” See the W3C Recommendation “Model for Tabular Data and Metadata on the Web” (https://www.w3.org/TR/tabular-data-model/#encoding). From the “CSV Dialect” specification (https://specs.frictionlessdata.io/csv-dialect/#specification): “delimiter: specifies a one-character string to use as the field separator. Default = ,.”	String	0..1	,
encoding	FormatDescription::PhysicalSegmentLayout	The character encoding of the represented data. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “encoding - An atomic property that sets the encoding flag to the single provided string value, which MUST be a defined in [encoding]. The default is ‘utf-8’.” From the same W3C recommendation 7.2 Encoding: “CSV files should be encoded using UTF-8, and should be in Unicode Normal Form C as defined in [UAX15]. If a CSV file is not encoded using UTF-8, the encoding should be specified through the charset parameter in the Content-Type header.”	ControlledVocabularyEntry	0..1
escapeCharacter	FormatDescription::PhysicalSegmentLayout	“The string that is used to escape the quote character within escaped cells, or null.” See https://www.w3.org/TR/tabular-data-model/#encoding. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect: “doubleQuote: A boolean atomic property that, if true, sets the escape character flag to ". If false, to \. The default is true.” From http://specs.frictionlessdata.io/csv-dialect/: “doubleQuote: controls the handling of quotes inside fields. If true, two consecutive quotes should be interpreted as one. Default = true”.	String	0..1	True
hasHeader	FormatDescription::PhysicalSegmentLayout	True if the file contains a header containing column names. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect “header: A boolean atomic property that, if true, sets the header row count flag to 1, and if false to 0, unless headerRowCount is provided, in which case the value provided for the header property is ignored. The default is true.” From http://specs.frictionlessdata.io/csv-dialect/ “header: indicates whether the file includes a header row. If true the first row in the file is a header row, not data. Default = true”.	Boolean	0..1	true
headerIsCaseSensitive	FormatDescription::PhysicalSegmentLayout	If True, the case of the labels in the header is significant. From the “CSV Dialect” specification (http://specs.frictionlessdata.io/csv-dialect/): “caseSensitiveHeader: indicates that case in the header is meaningful. For example, columns CAT and Cat should not be equated. Default = false.”	Boolean	0..1	false
headerRowCount	FormatDescription::PhysicalSegmentLayout	The number of lines in the header. From https://www.w3.org/TR/tabular-metadata/ 5.9 Dialect: “headerRowCount: A numeric atomic property that sets the header row count flag to the single provided value, which MUST be a non-negative integer. The default is 1.”	Integer	0..1	1
identifier	FormatDescription::PhysicalSegmentLayout	Identifier for objects requiring short- or long-lasting referencing and management.	Identifier	0..1
isDelimited	FormatDescription::PhysicalSegmentLayout	Indicates whether the data are in a delimited format. If “true,” the format is delimited, and the isFixedWidth property must be set to “false.” If not set to “true,” the property isFixedWidth must be set to “true.”	Boolean	1..1
isFixedWidth	FormatDescription::PhysicalSegmentLayout	Set to true if the file is fixed-width. If true, isDelimited must be set to false.	Boolean	1..1
lineTerminator	FormatDescription::PhysicalSegmentLayout	The strings that can be used at the end of a row, set by the lineTerminators property of a dialect description. The default is [CRLF, LF]. See the W3C Recommendation “Model for Tabular Data and Metadata on the Web” (https://www.w3.org/TR/tabular-data-model/#encoding). From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “lineTerminators: An atomic property that sets the line terminators flag to either an array containing the single provided string value, or the provided array. The default is [‘\r\n’, ‘\n’].” Also, from the “CSV Dialect” specification (http://specs.frictionlessdata.io/csv-dialect/): “lineTerminator: specifies the character sequence which should terminate rows. Default = \r\n.”	String	0..*	[CRLF, LF]
name	FormatDescription::PhysicalSegmentLayout	A linguistic signifier. Human understandable name (word, phrase, or mnemonic) that reflects the ISO/IEC 11179-5 naming principles. If more than one name is provided provide a context to differentiate usage.	ObjectName	0..*
nullSequence	FormatDescription::PhysicalSegmentLayout	A string indicating a null value. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 4.3: “null: the string or strings which cause the value of cells having string value matching any of these values to be null.” From the same source, Inherited 5.7: “null: An atomic property giving the string or strings used for null values within the data. If the string value of the cell is equal to any one of these values, the cell value is null. See Parsing Cells in [tabular-data-model] for more details. If not specified, the default for the null property is the empty string ‘’. The value of this property becomes the null annotation for the described column.”	String	0..1
overview	FormatDescription::PhysicalSegmentLayout	Short natural language account of the information obtained from the combination of properties and relationships associated with an object.	InternationalString	0..1
purpose	FormatDescription::PhysicalSegmentLayout	Intent or reason for the object/the description of the object.	InternationalString	0..1
quoteCharacter	FormatDescription::PhysicalSegmentLayout	“The string that is used around escaped cells, or null, set by the quoteChar property of a dialect description. The default is “.”. See W3C Recommendation “Model for Tabular Data and Metadata on the Web”, https://www.w3.org/TR/tabular-data-model/#parsing. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “quoteChar: An atomic property that sets the quote character flag to the single provided value, which MUST be a string or null. If the value is null, the escape character flag is also set to null. The default is ‘”’.” From the CSV Dialect specification (http://specs.frictionlessdata.io/csv-dialect/): “quoteChar: specifies a one-character string to use as the quoting character. Default = “.”	String	0..1	“
skipBlankRows	FormatDescription::PhysicalSegmentLayout	If the value is True, blank rows are ignored. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipBlankRows: A boolean atomic property that sets the skip blank rows flag to the single provided boolean value. The default is false.”	Boolean	0..1	false
skipDataColumns	FormatDescription::PhysicalSegmentLayout	The number of columns to skip at the beginning of the row. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipColumns: A numeric atomic property that sets the skip columns flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.” A value other than 0 will mean that the source numbers of columns will be different from their numbers.	Integer	0..1	0
skipInitialSpace	FormatDescription::PhysicalSegmentLayout	If the value is True, skip whitespace at the beginning of a line or following a delimiter. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipInitialSpace: A boolean atomic property that, if true, sets the trim flag to ‘start’ and if false, to false. If the trim property is provided, the skipInitialSpace property is ignored. The default is false.” From the CSV Dialect specification (http://specs.frictionlessdata.io/csv-dialect/): “skipInitialSpace: specifies how to interpret whitespace which immediately follows a delimiter; if false, it means that whitespace immediately after a delimiter should be treated as part of the following field. Default = true.”	Boolean	0..1	true
skipRows	FormatDescription::PhysicalSegmentLayout	Number of input rows to skip preceding the header or data. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “skipRows: A numeric atomic property that sets the skip rows flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.” A value greater than 0 will mean that the source numbers of rows will be different from their numbers.	Integer	0..1	0
tableDirection	FormatDescription::PhysicalSegmentLayout	Indicates the direction in which columns are arranged in each row. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.3.2: “tableDirection: An atomic property that MUST have a single string value that is one of ‘rtl’, ‘ltr’, or ‘auto’. Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. The value of this property becomes the value of the table direction annotation for all the tables in the table group. See Bidirectional Tables in [tabular-data-model] for details. The default value for this property is ‘auto’.”	TableDirectionValues	0..1	“Auto”
textDirection	FormatDescription::PhysicalSegmentLayout	Indicates the reading order of text within cells. From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) Inherited 5.7: “textDirection: An atomic property that MUST have a single string value that is one of ‘ltr’, ‘rtl’, ‘auto’ or ‘inherit’ (the default). Indicates whether the text within cells should be displayed as left-to-right text (ltr), as right-to-left text (rtl), according to the content of the cell (auto) or in the direction inherited from the table direction annotation of the table. The value of this property determines the text direction annotation for the column, and the text direction annotation for the cells within that column: if the value is inherit then the value of the text direction annotation is the value of the table direction annotation on the table, otherwise it is the value of this property. See Bidirectional Tables in [tabular-data-model] for details.”	TextDirectionValues	0..1
treatConsecutiveDelimitersAsOne	FormatDescription::PhysicalSegmentLayout	If the value is True, consecutive (adjacent) delimiters are treated as a single delimiter; if the value is False consecutive (adjacent) delimiters indicate a missing value.	Boolean	0..1
trim	FormatDescription::PhysicalSegmentLayout	Specifies which spaces to remove from a data value (start, end, both, neither). From the W3C Recommendation “Metadata Vocabulary for Tabular Data” (https://www.w3.org/TR/tabular-metadata/) 5.9 Dialect: “trim: An atomic property that, if the boolean true, sets the trim flag to true and if the boolean false to false. If the value provided is a string, sets the trim flag to the provided value, which MUST be one of ‘true’, ‘false’, ‘start’, or ‘end’. The default is true.”	TrimValues	0..1	True
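
To make the interplay of some of these attributes more concrete, here is a minimal Python sketch that applies a handful of them (isDelimited, isFixedWidth, skipRows, commentPrefix, skipBlankRows, delimiter, quoteCharacter, skipInitialSpace, hasHeader, nullSequence) using the standard `csv` module. The `Layout` dataclass and the `parse` function are hypothetical names, and the mapping of attributes onto `csv.reader` options is an assumption made for illustration, not a normative processing model. Fixed-width data, multi-line headers, and quoted cells containing line breaks are not handled.

```python
import csv
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layout:
    """Hypothetical holder for a few UnitSegmentLayout attributes."""
    is_delimited: bool = True
    is_fixed_width: bool = False
    delimiter: str = ","
    quote_character: str = '"'
    has_header: bool = True
    skip_rows: int = 0
    skip_blank_rows: bool = False
    skip_initial_space: bool = True
    comment_prefix: Optional[str] = None
    null_sequence: Optional[str] = None


def parse(text, layout):
    """Return (header, rows) for delimited text described by `layout`."""
    # isDelimited and isFixedWidth must not both be true or both be false.
    if layout.is_delimited == layout.is_fixed_width:
        raise ValueError("exactly one of isDelimited / isFixedWidth must be true")
    if not layout.is_delimited:
        raise NotImplementedError("fixed-width parsing is outside this sketch")

    # Splitting on line breaks is a simplification: quoted cells that
    # contain line breaks would be cut apart here.
    lines = text.splitlines()[layout.skip_rows:]
    if layout.comment_prefix:
        lines = [ln for ln in lines if not ln.startswith(layout.comment_prefix)]
    if layout.skip_blank_rows:
        lines = [ln for ln in lines if ln.strip()]

    reader = csv.reader(
        lines,
        delimiter=layout.delimiter,
        quotechar=layout.quote_character,
        skipinitialspace=layout.skip_initial_space,
    )
    rows = list(reader)
    header = rows.pop(0) if layout.has_header and rows else None
    # Cells equal to the null sequence become None; header cells are left alone.
    data = [[None if cell == layout.null_sequence else cell for cell in row]
            for row in rows]
    return header, data


text = "# exported sample\nPersonId; AgeYr; HeightCm\n1; 22; NA\n\n2; 45; 175\n"
layout = Layout(delimiter=";", comment_prefix="#",
                skip_blank_rows=True, null_sequence="NA")
header, rows = parse(text, layout)
print(header)
print(rows)
```

In this usage example the comment line and the blank line are dropped before parsing, the space after each semicolon is skipped, and the cell holding `NA` is returned as `None`.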

Associations

Direction	Association	Description	Multiplicity of UnitSegmentLayout	Package of Other Class	Other Class	Multiplicity of other class	Aggregation Kind	Inherited from
to	InstanceVariable has PhysicalSegmentLayout		0..*	Conceptual	InstanceVariable	0..*	shared	FormatDescription::PhysicalSegmentLayout
from	PhysicalSegmentLayout isDefinedBy Concept	The conceptual basis for the collection of members.	0..*	Conceptual	Concept	0..*	none	FormatDescription::PhysicalSegmentLayout
to	PhysicalLayoutRelationStructure structures PhysicalSegmentLayout		0..1	- own package -	PhysicalLayoutRelationStructure	0..*	none	FormatDescription::PhysicalSegmentLayout
to	PhysicalRecordSegment has PhysicalSegmentLayout		0..1	- own package -	PhysicalRecordSegment	0..*	none	FormatDescription::PhysicalSegmentLayout
from	PhysicalSegmentLayout formats LogicalRecord	Logical record physically represented by the physical layout.	0..*	- own package -	LogicalRecord	0..1	none	FormatDescription::PhysicalSegmentLayout
from	PhysicalSegmentLayout has ValueMapping		0..*	- own package -	ValueMapping	0..*	shared	FormatDescription::PhysicalSegmentLayout
from	PhysicalSegmentLayout has ValueMappingPosition		1..1	- own package -	ValueMappingPosition	0..*	composite	FormatDescription::PhysicalSegmentLayout

Syntax representations / encodings

All syntax representations except the Canonical XMI are provided as reference points for specific implementations, or for use as defaults if sufficient in the form presented.

Canonical XMI

Fragment for the class UnitSegmentLayout (entire model as XMI)

<packagedElement xmlns:StandardProfile="http://www.eclipse.org/uml2/5.0.0/UML/Profile/Standard"
                 xmlns:uml="http://www.eclipse.org/uml2/5.0.0/UML"
                 xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
                 xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout"
                 xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout"
                 xmi:type="uml:Class">
   <ownedComment xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout-ownedComment"
                 xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout-ownedComment"
                 xmi:type="uml:Comment">
      <annotatedElement xmi:idref="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout"/>
      <body>Definition
==========
Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.

Examples
========
A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.

The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::

   PersonId,AgeYr,HeightCm
   1,22,183,
   2,45,175,

Explanatory notes
=================
This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</body>
   </ownedComment>
   <name>UnitSegmentLayout</name>
   <generalization xmi:id="DDICDIModels-DDICDILibrary-Classes-FormatDescription-UnitSegmentLayout-generalization"
                   xmi:uuid="http://ddialliance.org/Specification/DDI-CDI/1.0/XMI/#UnitSegmentLayout-generalization"
                   xmi:type="uml:Generalization">
      <general xmi:idref="DDICDIModels-DDICDILibrary-Classes-FormatDescription-PhysicalSegmentLayout"/>
   </generalization>
</packagedElement>

XML Schema

Fragment for the class UnitSegmentLayout (entire XML Schema)

<xs:element name="UnitSegmentLayout"
            type="UnitSegmentLayoutXsdType"
            xml:id="UnitSegmentLayout">
  <!-- based on the UML class DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout -->
  <xs:annotation>
    <xs:documentation>Definition
          ==========
          Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.
          
          Examples
          ========
          A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.
          
          The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::
          
             PersonId,AgeYr,HeightCm
             1,22,183,
             2,45,175,
          
          Explanatory notes
          =================
          This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</xs:documentation>
  </xs:annotation>
</xs:element>
<xs:complexType name="UnitSegmentLayoutXsdType"
                xml:id="UnitSegmentLayoutXsdType">
  <xs:annotation>
    <xs:documentation>Definition
          ==========
          Description of unit-record ("wide") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.
          
          Examples
          ========
          A simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.
          
          The following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::
          
             PersonId,AgeYr,HeightCm
             1,22,183,
             2,45,175,
          
          Explanatory notes
          =================
          This is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable.</xs:documentation>
  </xs:annotation>
  <xs:complexContent>
    <xs:extension base="PhysicalSegmentLayoutXsdType">
      
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Ontology (Turtle)

Fragment for the class UnitSegmentLayout (main ontology)

# class UnitSegmentLayout
# based on the UML class DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout
cdi:UnitSegmentLayout
  a rdfs:Class, owl:Class, ucmis:Class;
  rdfs:label "UnitSegmentLayout";
  rdfs:comment "Definition\n==========\nDescription of unit-record (\"wide\") data sets, where each row in the data set provides the same group of values for variables all relating to a single unit.\n\nExamples\n========\nA simple spreadsheet. Commonly the first row of the table will contain variable names or descriptions.\n\nThe following CSV file has a rectangular layout and would import into a simple table in a spreadsheet::\n\n   PersonId,AgeYr,HeightCm\n   1,22,183,\n   2,45,175,\n\nExplanatory notes\n=================\nThis is the classic rectangular data table used by most statistical packages, with rows/cases/observations and columns/variables/measurements. Each cell (DataPoint) in the table is the intersection of a Unit (row) and an InstanceVariable. Each logical column will contain data relating to the values for a single variable."@en;
  rdfs:subClassOf cdi:PhysicalSegmentLayout;
.

JSON-LD

Fragment for the class UnitSegmentLayout (main JSON-LD)

{
  "@context": [
    "PhysicalSegmentLayout.jsonld",
    {
      "cdi": "http://ddialliance.org/Specification/DDI-CDI/1.0/RDF/",
      "xsd": "http://www.w3.org/2001/XMLSchema#",
      "UnitSegmentLayout": "cdi:UnitSegmentLayout",
      
      " comment ": "tag:json-should-support-trailing-commas" 
    }
  ],
  "generatedBy": "This code was generated by the Eclipse Acceleo project UCMIS M2T on 2024-09-23 21:52:59.",
  "basedOn": "based on the UML data type DDICDIModels::DDICDILibrary::Classes::FormatDescription::UnitSegmentLayout"
}