Basic concepts and terminology (for Beginners)


VSAM is built on a small set of basic concepts and terms defined by its designers. Knowing them is helpful before learning more about VSAM. They are -

  • Clusters
  • Components
  • Logical Record
  • Physical Record
  • Control Interval
  • Control Area
  • Spanned Records
  • Alternate Indexes
  • Sphere
  • Splits

Cluster -


A VSAM cluster is the logical definition of a VSAM dataset and has one or two of the following components -

  • The data component contains the data records.
  • The index component, present in a key-sequenced cluster (KSDS), contains the index records.

The IDCAMS (Access Method Services) DEFINE CLUSTER command is used to define VSAM datasets (clusters) with their DATA and INDEX components.

For example - Assume MATEGJ.TEST.VSAM is a VSAM KSDS that has been created.

MATEGJ.TEST.VSAM          -   Cluster
MATEGJ.TEST.VSAM.DATA     -   Data Component
MATEGJ.TEST.VSAM.INDEX    -   Index Component

Logical record -


  • A logical record is the unit of data that an application program stores in or retrieves from a VSAM dataset. It is the program's view of the data, independent of how the data is physically stored.
  • The application programmer designs the logical record layout, and the program uses it to access and process the data in the dataset through I/O operations.
  • A logical record can be fixed-length or variable-length, and VSAM supports both types of records depending on the file organization type (file type).

For example - A VSAM file with a 47-byte logical record is defined as follows -

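DATA DIVISION.
FILE SECTION.
*> An FD entry is a single sentence: the clauses follow the file name
*> and only the last clause ends with a period.
*> BLOCK CONTAINS and RECORDING MODE describe QSAM files and have
*> no effect for a VSAM file.
FD  FILE-1
    RECORD CONTAINS 47 CHARACTERS
    BLOCK CONTAINS 470 CHARACTERS
    DATA RECORD IS RECORD-1
    RECORDING MODE IS F.

*> 3-byte key followed by 44 bytes of data = 47 bytes in total.
01  RECORD-1.
    05 FILE-KEY     PIC X(03).
    05 FILE-DATA    PIC X(44).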

In the above example, RECORD-1 is the logical record used to process the physical VSAM file.

Physical record -


  • A physical record is the unit of data that is actually stored on the disk space allocated for the file.
  • A physical record can hold one or more logical records.

Control Interval -


  • The control interval (CI) is the fundamental building block of every VSAM dataset.
  • A CI is a contiguous area of DASD (Direct Access Storage Device) space used to store records and control information about those records.
  • A CI is the unit of data transferred in an I/O operation. It consists of one or more physical blocks that are read or written together.
  • The CI size can range from 512 bytes to 32 KB.

The CI components that influence the choice of CI size are -

  • The number of logical records stored in the CI.
  • Free space for inserting records.
  • Control information - Control information is a combination of two fields -
    • Control Interval Definition Field (CIDF) - A CI has only one CIDF. It is a 4-byte field that contains information about the location and amount of free space in the CI.
    • Record Definition Fields (RDFs) - A CI can have several RDFs. Each RDF is a 3-byte field that describes the length of a record, or how many adjacent records share the same length.

[Figure: CI general format]
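
As a rough illustration of how these components add up, the sketch below estimates how many fixed-length records fit in one CI. It is written in Python purely for the arithmetic. The function name `fixed_records_per_ci` and the `free_space_pct` parameter are hypothetical, and the sketch assumes that a run of equal-length records is described by two RDFs (one for the length, one for the count). The free-space handling is simplified, so the result is an estimate rather than what VSAM would load exactly.

```python
CIDF_BYTES = 4  # one CIDF per CI (size taken from the text above)
RDF_BYTES = 3   # each RDF is 3 bytes (size taken from the text above)


def fixed_records_per_ci(ci_size, record_length, free_space_pct=0):
    """Estimate how many fixed-length records fit in one CI.

    Assumption: adjacent records of equal length share two RDFs,
    so the control information is 4 + 2 * 3 = 10 bytes.
    """
    if not 512 <= ci_size <= 32768:
        raise ValueError("CI size must be between 512 bytes and 32 KB")
    control_bytes = CIDF_BYTES + 2 * RDF_BYTES
    # Bytes deliberately left empty for later insertions (simplified).
    reserved = ci_size * free_space_pct // 100
    usable = ci_size - control_bytes - reserved
    return max(usable // record_length, 0)


# The 47-byte record used earlier, in a 4 KB CI.
no_free_space = fixed_records_per_ci(4096, 47)
ten_pct_free = fixed_records_per_ci(4096, 47, free_space_pct=10)
print(no_free_space, ten_pct_free)
```

Under these assumptions, reserving free space lowers the number of records loaded per CI, which is the trade-off behind the CI size decision.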

Control Area -


  • The control area (CA) is a concept unique to VSAM.
  • A control area (CA) is a fixed-length contiguous area of DASD space made up of two or more control intervals (CIs).
  • Generally, the maximum size of a CA is one cylinder, and the minimum size is one track.
  • The CA size is determined when the dataset is defined.

[Figure: CA general format]

Spanned Records -


  • Spanned records are logical records that are larger than the CI size.
  • To allow spanned records, specify the SPANNED attribute on DEFINE CLUSTER when defining the dataset. A spanned record can then be stored across multiple control intervals (CIs).
  • If spanned records are used in a KSDS, the primary key must lie within the first CI of the record.

[Figure: Spanned record example]

Notes -

  • A spanned record always begins on a control interval (CI) boundary and fills one or more CIs within a single CA.
  • A spanned record cannot share a CI with any other record.
  • The maximum size of a spanned record is limited by the size of the control area (CA), because the record cannot cross a CA boundary.
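
The notes above can be turned into a small calculation. The Python sketch below estimates how many CIs a spanned record occupies and whether it fits in one CA. The function names are hypothetical, and the sketch assumes that every CI holding a segment of a spanned record carries 10 bytes of control information (one 4-byte CIDF and two 3-byte RDFs).

```python
import math

# Assumption: each segment's CI holds one CIDF (4 bytes) and two RDFs (3 bytes each).
CONTROL_BYTES_PER_CI = 10


def cis_needed(record_length, ci_size):
    """Number of CIs occupied by one record of the given length."""
    data_per_ci = ci_size - CONTROL_BYTES_PER_CI
    return math.ceil(record_length / data_per_ci)


def fits_in_ca(record_length, ci_size, cis_per_ca):
    """A spanned record must fit within the CIs of a single CA."""
    return cis_needed(record_length, ci_size) <= cis_per_ca


# A 10,000-byte record with a 4 KB CI size.
print(cis_needed(10000, 4096))
print(fits_in_ca(10000, 4096, cis_per_ca=2))
```

Because a spanned record does not share its CIs, the unused bytes in its last CI stay empty, so large spanned records can waste space when the record length is just over a multiple of the usable CI size.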

Alternate Indexes -


  • Alternate indexes (AIXs) allow access to the logical records sequentially or directly using alternate key fields (other than the primary key field).
  • Each alternate index is itself a KSDS cluster with DATA and INDEX components.
  • The IDCAMS utility is used to define and build an AIX.
  • An AIX is defined and made usable in three steps -
    • Define AIX - Defines the alternate index (IDCAMS DEFINE ALTERNATEINDEX command).
    • Build AIX - Populates the alternate index from the base cluster (IDCAMS BLDINDEX command).
    • Define Path - Defines the path that connects the alternate index to the base cluster so that records can be accessed by alternate key (IDCAMS DEFINE PATH command).

[Figure: AIX example]

Notes -

  • Any field in the base cluster record other than the primary key can be used as an alternate key.
  • An alternate key can have duplicate values.
  • The alternate index is in ascending order of alternate key. Within each alternate key value, the primary keys are in ascending order.
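
The relationship between alternate keys and primary keys can be pictured with a small in-memory model. The Python sketch below is only an analogy for what the build step produces, not how VSAM stores an AIX. The function names and the sample employee records are made up for illustration.

```python
from collections import defaultdict


def build_alternate_index(base_records, alt_key_of):
    """Map each alternate key to the primary keys that carry it.

    base_records: dict of primary key -> record (stands in for the base cluster).
    alt_key_of:   function that extracts the alternate key from a record.
    """
    index = defaultdict(list)
    # Visiting primary keys in ascending order keeps each list ascending.
    for primary_key in sorted(base_records):
        index[alt_key_of(base_records[primary_key])].append(primary_key)
    # Return the alternate keys in ascending order as well.
    return dict(sorted(index.items()))


def read_by_alternate_key(base_records, aix, alt_key):
    """Follow the alternate key to the base records, like reading through a path."""
    return [base_records[pk] for pk in aix.get(alt_key, [])]


# Hypothetical base cluster: primary key -> (employee id, department).
base = {
    "003": ("003", "SALES"),
    "001": ("001", "SALES"),
    "002": ("002", "HR"),
}
aix = build_alternate_index(base, alt_key_of=lambda rec: rec[1])
print(aix)
print(read_by_alternate_key(base, aix, "SALES"))
```

Here the department acts as a non-unique alternate key, so one alternate key value points to several primary keys.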

Sphere -


  • A sphere is a base cluster together with its associated clusters.
  • The associated clusters are the base cluster's alternate indexes (AIXs), i.e., a sphere is the base cluster plus all its AIXs.

[Figure: Sphere example]

Splits -


A CI split occurs when a CI does not have enough free space to satisfy either of the following requests -

  • Inserting a new record into the CI.
  • Lengthening an existing record.

If the same CA contains a free CI, approximately half of the records in the full CI are moved to that free CI. In a CI split, both CIs belong to the same CA.

[Figure: CI split example]
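
The CI split rule can also be followed step by step in a simplified model. In the Python sketch below, a CA is a list of CIs, each CI is a sorted list of keys, and an empty list is a free CI. The function name `insert_record`, the record-count capacity, and the return strings are assumptions made for the illustration. Real VSAM works with bytes and an index component rather than Python lists.

```python
import bisect


def insert_record(ca, key, ci_capacity):
    """Insert a key into a CA modelled as a list of CIs (sorted key lists).

    Returns "inserted", "ci_split", or "ca_split_needed".
    """
    used = [i for i, ci in enumerate(ca) if ci]
    if not used:
        ca[0].append(key)
        return "inserted"
    # Stand-in for the index: pick the CI with the lowest high key >= key,
    # or the CI with the highest keys if the new key is beyond all of them.
    candidates = [i for i in used if ca[i][-1] >= key]
    if candidates:
        target = min(candidates, key=lambda i: ca[i][-1])
    else:
        target = max(used, key=lambda i: ca[i][-1])
    ci = ca[target]
    if len(ci) < ci_capacity:
        bisect.insort(ci, key)
        return "inserted"
    free = next((i for i, other in enumerate(ca) if not other), None)
    if free is None:
        return "ca_split_needed"  # no free CI left in this CA
    # CI split: move roughly half of the records to the free CI.
    bisect.insort(ci, key)
    half = len(ci) // 2
    ca[free] = ci[half:]
    del ci[half:]
    return "ci_split"


ca = [[10, 20, 30], [40, 50, 60], []]  # two full CIs and one free CI
first = insert_record(ca, 25, ci_capacity=3)
second = insert_record(ca, 22, ci_capacity=3)
third = insert_record(ca, 45, ci_capacity=3)
print(first, second, third, ca)
```

In this run the first insert splits a CI, the second fits into the space the split created, and the third finds a full CI with no free CI left in the CA.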

Similarly, a CA split occurs when a CI split is needed but no CI in the CA has enough free space. Approximately half of the CIs in the full CA are moved to free CIs in a different CA.

[Figure: CA split example]

Splits degrade performance. CI and CA splits occur in KSDS and VRRDS datasets.