VSAM Interview Questions

What is VSAM?

Virtual Storage Access Method

What are the types of VSAM datasets?

Entry sequenced datasets (ESDS), key sequenced datasets (KSDS), relative record datasets (RRDS) and linear datasets (LDS).

How are records stored in an ESDS, entry sequenced dataset?

They are stored in the order in which they are added to the file, without regard to the contents of the records.

What is a CI, control interval?

A control interval is the unit of information that VSAM transfers between virtual and auxiliary storage.

What are the distinctive features of a KSDS, key sequenced dataset?

The index and the distributed free space.

What is a CA, control area?

A group of control intervals makes up a control area.

What is a cluster?

A cluster is the combination of the index, sequence set and data portions of the dataset. The operating system gives a program access to the cluster, i.e. to all parts of the dataset simultaneously.

What is the catalog?

The catalog contains the names of all datasets, VSAM and non-VSAM. It is used to access these datasets.

What is an alternate index?

An AIX is a file that allows access to a VSAM dataset by a key other than the primary one.

What is a path?

A path is a file that allows you to access a file by alternate index - the path provides an association between the AIX and the base cluster.

What is the upgrade set?

The upgrade set is the list of all AIXes that VSAM must maintain for a specific base cluster, so that when data in the base cluster is updated, the AIX files are also updated.

What is free space?

Free space is reserved within the data component of a KSDS to accommodate inserting new records.

What is a VSAM split?

If there isn't enough space in the control interval for a new record, VSAM performs a control interval split by moving some of its records to a free control interval. If there is no free control interval, VSAM performs a control area split by allocating a new control area and moving half of the control intervals to it.
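
The control interval split described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: a control area is a list of control intervals, each holding a fixed number of keys. The names `insert_key` and `CI_CAPACITY` are hypothetical; real VSAM works in bytes and also maintains the index.

```python
import bisect

CI_CAPACITY = 4  # hypothetical: maximum records per control interval


def insert_key(control_area, key):
    """Insert key into a toy control area (a list of sorted lists).

    Returns "ok", "ci-split" or "ca-split-needed".
    """
    # Order the non-empty CIs by their highest key, as the index would.
    used = sorted((ci for ci in control_area if ci), key=lambda ci: ci[-1])
    # Target is the first CI whose highest key is >= key, else the last used CI.
    fallback = used[-1] if used else control_area[0]
    target = next((ci for ci in used if ci[-1] >= key), fallback)
    if len(target) < CI_CAPACITY:
        bisect.insort(target, key)
        return "ok"
    free = next((ci for ci in control_area if not ci), None)
    if free is None:
        # No free CI left: a control area split would be required.
        return "ca-split-needed"
    # CI split: move the upper half of the records to the free CI.
    half = len(target) // 2
    free.extend(target[half:])
    del target[half:]
    bisect.insort(target if key <= target[-1] else free, key)
    return "ci-split"
```

For example, inserting key 25 into `[[10, 20, 30, 40], []]` triggers a CI split and leaves `[[10, 20], [25, 30, 40]]`.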

What is the base cluster?

The base cluster consists of the data component and the index component for the primary index of a KSDS.

Do primary key values have to be unique? Do alternate key values have to be unique?

Primary key values must be unique; alternate key values need not be.

In the COBOL SELECT statement for a KSDS what are the three possibilities for ACCESS?

ACCESS can be SEQUENTIAL, RANDOM or DYNAMIC.

What is the COBOL RECORD KEY clause?

The RECORD KEY in the SELECT clause identifies the file's primary key as it will be known to the program.

What is the purpose of the FILE STATUS clause in the SELECT statement?

The FILE STATUS clause identifies the field that VSAM uses to provide information about each I/O operation for the file.

Explain the meaning and syntax for the START command.

The START command is used to position the file at a record other than the next one, so that a following READ retrieves it. A value must be moved into the RECORD KEY first. The KEY clause is optional, but it can be used to specify a relational (equal, less than, etc.) operator.

What is the meaning of dynamic processing?

It's rarely used. It means one program uses both sequential and random processing for a VSAM KSDS file.

Name some common VSAM error conditions and codes.

They are end of file (10), duplicate key (22), record not found (23), an unspecified VSAM error (90), a logic error such as an I/O request against a file that is not open (92) and resource not available (93).

What is the VSAM-code field?

It is a COBOL II enhancement to VSAM batch processing that expands the FILE STATUS field. It is defined in WORKING-STORAGE as a six-byte group item with three two-byte elements: the normal return code, the function code and the feedback code.

What is a VSAM slot?

A relative record dataset (RRDS) consists of a specified number of areas called slots. Each slot is identified by a relative record number (RRN) which indicates its relative position in the file.
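
The slot idea can be sketched with a small model. This is an illustration only, assuming relative record numbers start at 1; the class name `ToyRRDS` and its methods are hypothetical and are not part of any VSAM interface.

```python
class ToyRRDS:
    """Toy model of an RRDS: a fixed number of slots addressed by RRN."""

    def __init__(self, slot_count):
        # None marks an empty slot.
        self.slots = [None] * slot_count

    def _index(self, rrn):
        # Assumption: RRNs are 1-based, so RRN 1 maps to list index 0.
        if not 1 <= rrn <= len(self.slots):
            raise IndexError(f"RRN {rrn} is outside 1..{len(self.slots)}")
        return rrn - 1

    def write(self, rrn, record):
        i = self._index(rrn)
        if self.slots[i] is not None:
            raise ValueError(f"slot {rrn} is already occupied")
        self.slots[i] = record

    def read(self, rrn):
        # Returns None when the slot is empty.
        return self.slots[self._index(rrn)]

    def delete(self, rrn):
        # Deleting empties the slot; the slot itself remains in place.
        self.slots[self._index(rrn)] = None
```

Writing to RRN 3 of a five-slot file fills only that slot; slots 1, 2, 4 and 5 stay empty until they are written.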

What is the utility program closely associated with VSAM?

IDCAMS, the access method services utility.

What are the three levels of definition for the VSAM DEFINE?

They are the cluster, data and index levels.

What is the significance of the SHAREOPTIONS parameter?

It specifies how the file may be shared between jobs and between batch and CICS environments.

What is File Status in VSAM?

The FILE STATUS clause of the FILE-CONTROL paragraph allows for each file to be associated with a file status key. If the FILE STATUS clause is specified for a given file, a value indicating the status of each I/O operation against that file is placed in the associated file status key. This value is stored in the file status key as soon as the I/O operation is completed.

What's an LDS (Linear Data Set) and what's it used for?

An LDS is a VSAM dataset in name only. It has unstructured, fixed-size 4K (4096-byte) CIs that contain no control fields, so from VSAM's standpoint they contain no logical records. There is no free space, and no access from COBOL. LDSs are used by DB2 and IMS Fast Path. An LDS is essentially a table of data maintained on disk. The 'table entries' must be created by a user program and can only be logically accessed by a user program. When accessed, the entire LDS must be mapped into storage, and data is then reached with base and displacement type processing.

What is Control Interval, Control Area?

A control interval is analogous to a physical block for QSAM files. It is the unit of I/O. Its size must be between 512 bytes and 32K, and is usually either 2K or 4K. A larger control interval increases performance for sequential processing, while the reverse is true for random access. Under CICS, when a record is locked, the entire CI is locked.

A control area is a group of control intervals. The CA is used during allocation. CA size is calculated based on the allocation type (cylinders, tracks or records) and can be at most 1 cylinder.

What is FREESPACE?

FREESPACE is coded in the DEFINE as FREESPACE(ci ca), where ci is the percentage of each control interval to be left free for insertions and ca is the percentage of control intervals in each control area to be left empty.
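
The arithmetic behind the two percentages can be sketched as follows. This is a rough estimate under stated assumptions, not VSAM's exact algorithm: the function name `freespace_estimate` is hypothetical, results are rounded down, and the control fields inside each CI are ignored.

```python
def freespace_estimate(ci_size, cis_per_ca, ci_pct, ca_pct):
    """Estimate the free space reserved by FREESPACE(ci_pct ca_pct).

    Returns (free bytes per control interval, free control intervals per
    control area). Rounding down and ignoring CI control fields are
    simplifying assumptions.
    """
    for pct in (ci_pct, ca_pct):
        if not 0 <= pct <= 100:
            raise ValueError("percentages must be between 0 and 100")
    free_bytes_per_ci = ci_size * ci_pct // 100
    free_cis_per_ca = cis_per_ca * ca_pct // 100
    return free_bytes_per_ci, free_cis_per_ca
```

With a 4096-byte CI, a hypothetical 180 CIs per CA and FREESPACE(20 10), the estimate is 819 free bytes in each CI and 18 empty CIs in each CA.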

Would you specify FREESPACE for an ESDS?

No. Records cannot be inserted into an ESDS, and a rewritten record must be the same length as the original, so free space would never be used.

What is SHAREOPTIONS?

SHAREOPTIONS is a parameter in the DEFINE that specifies how an object can be shared among users. It is coded as SHAREOPTIONS(a b), where a is the cross-region share option, i.e. how two or more jobs on a single system can share the file, and b is the cross-system share option, i.e. how two or more jobs on different MVS systems can share the file. The usual value is (2 3).

What is the meaning of each of the values in SHAREOPTIONS(2 3)?

A value of 2 for cross region means that the file can be processed simultaneously by multiple users provided only one of them is an updater. A value of 3 for cross system means that any number of jobs can process the file for input or output.

What happens when you open an empty VSAM file in a COBOL program for input?

A VSAM file that has never contained a record is treated as unavailable. Attempting to open it for input will fail. An empty file can be opened for output only. When you open it for output, COBOL writes a dummy record to the file and then deletes it.

How do you initialize a VSAM file before any operation? A VSAM file with an alternate index?

Write a dummy program that just opens the file for output and then closes it.

What does a file status of 02 on a VSAM indicate?

Duplicate alternate key. It can happen on both input and output operations.

How do you calculate record size of an alternate cluster?

Unique Case: 5 + ( alt-key-length + primary-key )

Non unique Case: 5 + ( alt-key-length + n * primary-key ) where n = number of duplicate records for the alternate key
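
The two formulas above can be checked with a short helper. This is a minimal sketch; the function name `aix_record_size` and its parameters are hypothetical, and lengths are in bytes.

```python
def aix_record_size(alt_key_length, primary_key_length, duplicates=1):
    """Record size of an alternate index record, per the formulas above.

    duplicates is the number of base records sharing one alternate key;
    use 1 for a unique alternate key.
    """
    if duplicates < 1:
        raise ValueError("duplicates must be at least 1")
    # 5 bytes of control information + alternate key + one primary key
    # for each base record that carries this alternate key.
    return 5 + alt_key_length + duplicates * primary_key_length
```

For a 10-byte alternate key and an 8-byte primary key, the unique case gives 23 bytes, and the non-unique case with up to 5 duplicates gives 55 bytes.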

What is the difference between sequential files and ESDS files?

Sequential (QSAM) files can be created on tape while ESDS files cannot. Also, you can have ALTINDEX for an ESDS while no such facility exists for QSAM files.

How do you define a GDG?

Use the DEFINE GENERATIONDATAGROUP command. In the same IDCAMS step, another dataset must be defined whose DCB parameters are used when new generations of the GDG are created. This dataset is known as the model dataset. The dataset name of the model dataset must be the same as that of the GDG, so use a DISP of KEEP rather than CATLG and also specify SPACE=(TRK,0).

If FSPC(100 100) is specified does it mean that both the control interval and control area will be left empty because 100 % of both CI and ca are specified to be empty?

No, they would not be left empty. One record will be written in each CI, and one CI will be written in each CA.

Do all versions of the GDG have to be of the same record length?

No, the DCB of the model dataset can be overridden when you allocate new versions.

How are different versions of a GDG named?

base-file-name.GnnnnV00, where nnnn is the generation number (0001 to 9999). nnnn is 0001 for the first generation, and a GDG can hold up to 255 generations.

Suppose 3 generations of a GDG exist. How would you reference the 1st generation in the JCL?

Use GDG name(-2).
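
Relative generation numbers count back from the most recent generation. The sketch below shows how a reference such as (-2) resolves; the function name `resolve_generation` and the dataset names are hypothetical.

```python
def resolve_generation(generations, relative):
    """Resolve a relative GDG reference against existing generations.

    generations lists the existing generations, oldest first.
    relative is 0 for the current generation, -1 for the previous one,
    and so on. Positive numbers create new generations and are rejected.
    """
    if relative > 0:
        raise ValueError("positive relative numbers create new generations")
    index = len(generations) - 1 + relative
    if index < 0:
        raise LookupError(f"generation ({relative}) does not exist")
    return generations[index]


# Hypothetical GDG with three generations, oldest first.
existing = ["MY.GDG.G0001V00", "MY.GDG.G0002V00", "MY.GDG.G0003V00"]
oldest = resolve_generation(existing, -2)
current = resolve_generation(existing, 0)
```

Here `oldest` is the first of the three generations and `current` is the third.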

What additional information should you give in the DD statement when defining the next generation of a GDG?

Give (+1) as the generation number, give (NEW,CATLG) for DISP and give the SPACE parameter. You can also give the DCB parameter if you want to override the DCB of the model dataset.

Assuming that the DEFINE JCL is not available, how do you get info about a VSAM file's organization?

Use the LISTCAT command.

During processing of a VSAM file, some system error occurs and it is subsequently unusable. What do you do?


How do you fix the problem associated with VSAM out of space condition?

Define new VSAM dataset allocated with more space.
Use IDCAMS to REPRO the old VSAM file to new VSAM dataset.
Use IDCAMS to ALTER / rename the old VSAM dataset or use IDCAMS to DELETE the old VSAM dataset.
Use IDCAMS to ALTER / rename the new VSAM dataset to the name of the original VSAM dataset.

What is the meaning of VSAM RETURN-CODE 28?

Out of space condition is raised.

How many alternate indexes can you have on a dataset?


Is it slower if you access a record through ALT INDEX as compared to Primary INDEX?

Yes, because the alternate key first locates the primary key, which in turn locates the actual record. This needs twice the number of I/Os.

What are the RECOVERY and SPEED parameters in the DEFINE CLUSTER command?

RECOVERY (the default) and SPEED are mutually exclusive. RECOVERY preformats the control areas during the initial dataset load; if the job fails, you can restart, but you must have a recovery routine already written to restart the job. SPEED does not preformat the CAs. It is recommended that you specify SPEED to speed up your initial data load.

Describe SHAREOPTIONS parameter (SHR) in Define Cluster command.

It defines the cross-region and cross-system sharing capabilities of the dataset. The syntax is SHR(CRvalue CSvalue), where the values mean:
1 means multiple read OR single write (read integrity).
2 means multiple read AND single write (write integrity).
3 means multiple read AND multiple write.
4 is the same as 3, but refreshes the buffer with every random access.
The default is SHR(1 3).

What does the KEYRANGES parameter in the DEFINE CLUSTER command do?

It spreads a large dataset across several volumes according to the key ranges specified.

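What are the optional parameters for the input dataset when loading an empty cluster with data records?

1) FROMADDRESS(address) where 'address' specifies the RBA at which copying starts
2) TOADDRESS(address) where 'address' specifies the RBA at which copying stops
3) FROMNUMBER(rrn) where 'rrn' specifies the relative record number of the first RRDS record to copy
4) TONUMBER(rrn) where 'rrn' specifies the relative record number of the last RRDS record to copy
5) FROMKEY(key) where 'key' specifies the key of the first input record to copy
6) TOKEY(key) where 'key' specifies the key of the last input record to copy
7) SKIP(number) where 'number' specifies the number of input records to skip before copying starts
8) COUNT(number) where 'number' specifies the number of records to copy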

Is a delete operation possible in an ESDS?

No, a delete operation is not possible in a VSAM ESDS.

How many buffers are allotted to VSAM KSDS and ESDS?

By default an ESDS is allotted 2 data buffers, and a KSDS is allotted 2 data buffers and 1 index buffer. Each buffer is about 4K.

What's the biggest disadvantage of using a VSAM dataset?


How many times can secondary space be allocated?


What is the RRN for the first record in an RRDS?