Memory Allocation with COBOL

Tables (arrays) held in static memory are the most common data structure in the COBOL modules of an application system. When a storage abend (S0C4) occurs because a table's dimension has been exceeded, the developer has to find every place in the legacy system where that table is declared and used, and then increase the dimension by a sufficient amount. What counts as sufficient is often unclear. All the related modules then need to be recompiled and tested. The bottom line is that exceeding a table's dimension can be a very costly maintenance item.

Some common questions come to a programmer's mind when using a table in a COBOL module:

Is it ever possible to determine accurately a maximum table size that does not waste memory?
Will it be large enough to hold the additional data items that accumulate over the lifetime of a legacy system?
Doesn't exceeding a table dimension crop up inevitably and repeatedly over time? And isn't it a pain to repair?

This is where dynamic memory allocation comes in: it offers a more efficient and cost-effective alternative to static allocation.

Dynamic Memory Allocation

Dynamic memory allocation is when an executing program requests that the operating system give it a block of main memory, typically in order to add a node to a data structure. A dynamically allocated area uses just the amount of memory necessary to hold the data items present at run time, and is thus efficient with regard to memory usage and OS paging.

Advantage of Dynamic Memory Allocation

Consider a COBOL table defined to occur 1000 times in order to hold the expected number of data items plus an allowance for future growth. The difference between the maximum size of the table (1000 * size of a table item) and the space actually used (n * size of a table item, where n is the actual number of items) is wasted memory. Such wasted memory has implications for paging by the operating system and can degrade overall system performance. The bigger the declared area for the table, the more pages of memory are needed to hold it.

Take an example of two subroutines, each using a 1000-item table. With static memory allocation the program requires 2 * 1000 * the size of a table element. With dynamic memory allocation, and provided the two tables are not needed at the same time, memory is allocated for the first table and freed when that table is no longer needed, so that it can be reused for the second table. The maximum memory in use at any point in time is then 1000 * the size of a table element. Hence the dynamically allocated tables use only half as much memory as the statically allocated ones.
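The arithmetic behind the wasted-memory argument can be checked with a short sketch. The table layout, the 80-byte item size, and the figure of 150 items in use are illustrative assumptions, not values taken from a real system.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WASTEMEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Static table sized for the worst case (1000 items).
       01  WS-TABLE.
           05  WS-ITEM          OCCURS 1000 TIMES.
               10  WS-ITEM-DATA PIC X(80).
      * Hypothetical number of items actually in use.
       01  WS-USED-COUNT        PIC 9(4) COMP VALUE 150.
       01  WS-TOTAL-BYTES       PIC 9(9) COMP.
       01  WS-USED-BYTES        PIC 9(9) COMP.
       01  WS-WASTED-BYTES      PIC 9(9) COMP.
       PROCEDURE DIVISION.
      * Storage reserved by the static declaration.
           MOVE LENGTH OF WS-TABLE TO WS-TOTAL-BYTES
      * Storage the data really needs: n * size of one item.
           COMPUTE WS-USED-BYTES =
               WS-USED-COUNT * LENGTH OF WS-ITEM (1)
           COMPUTE WS-WASTED-BYTES =
               WS-TOTAL-BYTES - WS-USED-BYTES
           DISPLAY 'ALLOCATED: ' WS-TOTAL-BYTES
           DISPLAY 'IN USE:    ' WS-USED-BYTES
           DISPLAY 'WASTED:    ' WS-WASTED-BYTES
           GOBACK.
```

With these assumed values the table reserves 80,000 bytes, of which 12,000 are in use and 68,000 sit idle for the whole run.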

In short, dynamically allocated memory holds only the data items actually present, which keeps both memory usage and operating system paging low.

Some commonly used dynamic allocation concepts and techniques in COBOL are:

Pointer
Heap

POINTER

A pointer variable is a variable that stores a memory address. This address can be changed arbitrarily, enabling the contents of any accessible memory location to be addressed and manipulated directly. For example, when a program requests additional memory from the OS at run time, the starting address of the new, dynamically allocated area is returned. Since the value stored in a preexisting compiler-allocated pointer variable can be altered, the running program can preserve the address passed back by the OS.

However, if the pointer variable was defined within an inner block, it will be destroyed when the block terminates. The location of the dynamically allocated memory is then lost and the memory cannot subsequently be accessed or later freed.

Consider going to a library to find a book on a specific subject. Most likely you will be able to use an electronic reference or a catalog to determine the title and author of the book you want. Since books are typically shelved by category, and within each category sorted by author's name, it is a fairly straightforward and painless process to pick your book from the shelves. Now suppose instead that the library had no organized shelves, and the books were stored in large bags lining both sides of the room, each arbitrarily filled with books that may or may not have anything to do with one another. It would take hours, or even days, to find the book you needed, a comparative eternity. This is how software runs when data is not stored in an efficient format appropriate to the application.

When setting up data structures such as linked lists, queues, and trees, pointers are needed to manage how the structure is implemented and controlled. Typical examples are start pointers, end pointers, and stack pointers.
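As a minimal sketch of a start pointer in use, the program below builds a three-node linked list and walks it. It assumes a z/OS Language Environment runtime and obtains each node with CEEGTST, the heap service covered in the heap section of this article. The program name, node layout, and data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LNKLIST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Start pointer of the list; NULL marks an empty list.
       01  WS-HEAD-PTR      USAGE POINTER VALUE NULL.
       01  WS-NEW-PTR       USAGE POINTER VALUE NULL.
      * Heap id 0 selects the user heap.
       01  WS-HEAP-ID       PIC S9(9) BINARY VALUE 0.
       01  WS-NODE-SIZE     PIC S9(9) BINARY.
       01  WS-I             PIC 9(4).
      * 12-byte feedback code; first 8 bytes zero means success.
       01  WS-FC.
           05  WS-FC-COND   PIC X(8).
               88  FC-OK    VALUE LOW-VALUES.
           05  FILLER       PIC X(4).
       LINKAGE SECTION.
      * Node layout, mapped onto heap storage via ADDRESS OF.
       01  LS-NODE.
           05  LS-NEXT-PTR  USAGE POINTER.
           05  LS-VALUE     PIC 9(4).
       PROCEDURE DIVISION.
           MOVE LENGTH OF LS-NODE TO WS-NODE-SIZE
      * Insert three nodes, each at the front of the list.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 3
               CALL 'CEEGTST' USING WS-HEAP-ID WS-NODE-SIZE
                                    WS-NEW-PTR WS-FC
               IF NOT FC-OK
                   DISPLAY 'CEEGTST FAILED'
                   GOBACK
               END-IF
               SET ADDRESS OF LS-NODE TO WS-NEW-PTR
               MOVE WS-I TO LS-VALUE
               SET LS-NEXT-PTR TO WS-HEAD-PTR
               SET WS-HEAD-PTR TO WS-NEW-PTR
           END-PERFORM
      * Walk from the start pointer to the NULL end marker.
           SET ADDRESS OF LS-NODE TO WS-HEAD-PTR
           PERFORM UNTIL ADDRESS OF LS-NODE = NULL
               DISPLAY 'NODE VALUE: ' LS-VALUE
               SET ADDRESS OF LS-NODE TO LS-NEXT-PTR
           END-PERFORM
           GOBACK.
```

Because each node is inserted at the front, the walk visits the nodes in reverse order of insertion. The sketch does not free the nodes individually; a longer-running program would release each one with CEEFRST.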

These pointers can either be

1. Absolute pointer: the actual physical address, or a virtual address in virtual memory.

2. Relative pointer: an offset from an absolute start address (the base address).
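The two kinds of pointer can be sketched together in a few lines. Here the absolute pointer is a USAGE POINTER item holding the address of a buffer, and the relative pointer is a plain byte offset applied with reference modification. The buffer contents and all data names are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * A buffer of three fixed-length 10-byte entries.
       01  WS-BUFFER.
           05  FILLER       PIC X(10) VALUE 'ENTRY-0001'.
           05  FILLER       PIC X(10) VALUE 'ENTRY-0002'.
           05  FILLER       PIC X(10) VALUE 'ENTRY-0003'.
      * Absolute pointer: holds the base address of the buffer.
       01  WS-BASE-PTR      USAGE POINTER VALUE NULL.
      * Relative pointer: byte offset from the base address.
       01  WS-OFFSET        PIC S9(9) BINARY.
       LINKAGE SECTION.
      * View of the storage that WS-BASE-PTR addresses.
       01  LS-AREA          PIC X(30).
       PROCEDURE DIVISION.
           SET WS-BASE-PTR TO ADDRESS OF WS-BUFFER
           SET ADDRESS OF LS-AREA TO WS-BASE-PTR
      * Offset 20 selects the third entry (10 bytes each).
           MOVE 20 TO WS-OFFSET
           DISPLAY 'ENTRY AT OFFSET 20: '
                   LS-AREA (WS-OFFSET + 1 : 10)
           GOBACK.
```

The offset stays valid wherever the buffer is located, which is why relative pointers are convenient inside structures that may be copied or reallocated.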

The COBOL programming language supports pointers to variables. Elementary or group (record) data items declared in the LINKAGE SECTION of a program are inherently pointer-based: the only memory allocated within the program is space for the address of the data item (typically a single memory word). In program source code these data items are used just like any WORKING-STORAGE variable, but their contents are implicitly accessed through their LINKAGE pointers. Memory for each pointed-to data item is typically allocated dynamically using external CALL statements or through embedded extended language constructs such as EXEC CICS or EXEC SQL statements. Extended versions of COBOL also provide pointer variables declared with the USAGE IS POINTER clause. The values of such pointer variables are established and modified using SET and SET ADDRESS OF statements. Some extended versions of COBOL also provide PROCEDURE-POINTER variables, which can store the addresses of executable code.
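The pointer-based nature of LINKAGE SECTION items is easiest to see in an ordinary call. In the sketch below, written as a nested program so that it is self-contained, the subprogram reserves no storage of its own for LS-COUNTER and instead works directly on the caller's field. The program and data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLER.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNTER       PIC 9(4) VALUE 41.
       PROCEDURE DIVISION.
      * BY REFERENCE passes the address of WS-COUNTER.
           CALL 'BUMP' USING BY REFERENCE WS-COUNTER
           DISPLAY 'COUNTER AFTER CALL: ' WS-COUNTER
           GOBACK.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. BUMP.
       DATA DIVISION.
       LINKAGE SECTION.
      * No storage here: LS-COUNTER maps the caller's field.
       01  LS-COUNTER       PIC 9(4).
       PROCEDURE DIVISION USING LS-COUNTER.
           ADD 1 TO LS-COUNTER
           GOBACK.
       END PROGRAM BUMP.
       END PROGRAM CALLER.
```

After the call the caller's WS-COUNTER holds 42, because the ADD in the subprogram updated the storage that LS-COUNTER was addressing.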

Defining a COBOL Pointer

On z/OS with 31-bit addressing, a pointer is a 4-byte elementary item that can be compared for equality or used to set the value of other pointer items. It can be defined in two ways:

With the USAGE POINTER clause. The resulting data item is called a pointer data item.
With the USAGE PROCEDURE-POINTER clause. The resulting data item is called a procedure-pointer data item.

A pointer or procedure-pointer data item can be used only in:

A SET statement
A relation condition (equal or not equal only)
The USING phrase of a CALL statement, or the Procedure Division header
The operand for the LENGTH OF and ADDRESS OF special registers.

Pointer data items are defined explicitly with the USAGE IS POINTER clause, and are implicit when using an ADDRESS OF special register or the ADDRESS OF an item.

If a group item is described with the USAGE IS POINTER clause, the elementary items within the group item are pointer data items. The group itself is not a pointer data item, and cannot be used in the syntax where a pointer data item is allowed. A pointer can be given the value NULL with the VALUE NULL clause or with a SET ... TO NULL statement; it is safest to do this explicitly rather than to rely on a default value.

Suppose APTR is a pointer data item and AVAR is a 77-level (or 01-level) data item in the LINKAGE SECTION. ADDRESS OF AVAR is then the ADDRESS OF special register for that item. Because a special register is an actual storage area, the statement SET APTR TO ADDRESS OF AVAR moves the contents of ADDRESS OF AVAR into the pointer data item APTR.
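A minimal sketch of such a definition follows. The names APTR and AVAR come from the text; BPTR, WS-CUSTOMER, and the program name are hypothetical additions so that the example has real storage to point at and can show a pointer comparison.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Pointer data items, explicitly initialized to NULL.
       01  APTR             USAGE POINTER VALUE NULL.
       01  BPTR             USAGE POINTER VALUE NULL.
       01  WS-CUSTOMER      PIC X(20) VALUE 'SAMPLE CUSTOMER'.
       LINKAGE SECTION.
      * Level-77 item: ADDRESS OF AVAR is a special register.
       77  AVAR             PIC X(20).
       PROCEDURE DIVISION.
      * Point AVAR at real storage before using it.
           SET ADDRESS OF AVAR TO ADDRESS OF WS-CUSTOMER
      * Copy the special register into the pointer data item.
           SET APTR TO ADDRESS OF AVAR
           SET BPTR TO ADDRESS OF WS-CUSTOMER
      * Pointers can be compared only for (in)equality.
           IF APTR = BPTR
               DISPLAY 'APTR AND BPTR ADDRESS THE SAME STORAGE'
           END-IF
           IF APTR NOT = NULL
               DISPLAY 'AVAR CONTAINS: ' AVAR
           END-IF
           GOBACK.
```

Both conditions are true here, since APTR and BPTR were both set from the address of WS-CUSTOMER.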

Pointers can be defined at any level (except 88) in the FILE, WORKING-STORAGE, or LINKAGE SECTION of a program. On platforms that use 16-byte pointers, such as ILE COBOL on IBM i, a pointer must be on a 16-byte storage boundary when it is referenced. Pointer alignment is the compiler's process of positioning pointer items within a group item at offsets that are multiples of 16 bytes from the beginning of the record. If a pointer item is not on a 16-byte boundary, a pointer alignment exception is sent to the program.

HEAP Allocation in COBOL

Heap memory is a common pool of free memory used for dynamic memory allocations within an application, that is, for memory the application allocates and deallocates at run time. Despite the name, heap memory is not related to the tree-based data structure that is also called a heap; the term simply denotes a pool of available storage.

There are three primary operations performed on a heap.

a. Allocation. This operation reserves a block of storage of a given size and returns a pointer to the reserved block. If no free element is large enough to satisfy the request, an additional heap segment is allocated via GETMAIN.

b. Deallocation. This operation returns a given block of storage (previously allocated by the application) to the heap. When a block of storage is deallocated, ownership of the block is returned to the heap.

c. Reallocation. This operation resizes a given block of storage to a new size and returns a (possibly updated) pointer to the block of storage.

The heap is non-LIFO storage: data placed in it remains available until it is explicitly freed or the main program terminates. For COBOL programs compiled with RENT (re-entrant), WORKING-STORAGE is in the heap; if the program is not re-entrant (NORENT), WORKING-STORAGE is in the load module.

COBOL programs running under Language Environment can call two common services to allocate and free heap storage.

· The subroutine CEEGTST allocates storage in the user heap and returns the address of that storage.

· The subroutine CEEFRST releases storage back to the user heap.

Storage that is not released during program execution will be released when the main program terminates.
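The two services can be combined into a dynamically sized table, which is the case that motivates this article. The sketch below assumes a z/OS Language Environment runtime. The item count of 250, the 80-byte item size, and all data names are illustrative assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DYNTABLE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Heap id 0 selects the user heap.
       01  WS-HEAP-ID       PIC S9(9) BINARY VALUE 0.
       01  WS-BYTES         PIC S9(9) BINARY.
       01  WS-TABLE-PTR     USAGE POINTER VALUE NULL.
      * Number of items needed at run time (hypothetical).
       01  WS-ITEM-COUNT    PIC S9(9) BINARY VALUE 250.
       01  WS-I             PIC S9(9) BINARY.
      * 12-byte feedback code; first 8 bytes zero means success.
       01  WS-FC.
           05  WS-FC-COND   PIC X(8).
               88  FC-OK    VALUE LOW-VALUES.
           05  FILLER       PIC X(4).
       LINKAGE SECTION.
      * Table layout only; no storage is reserved for it here.
       01  LS-TABLE.
           05  LS-ITEM      PIC X(80)
                            OCCURS 1 TO 100000 TIMES
                            DEPENDING ON WS-ITEM-COUNT.
       PROCEDURE DIVISION.
      * Request exactly the storage the current count needs.
           COMPUTE WS-BYTES = WS-ITEM-COUNT * 80
           CALL 'CEEGTST' USING WS-HEAP-ID WS-BYTES
                                WS-TABLE-PTR WS-FC
           IF NOT FC-OK
               DISPLAY 'CEEGTST FAILED'
               GOBACK
           END-IF
      * Map the table layout onto the allocated storage.
           SET ADDRESS OF LS-TABLE TO WS-TABLE-PTR
           PERFORM VARYING WS-I FROM 1 BY 1
                   UNTIL WS-I > WS-ITEM-COUNT
               MOVE SPACES TO LS-ITEM (WS-I)
           END-PERFORM
           DISPLAY 'ITEMS ALLOCATED: ' WS-ITEM-COUNT
      * Return the storage to the user heap.
           CALL 'CEEFRST' USING WS-TABLE-PTR WS-FC
           IF NOT FC-OK
               DISPLAY 'CEEFRST FAILED'
           END-IF
           GOBACK.
```

The upper bound in the OCCURS clause limits only what the layout can describe; the storage actually obtained is the 20,000 bytes requested for 250 items. Language Environment also provides CEECZST for changing the size of storage obtained this way, which corresponds to the reallocation operation described earlier.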

Heap Vs Stack – The Difference

The stack is the memory set aside as scratch space for a thread of execution. Stack storage is always reserved in LIFO (last in, first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).