SLICK(ID:8196/)

Country: us
Began: 1972

SLICK II

Evolution of

References:

Copeland, George P. and Su, Stanley Y. W. "A high level data sublanguage for a context-addressed segment-sequential memory", pp265-276
Abstract: This paper deals with the problems of data base translation for achieving data sharing through a computer network. A semiautomatic data base translation procedure and its prototype implementation are described. The procedure takes advantage of data conversion capabilities already existing in programming languages and I/O control systems and of man-machine interaction to achieve data base translation tasks. The user of one system is allowed to browse, retrieve, edit, format and restructure the data acquired on-line from another system to produce a new data base suitable for his own application programs. The procedure attempts to bypass the complex task of formally describing and translating several levels of data representation commonly undertaken in the existing data base translation systems.
Extract: Introduction
Introduction
Recent advancements in computer and communication technologies have made man-machine and machine-machine communications possible through interconnected computer networks. Through a computer network, computer resources such as hardware devices, software systems, application programs as well as data files can be shared among computer users throughout the country (Roberts et al. 1970) and in different parts of the world. Also, through a computer net and its sophisticated peripherals, men can work closer than ever before not only to their data through man-machine interaction but also to their fellow workers through remote information inquiries and conferences. Therefore, the keywords of the network concept can be considered to be human and computer resource sharing and man-machine interaction.
The best-known computer network project in the country is the ARPANET, which interconnects large-scale general purpose computers for computer resource sharing. Since its early stage of development, the project research has been devoted to problems such as network topology (Frank and Chou 1972), interface message processor (Heart et al. 1970), hardware connections and network communications (Frank, Kahn and Kleinrock 1972) and modelling and simulation of networks (Kleinrock 1969, 1970). The problems dealing with the actual application of the network are the major forthcoming tasks to be undertaken.
One major application of a computer network is to allow the users of one site to inquire about, acquire and use data available at the other sites which are relevant to some form of decision making. To allow data sharing through an interconnected computer network, one important problem needs to be solved: on-line data base translation.
Data translation is a mechanism by which data stored in different logical and/or physical structures on one machine can be converted into a form logically and physically suitable for the application programs running on another machine. This mechanism is necessary so that a user of one node in the computer net can call up data from another node on-line. The data base translation procedure has to take into consideration the following problems: 1) the different data structures (logical relation of data) used in the source and target data bases, 2) the different storage structures of the data due to hardware discrepancies of network nodes, 3) the provision to allow the inquirer to edit the data acquired, i.e., to delete or modify data obtained from the source and to add his own data to compose the target data base.
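The three problems can be illustrated with a minimal sketch. It is not taken from the paper: the two record layouts, the field names and the `translate_record` helper are hypothetical, and Python's standard `struct` module stands in for the machine-dependent storage formats of two network nodes.

```python
import struct

# Hypothetical source layout: big-endian, fields in the order (id, balance, code).
SOURCE_LAYOUT = struct.Struct(">i d 4s")
# Hypothetical target layout: little-endian, fields in the order (code, id, region).
TARGET_LAYOUT = struct.Struct("<4s i 2s")


def translate_record(raw: bytes, region: bytes) -> bytes:
    """Convert one source record into the target representation."""
    # Problem 2: decode the source machine's storage structure.
    rec_id, _balance, code = SOURCE_LAYOUT.unpack(raw)
    # Problem 3: the inquirer drops `balance` and adds his own `region` field.
    # Problem 1: the target data structure orders the fields differently.
    return TARGET_LAYOUT.pack(code, rec_id, region)


source = SOURCE_LAYOUT.pack(7, 12.5, b"AB01")
target = translate_record(source, b"FL")
print(TARGET_LAYOUT.unpack(target))
```

A hand-written routine of this kind works only for one pair of layouts, which is the limitation of installation-specific translation programs that motivates a more general procedure.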
It has been recognized by the information data processing profession that there can be at least three levels of structural representation of data, namely information structure (user's view of data), data (logical) structure and storage (physical) structure. The information structure at the user's level is quite different from the data structure designed for efficient access, which is the data representation at the access path level. The data structure is then implemented and mapped into a machine dependent storage structure. These three levels of data representation are found to be essential in the design of a file (Wang and Lum 1971). At the access path level of design, pointers, cross reference indexes and inverted records are often introduced and incorporated into the data structures in order to speed up data access. Thus, in general, data structures are quite complex and there are a great number of them. Each structure, when implemented on different machines, will look very different from one machine to another in its storage representation. In order to allow a user at one node to call up data available at another node and use the data as input to his own program, the complex data structures and storage structures of both the source and the target data base have to be known to the translation procedure so that proper conversion can be carried out.
Also data available at another node may not be entirely useful to the user making requests. Some of the data may have to be deleted and new data added in order to form the user's data base. Thus, the translation procedure should not only translate from one structure to the other, but should also provide editing facilities to the user.
Extract: Some Existing Works
II. Some Existing Works
The problem of data base translation is not new. Every time a new generation of computers is manufactured, data from the old data bases of many old systems often have to be translated or regenerated. However, in the past, data translation has been done by programs written to carry out specific translation tasks to meet the needs of individual installations. There are a few recent attempts (Smith 1972, Koch et al. 1972, Fry et al. 1972 and Altman et al. 1972) to design a language (data definition language) for describing the data and storage structure of a system. The language is viewed as a first step toward formal descriptions of data bases and thus provides a means of establishing the mapping relationships between two data bases. There are also attempts to develop a generalized data translator. The work by Smith (1971, 1972 and 1973) is significant. A translation processor has been designed and is being implemented to accept as input three descriptions and the set of data to be translated. The first description specifies the logical and physical structure of the data to be translated. The second description characterizes the data structure of the target data base. The third description identifies elements and values to be translated. The output is a special purpose translator for translating all sets of data satisfying the source description. Another large project toward building a generalized translator is being carried out by Merten and Fry's group at the University of Michigan.
Sibley and Merten (1973) proposed a model of data access and translation which is a fully automatic procedure for translating one data base to the other. In the proposed procedure, many levels of data access and mapping are suggested. The procedure concerns three levels of structural descriptions: logical structure of data, physical storage in core and physical structure in secondary storage. Data Description tables are used on each level to describe the structural relationship of data items. Data access can take place on any of the three levels.
The work reported by the Michigan group (1973) is an attempt at the development of a prototype system as the first step towards developing a generalized data translator. It has been restricted to translating files generated by one data base system (NIPS, which operates on an IBM/360 computer) into input files for a new data base system (WWDMS, which operates on a Honeywell HIS 6050 computer). The experiment has been quite successful. However, the number of data structures that the prototype system currently handles is very limited. The entire project is a very ambitious attempt and is rather complex. How efficient the final system will be remains an open question.
Both automatic procedures (the Pennsylvania and the Michigan translators) are data description-driven translators. They do not, in our opinion, take advantage of the data description and conversion capabilities already existing in the I/O control system routines and in all compilers for the programs using or generating the data bases. Present versions of the generalized translators operate, in effect, "off-line". They cannot produce translated data in real time.
Also, the procedures do not allow editing of the data in the source data base, merging of data bases or adding new data to form the target data base. The semi-automatic procedure we are about to describe is believed to be simple to implement and efficient enough that on-line processing requirements can be met.
Another project, described by Marcus (1973) and being carried out at MIT, deals with the translation problem for a network of heterogeneous interactive information retrieval systems. The system is designed for data sharing of bibliographic data bases such as INTRIX and MEDLINE. The approach is to translate the command language of the source system into a common intermediate command language before mapping it into the command language of the target system for retrieval. The translation process is done by the central control system of a network with the "STAR" configuration. Work in progress is on the design of the common command language and the translation of indexing vocabulary. Data base translation is not performed at either the logical or the physical structure level. The system is designed to operate on information systems in which command languages and index languages are well defined and somewhat related. It is different from the general purpose type of data base translation system that this paper deals with.
Extract: A Semi-automatic On-line Translation System
III. A Semi-automatic On-line Translation System
We believe that some of the problems and complexities described above can be alleviated and simplified if the translation process involves the inquirer himself in making some decisions about the structure of his data base and in specifying the editing operations required. Our view is that data base translation can be done most cost-effectively by a semi-automatic procedure in which the capabilities of both man and machine are best utilized. The design of a semi-automatic data base translation system is described below. A prototype implementation of this system is presented in the next section.
Figure 1 shows the major software components and data files required in two computer systems performing data sharing. System A is making a data request to system B. It is assumed that terminals such as CRT displays or teletypes are available for the user to make the data inquiry and translation request. A network communication program at each node will handle the hand-shaking operation, code conversion, line protocols, etc., to establish the communication link. It will also handle protection measures such as checking user identification, authentication and authorization for requesting data. An authorized user can then use a retrieval and editing language to inquire about the data files available in the other system (system B), which will provide a brief description of the contents of all data files to be shared.
After browsing through the contents, the user can make a request to system B to access a specific data file. In system B, a common caretaker program is written to perform the following functions:
(1) communicates with the inquirer and transfers to him a description of the contents of a record (i.e., the data field names and their descriptions),
(2) receives requests for specific data fields made by the user, calls the proper access routine to retrieve a record from secondary storage, and extracts the values of the fields specified, and
(3) transmits the field names and values in a standardized format to system A for display.
The access routine associated with each data file is a simple program to write. It can be written in the same language as the program which generates the original data file being requested. It can make use of the same variables, arrays and other data declarations of the generation program and, in fact, keep its skeleton and change the write statements to read statements to move data into the variables, arrays or structure names. Each access routine handles only the idiosyncrasies of the data set for which it is written.
The access routine for a file can be written by someone who knows the format of data on the file. When provided by the owner of the file, it allows data in the file to be accessed without specifically using a data description language to describe the structure of the data file to the requestor. It also allows the owner to protect the data items which are not to be shared by simply programming the access routine to by-pass these data.
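The division of labour between the caretaker and the access routines can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: the `Caretaker` class, the `payroll_access_routine` function, the comma-separated file format and all field names are hypothetical, and an in-memory list stands in for secondary storage on system B.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical owner-side file: one comma-separated record per line.
PAYROLL_LINES = ["1001,Smith,52000", "1002,Jones,61000"]


def payroll_access_routine(lines: List[str]) -> Iterator[Dict[str, str]]:
    """Access routine written by the file's owner, who knows the format."""
    for line in lines:
        emp_id, name, _salary = line.split(",")
        # The salary field is bypassed, so it is never shared.
        yield {"EMP_ID": emp_id, "NAME": name}


class Caretaker:
    """Simplified stand-in for the common caretaker program on system B."""

    def __init__(self) -> None:
        self._descriptions: Dict[str, Dict[str, str]] = {}
        self._cursors: Dict[str, Iterator[Dict[str, str]]] = {}

    def register(self, file_name: str, description: Dict[str, str],
                 records: Iterator[Dict[str, str]]) -> None:
        self._descriptions[file_name] = description
        self._cursors[file_name] = records

    def describe(self, file_name: str) -> Dict[str, str]:
        # Function (1): field names and their descriptions.
        return dict(self._descriptions[file_name])

    def fetch(self, file_name: str, fields: List[str]) -> List[Tuple[str, str]]:
        # Function (2): read the next record through the access routine.
        record = next(self._cursors[file_name])
        # Function (3): return (name, value) pairs in one standard format.
        return [(name, record[name]) for name in fields if name in record]


caretaker = Caretaker()
caretaker.register(
    "PAYROLL",
    {"EMP_ID": "employee number", "NAME": "surname"},
    payroll_access_routine(PAYROLL_LINES),
)
first = caretaker.fetch("PAYROLL", ["NAME", "SALARY"])
print(first)
```

Because the access routine never yields the salary field, a request for it returns nothing, which mirrors the protection-by-omission described above.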
The data extracted from system B is stored in a temporary data file in system A. The user in system A can then use a retrieval and editing language to display and edit the data. The language allows the user to specify data deletion, data type conversion, data reformatting and renaming and other editing commands related to the rearrangement and modification of data acquired from system B. It also allows the user to specify editing commands which will cause updating data (user's own data) to be incorporated into the data acquired. By using the language the user is free to compose his own data structure suitable for his application programs.
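The kinds of editing commands listed above can be sketched as a small interpreter. The extract does not give the syntax of the actual retrieval and editing language, so the command names (`DELETE`, `CONVERT`, `RENAME`, `ADD`), the tuple encoding and the `apply_commands` function are assumptions made purely for illustration.

```python
from typing import Any, Dict, List, Tuple

Command = Tuple[Any, ...]

# Hypothetical type names accepted by the CONVERT command.
CONVERTERS = {"INT": int, "FLOAT": float, "STR": str}


def apply_commands(record: Dict[str, Any], commands: List[Command]) -> Dict[str, Any]:
    """Apply editing commands to one record held in the temporary file."""
    result = dict(record)  # leave the acquired record untouched
    for op, *args in commands:
        if op == "DELETE":        # data deletion
            result.pop(args[0], None)
        elif op == "CONVERT":     # data type conversion
            field, type_name = args
            result[field] = CONVERTERS[type_name](result[field])
        elif op == "RENAME":      # renaming a field
            old, new = args
            result[new] = result.pop(old)
        elif op == "ADD":         # incorporating the user's own data
            field, value = args
            result[field] = value
        else:
            raise ValueError(f"unknown command: {op}")
    return result


acquired = {"EMP_ID": "1001", "NAME": "Smith", "DEPT": "R&D"}
commands = [
    ("DELETE", "DEPT"),
    ("CONVERT", "EMP_ID", "INT"),
    ("RENAME", "NAME", "SURNAME"),
    ("ADD", "SITE", "A"),
]
edited = apply_commands(acquired, commands)
print(edited)
```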
The user's browsing, editing and retrieval commands are first scanned and edited by a Command Line Interpreter which stores the edited commands in a temporary space. The edited commands are then interpreted by the Retrieval and Editing Handler (see Figure 1) which either passes the commands to the source system B or carries out the editing operations to interact with the user and compose the first record of the target data base. The Handler also stores those commands useful for subsequent automatic translation in a separate file. Browsing commands, for example, will not be useful for translating subsequent records.
The commands for automatic translation will then be interpreted by the automatic Translator which calls up the routines in the Retrieval and Editing Handler to process the commands. The execution of these commands causes the same retrieval requests to be made automatically to system B which extracts data from the next record in the source file and transmits the retrieved data to the temporary storage. The same editing operations are then performed to compose the next record for the target data base. Thus, once the first record is produced, the translation procedure becomes fully automatic.
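The record-then-replay idea can be sketched as below. The `Session` class, the `keep` and `rename` helpers and the sample records are hypothetical; the sketch only assumes what the text states, namely that commands useful for later records are saved during the interactive first pass and then re-executed automatically on every following record.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
EditStep = Callable[[Record], Record]


def keep(*fields: str) -> EditStep:
    """Retrieval request: keep only the named fields."""
    return lambda rec: {f: rec[f] for f in fields}


def rename(old: str, new: str) -> EditStep:
    """Editing command: rename one field."""
    def step(rec: Record) -> Record:
        out = dict(rec)
        out[new] = out.pop(old)
        return out
    return step


class Session:
    """Interactive first record, automatic replay for the rest."""

    def __init__(self, source: Iterable[Record]) -> None:
        self._source = iter(source)
        self._saved: List[EditStep] = []  # stands in for the separate command file
        self.current: Record = dict(next(self._source))

    def browse(self) -> List[str]:
        # Browsing is not saved: it is of no use for later records.
        return sorted(self.current)

    def edit(self, step: EditStep) -> None:
        self.current = step(self.current)
        self._saved.append(step)

    def translate_rest(self) -> Iterator[Record]:
        for raw in self._source:
            record = dict(raw)
            for step in self._saved:
                record = step(record)
            yield record


source_file = [
    {"EMP_ID": "1001", "NAME": "Smith"},
    {"EMP_ID": "1002", "NAME": "Jones"},
    {"EMP_ID": "1003", "NAME": "Lee"},
]
session = Session(source_file)
session.browse()
session.edit(keep("NAME"))
session.edit(rename("NAME", "SURNAME"))
first = session.current
rest = list(session.translate_rest())
print(first, rest)
```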
It should be noted that the user has the total freedom of designing and composing his own data structure. He can add an array of pointers, set up an inverted record, implement trees, lists, graphs or other data structures. The user should know the implementation of the structure he chooses at the program language level, i.e., he should provide the data for arrays, variables, strings or other data types which he will be using in his application program. He does not have to deal with the physical structure of his data since the compiler of his application program will handle the physical structure for him. The application program will read the target data file according to the format, data types and order of the data fields
in [ACM SIGFIDET] Proceedings of the ACM SIGFIDET workshop on Data description, access and control, 1974