Introduction to teradata utilities : Multiload
DEFINITION
Teradata MultiLoad, also called "MultiLoad," "MLoad" or "ML," is a command-driven parallel load utility for high-volume batch maintenance on multiple tables and views of the Teradata Database. It is specially designed for high-speed batch creation and maintenance of large databases.
OVERVIEW
Teradata MultiLoad executes a series of MultiLoad commands and Teradata SQL statements written in a batch mode job script or interactively entered. The MultiLoad commands provide the session control and data handling specifications for the data transfer operations, and the Teradata SQL statements perform the actual maintenance functions on the Teradata RDBMS tables and views.
A single MultiLoad job performs a number of different import and delete tasks on database tables and views:
• Each MultiLoad import task can do multiple data insert, update, and delete functions on up to five different tables or views;
• Each MultiLoad import task can have up to 100 DML steps;
• Each MultiLoad delete task can remove large numbers of rows from a single table.
Teradata MultiLoad is characterized by its serialization feature. In certain scenarios, multiple changes are possible to occur to a single row in the same MultiLoad job. By using its serialization feature, Teradata MultiLoad automatically guarantees that the ordering of such operations is maintained correctly.
If a MultiLoad job has only one import task, then Teradata FastLoad has to be chosen for it, since FastLoad will finish it much faster. However, the FastLoad utility works only on empty tables.
A FastLoad job can not be used to:
- • Insert additional data rows into an existing table;
- • Update individual rows of existing tables;
- • Delete individual rows from existing tables;
- • Load data into multiple tables.
DATA SOURCE
Teradata MultiLoad imports data from:
• Disk or tape files
• Input files on a network-attached workstation;
• Special input module (INMOD) routines that select, validate, and preprocess input data;
• Access modules;
• Any device providing properly formatted source data;
The table or view in the database receiving the data can be any existing table or view for which the user has access privileges for the required maintenance tasks.
MultiLoad command IMPORT has an optional parameter FORMAT, which is used to specify the format of the external data source.
The format may be:
• FASTLOAD;
• BINARY;
• TEXT;
• UNFORMAT;
• VARTEXT.
OPERATING MODES
Teradata MultiLoad runs in the following operating modes:
• Interactive mode;
• Batch mode.
In interactive mode, Teradata MultiLoad uses terminal screen and keyboard as the standard output and input streams. In batch mode, MultiLoad uses > and < to redirect the standard output / input streams. To invoke Teradata MultiLoad in interactive mode, enter the following command (for Microsoft Windows): c:\ncr\bin\MultiLoad
To invoke MultiLoad in batch mode, use the command below (for Microsoft Windows): c:\ncr\bin\MultiLoad [options] < infile > outfile
Here, the infile is a Teradata MultiLoad job script file which includes all the required MultiLoad commands and Teradata SQL statements, whereas the outfile is the outfile is the output stream file.
Character Sets
Teradata MultiLoad supports Latin, Chinese, Japanese and Korean character sets, along with ASCII, EBCDIC, UTF-8 and UTF-16. It also supports user-defined session character sets.
Task Status Reporting
Teradata MultiLoad has three reporting methods to monitor the status of jobs that are still in progress and those that have just completed:
• MultiLoad utility messages provide job status reports, including:
o Options messages that list the settings of MultiLoad task parameters;
o Logoff/disconnect messages that list key run statistics.
• QrySessn (i.e., Query Session Utility) provides real-time, phase-oriented progress reports at selected intervals during the MultiLoad job;
• Statistics messages present information at the end of the MultiLoad job.
Data Conversion Capabilities
Teradata MultiLoad can redefine the data type specification of input data to match it with that of the destination column in the MultiLoad table on the target database. MultiLoad data conversion types are:
• Numeric-to-numeric (for example integer-to-decimal);
• Character-to-numeric;
• Character-to-date;
• Date-to-character.
Use the DATADESC specification of the MultiLoad FIELD command to convert input data to a different data type before inserting it into the MultiLoad table on the Teradata Database.
Note: Redundant conversions, like integer-to-integer, are legal and necessary to support the zoned decimal format.
MULTILOAD COMMANDS
MultiLoad Command Summary
The MultiLoad commands perform two types of activities:
• Support activities — Support commands perform support activities to establish the MultiLoad sessions with the Teradata Database and define the operational support environment for Teradata MultiLoad. Established support environment options remain in effect until another support command changes them. Support commands do not specify a MultiLoad task;
• Task activities — Task commands perform task activities to specify the actual processing of each MultiLoad task.
1. Teradata MultiLoad Commands for Support Activities:
COMMAND NAME | FUNCTION |
ACCEPT | Allows the value of one or more utility variables to be accepted from either a file or an environment variable |
DATEFORM | Specifies the form of the DATE data type specifications for the MultiLoad job |
DISPLAY | Writes messages to the specified destination |
ELSE (see IF, ELSE, and ENDIF) | Introduces commands and statements that execute when a preceding IF condition is false |
ENDIF (see IF, ELSE, and ENDIF) | Delimits the group of MultiLoad commands that were subject to previous IF or ELSE conditions |
IF (see IF, ELSE, and ENDIF) | Introduces a conditional expression whose value initiates execution of subsequent commands |
LOGOFF | Disconnects all active sessions and terminates MultiLoad on the client system. |
LOGON | Specifies the LOGON command string to be used in connecting all sessions established by MultiLoad. |
LOGTABLE | Identifies the table to be used to journal checkpoint information required for safe, automatic restart of MultiLoad when the client or Teradata Database system fails. |
ROUTE MESSAGES | Identifies the destination of output produced by MultiLoad support environment. |
RUN FILE | Invokes the specified external file as the current source of utility commands and Teradata SQL statements |
SET | Assigns a data type and a value to a utility variable |
SYSTEM | Suspends operation of MultiLoad and executes any valid local operating system command |
2. Teradata MultiLoad Commands for Task Activities:
COMMAND NAME | FUNCTION |
BEGIN MLOAD and BEGIN DELETE MLOAD | Specifies:
|
DML LABEL | Defines a label and error treatment options for a following group of DML statements. |
END MLOAD | Indicates completion of MultiLoad command entries and initiates execution of the task. |
FIELD | Used with the LAYOUT command to define a field of the data source record that is sent to the Teradata Database. |
FILLER | Used with the LAYOUT command to define a field in the data source record that is not sent to the Teradata Database. |
IMPORT | Identifies the data source, the layout used to describe the data record, and optional conditions for performing DML operations. |
LAYOUT | Introduces the record format of the data source to be used in the MultiLoad task. This command is followed by a sequence or combination of FIELD and TABLE commands. |
LOGDATA | Supplies parameters to the LOGMECH command beyond those needed by the logon mechanism, such as userid and password, to successfully authenticate the user. |
LOGMECH | Identifies the appropriate logon mechanism by name. |
PAUSE ACQUISITION | Temporarily suspends the acquisition phase of a MultiLoad job. |
TABLE | Used with the LAYOUT command to identify a table whose column names and data descriptions are used as the field names and data descriptions of the data source records. |
Teradata SQL Statements in MultiLoad
MultiLoad supports a subset of Teradata SQL statements in order to define and manipulate the data stored in the Teradata Database. Consequently, other utilities do not have to be invoked to perform routine database maintenance functions before executing MultiLoad utility tasks. The Teradata SQL statements supported by Teradata MultiLoad are listed below:
COMMAND NAME | FUNCTION |
ALTER TABLE | Changes the column configuration or options of an existing table |
CHECKPOINT | Adds a checkpoint entry to a journal table |
COLLECT STATISTICS | Collects statistical data for one or more columns of a table |
COMMENT | Stores or retrieves comment string associated with a database object |
CREATE DATABASE CREATE MACRO CREATE TABLE CREATE VIEW | Creates a new database, macro, table, or view |
DATABASE | Specifies a new default database for the current session |
DELETE | Removes rows from a table |
DELETE DATABASE | Removes all tables, views, and macros from a database |
DROP DATABASE | Drops the definition for an empty database from the Data Dictionary |
DROP TABLE | Removes a table from the database |
GIVE | Transfers ownership of a database to another user |
GRANT | Grants access privileges to a database object |
INSERT | Insert new rows to a table |
MODIFY DATABASE | Changes the options of an existing database |
RELEASE MLOAD | Removes the access locks from the target tables in the Teradata Database This inhibits any attempts to restart the MultiLoad job when a MultiLoad task has been suspended or aborted. Note: A RELEASE MLOAD statement must be entered from BTEQ. It cannot be entered from MultiLoad. |
RENAME | Changes the name of an existing table, view, or macro |
REPLACE MACRO REPLACE VIEW | Redefines an existing macro or view |
REVOKE | Rescinds access privileges to a database object |
SET QUERY_BAND | Identifies type and source of query to determine prioritization of queries |
SET SESSION COLLATION | Overrides the collation specification for the current session |
SET SESSION OVERRIDE REPLICATION ON/OFF | Turns replication services on or off |
UPDATE | Changes the column values of an existing row in a table |