Thursday 13 June 2013

Oracle SQL Loader Utility


Abstract
The document is aimed at describing the procedure to be followed while working with SQL Loader Utility. Using the tool, we can load the external data coming in various formats like excel sheet, flat text file, comma Separated Variable (.csv) files and so on.
Case History
Database Administrators may come across some situation, where they need to load the external data coming from other sources into oracle tables. Before loading the data, they can restrict/limit/skip the loading with the usage of Control file (.ctl)

Analysis
External data will be of two types like Fixed Data and Variable Data and depending upon the type of the data, we need to generate the creation of the control file which in turn will convert the utility tool functioned as per the requirement and ensure that only the qualified or eligible data is loaded into the the Tables.

Introduction: As discussed above, the data source will be in the form of various types like .csv and  flat text files and hence the loading of the data into the tables has been classified into types namely Fixed Width and Varied data.

Terminology:
1)    Control File: Control file describes the actions to be done by the sql loader and we can use any text editor to writing a control file. Usually this will be followed by .ctl file extension.
2)    In File: In file or input file is the name of the data source file like .csv or .txt which will be given as a input for the control file to look for the data.
3)    Bad file: Bad file is the file which records the rows which got failed during the loading operation and the cause for the failure. This is a Optional file to provide while executing the sql loader at the command line
4)    Discard File: Discard file is again an optional file, which will store the rows that are not qualified during the load operation, when a condition like where orwhen has been specified as a part of the control file.
5)    Loading Options:  INSERT,  APPEND, REPLACE and TRUNCATE

Insert           :  Loads rows only if the target table is empty.
Append         :  Load rows if the target table is empty or not.
Replace         :  First deletes all the rows in the table and then load rows
Truncate        :  First truncates the table and then load rows.

Varied Data Format ( .csv format or .txt files)
Example:

“10”,”James”,”1000”,”Finance”
“20”,”Stuart”,”2000”,”Design”
“30”,”Mclean”,”3000”,”Media”
“40”,”Sandra”,”8000”,”Architect”
“50”,”Natalie”,”9000”,”Solutions”

Now we have to generate a control file to describe the loading actions for sqlloader utility to load the data into the table EMP as illustrated below:

load data
infile ‘C:\mine\emp_data.csv’
insert into table emp
fields terminated by “,” optionally enclosed by  `” ` TRAILING NULLCOLS
( table_column1,
 Table_column2,
 Table_column3,
 Table_column4
)

Note: TRAILING NULLCOLS means if the last column is null then treat this as null value, otherwise,  SQL LOADER will treat the record as bad if the last column is null.

After you have written the control file save it and call SQL Loader utility by typing the following command

$ sqlldr userid= scott/tiger control=emp.ctl  log=emp_load.log
                
After you have executed the above command SQL Loader will shows you the output describing how many rows it has loaded.

The LOG option of sql loader specifies where the log file of this sql loader session should be created.  The log file contains all actions which SQL loader has performed i.e. how many rows were loaded, how many were rejected and how much time is taken to load the rows and etc. You have to view this file for any errors encountered while running SQL Loader.

Fixed Data Format ( .dat file)
Example:
10 James   1000   Finance
20 Stuart  2000   Design
30 Mclean  3000  Media
10 Sandra  8000  Architect
20 Natalie  9000  Solutions
SOLUTION:
Steps :-
First Open the file in a text editor and count the length of fields, for above example, employee number is from 1st position to 2nd position, employee name is from 4thposition to 10th position, Salary column is from 11th position to 15th position and finally Department column from 16th postion to  21st Position.

load data
infile ‘C:\mine\emp_data.dat’
insert into table emp
(
   table_column1       position(01:02)         integer external,
   table_column2       position(03:10)         char,
   table_column3       position(11:15)         integer external
   table_column4       position(16:21)         char
);
Note:-  The datatypes (INTEGER EXTERNAL, CHAR, DECIMAL EXTERNAL) identify the datatype of data fields in the file, not of corresponding columns in the table.

After writing the control file save it and initiate the SQL Loader utility by typing the following command
$ sqlldr userid= scott/tiger control=emp.ctl  log=emp_load.log

We can simultaneously load data into multiple tables within the same session and using WHEN condition to load only specified rows which meets a particular condition (only equal to “=” and not equal to “<>” conditions are allowed).

Considering the above mentioned data, our requirement is to load the rows having department number equals =10 into one table and  department number not equal to <> 10  to go into another table.
load data
infile ‘C:\mine\emp_data.dat’
append into table scott.emp1
when (dept_no=’ 10’)
(
   table_column1       position(01:02)         integer external,
   table_column2       position(03:10)         char,
   table_column3       position(11:15)         integer external
   table_column4       position(16:21)         char
)
Into table scott.emp2
when (dept_no<>’ 10 ’)
(
   table_column1       position(01:02)         integer external,
   table_column2       position(03:10)         char,
   table_column3       position(11:15)         integer external
   table_column4       position(16:21)         char
);

Note:-          SQL Loader help can be obtained by  typing the command
                                             $sqlldr help=Y
userid          ORACLE username/password
control         Control file name
log              Log file name
bad             Bad file name
data            Data file name
discard         Discard file name
discardmax    Number of discards to allow
skip              Number of logical records to skip
load              Number of logical records to load
errors           Number of errors to allow
bindsize        Size of conventional path bind array in bytes
silent            Suppress messages during run direct, use direct path
parfile           parameter file: name of file that contains parameter specifications
parallel          do parallel load
file               File to allocate extents from
readsize         Size of Read buffer

No comments: