Relational Database Design – Normalization

It’s common for people new to relational databases to approach table design as they would a flat, Excel-like spreadsheet: treating each table as if it were isolated, with all the data needed for a particular piece of functionality crammed into a single table.

This is not how relational databases are meant to be used. Tables in a relational database are linked together, forming a web of sorts, where all the data needed for a single query can be spread over a great many tables, and a change in one table can have an effect on every other table in that database.

It is therefore extremely important that each and every table in the database be properly designed, not only in regard to the data each table is meant to store, but also in respect to every other table in the “web”. If you ever find yourself needing to expand your database, or refactor it to fix or improve your existing applications, a proper design can save you an enormous amount of time and effort.

Key concepts
Let me start by naming a few of the most important concepts in relational database design. A basic understanding of these will be required to understand the rest of the article.

  • Primary Key (PK)
    A column with a unique value for each row. Although not all database management systems (DBMS) require you to put a PK into each table, from a design perspective a PK is a requirement. No table should be without one.
  • Foreign Key (FK)
    These define relationships between tables. When you want a row in one table to be linked to a row in another table, you place a FK column in the child table and use the value of the parent row’s PK as the value of the FK field.
  • Composite Key
    This is a key that is made up of more than one column. This is typically used when you want to prevent a table from using the same combination of values twice. For example, in a table that lists item prizes for shops, you would only want each shop to have a single price for each item. So, you create a FK for the shop and a FK for the item, and then you create a composite PK out of those two columns. This would cause the DBMS to forcefully restrict entries that would create rows where the combined values of these fields are duplicated. – This type of key is commonly used in N:M relationships. (Explained below… with visual aids.)
  • One-To-One (1:1) relationship
    A relationship between two tables, where a single row in one table is linked to a single row in another table.

    +------------+     +----------------+
    | person     |     | person_contact |
    +------------+     +----------------+
    | person_id  |1---1| person_id      |
    | first_name |     | email          |
    | last_name  |     | phone          |
    +------------+     +----------------+
    

    This type of relationship is practically non-existent in normalized relational designs. It exists mostly to get around limitations in databases like Access, where the number of columns was limited, creating the need to split tables up. 1:1 relationships are also sometimes used to optimize the performance of the database.

  • One-To-Many (1:N) relationship
    A relationship between two tables, where multiple rows in a child table can be linked to a single row in a parent table. For example:

    +------------+     +------------+
    | person     |     | country    |
    +------------+     +------------+
    | person_id  |  |-1| country_id |
    | name       |  |  | name       |
    | country_id |*-|  +------------+
    +------------+
    

    This is in fact the only “real” type of relationship in a relational database. (See the next point for the reasoning behind that assertion.)

  • Many-To-Many (N:M) relationship
    A relationship between two tables, where multiple rows in one table can be linked to multiple rows in another table. This type is “artificial” in a way, because this kind of relationship can not be created directly between tables. To accomplish it you need a third table: an intermediary table that contains FKs to both parents, linked via a set of 1:N relationships.

    +-----------+     +--------------+     +--------------+
    | shop      |     | prices       |     | product      |
    +-----------+     +--------------+     +--------------+
    | shop_id   |1-|  | product_id   |*---1| product_id   |
    | shop_name |  |-*| shop_id      |     | product_name |
    +-----------+     | price        |     +--------------+
                  +--------------+
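The N:M diagram above can be sketched concretely with SQLite (a minimal illustration; the sqlite3 module and the sample data are my own, with table and column names taken from the diagram). The composite PK on the intermediary table is what rejects a second price for the same shop/product pair:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript("""
CREATE TABLE shop (
    shop_id   INTEGER PRIMARY KEY,
    shop_name TEXT
);
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT
);
-- Intermediary table for the N:M relationship: two FKs,
-- combined into a composite PK so each (shop, product)
-- pair can appear only once.
CREATE TABLE prices (
    shop_id    INTEGER REFERENCES shop (shop_id),
    product_id INTEGER REFERENCES product (product_id),
    price      REAL,
    PRIMARY KEY (shop_id, product_id)
);
""")

conn.execute("INSERT INTO shop VALUES (1, 'Bob''s')")
conn.execute("INSERT INTO product VALUES (1, 'Beer')")
conn.execute("INSERT INTO prices VALUES (1, 1, 9.50)")

# A second price for the same shop/product pair violates the composite PK.
try:
    conn.execute("INSERT INTO prices VALUES (1, 1, 8.00)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The same schema would work in any DBMS that supports composite primary keys; only the PRAGMA line is SQLite-specific.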
    

Normalization
To help us properly design our tables we have a set of guidelines which, if followed properly, will help reduce the redundancy and chance of data corruption. We call this “Normalization”.

There are several steps involved in normalizing a database. The steps are referred to as “Normal Forms” (abbreviated: NF). There are at least seven NFs, ranging from 1NF to 6NF. Each NF requires that the NF before it has also been satisfied. The spot between 3NF and 4NF is reserved for the BCNF (Boyce-Codd normal form), which was developed later as a slightly stronger version of 3NF, to address certain shortcomings.

Tables that have reached the 3NF are generally considered “normalized”. Specifically aiming for a higher level is unusual, but a table that is designed to be in 3NF is very likely also in the 5NF.

The first three Normal Forms aren’t complex at all. They can take some getting used to, but after working with them for a while you may well find yourself developing 3NF compliant tables without even thinking about it.

The First Normal Form (1NF)
The first normal form is both the simplest and the most important of the three steps. It simply requires that tables must not contain repeating groups of data. This means that if you need to store multiple values of the same kind of data for a single entry (row), you can neither serialize them into a single field nor create multiple identical columns.

Consider the following example. It’s a simple list of persons, where each person is listed with his/her name and phone numbers.

+----+------+--------------------+
| id | name | phone              |
+----+------+--------------------+
|  1 | Joe  | 588-5522,789-85522 | 
|  2 | Anna | 589-4567,987-12354 |
+----+------+--------------------+

Note that both entries store multiple phone numbers in a single field, separated by a comma. There are two major problems with this approach:

  • Your database management system (DBMS) regards each field as a single value, so it can not differentiate between the individual phone numbers. From the DBMS's perspective, the phone field is just a normal string containing a single value. – This complicates the use of that data, especially when you need to search for or extract a single value, or filter a data-set based on a single value in the field.
  • It also means you need to manually manage the data; to write the code that separates the values when they are retrieved and constructs the string when they are inserted. This makes it FAR more likely that your data will become corrupt or incompatible between applications. The DBMS has NO control over how the data is stored. It simply considers it a single string, and stores it as such. The internal structure and integrity of the data is completely up to you.

To remedy the situation, you would need to separate the data into individual fields. Your first instinct might be to simply create multiple “phone” columns, like so:

+----+------+----------+-----------+
| id | name | phone1   | phone2    |
+----+------+----------+-----------+
|  1 | Joe  | 588-5522 | 789-85522 |
|  2 | Anna | 589-4567 | 987-12354 |
+----+------+----------+-----------+

But this is NOT an acceptable solution. It does solve both of the problems I listed above, but it creates a new problem.

Namely that now we have restricted each person to two phone numbers and ONLY two phone numbers. What if a person needs to store three numbers? Using this table, the only way would be to add a third “phone” column, which would also add a third number to ALL other persons in the table. (What if we need to store a hundred phone numbers? Or a thousand?)

What we want is to allow each person to have as many phone numbers as he or she needs. To allow for that, we need to extract the phone numbers from the table altogether and put them into a new table dedicated to listing phone numbers. The design for that may look something like this:

+-----------+     +--------------+
| person    |     | phone_number |
+-----------+     +--------------+
| person_id |1-|  | phone_id     |
| name      |  |-*| person_id    |
+-----------+     | number       |
                  +--------------+

Here each row in the phone_number table contains a column with the ID of a person. This column identifies the person whose number it is. A relationship like this is referred to as a One-To-Many (1:N) relationship, because each row in the parent table (person) can be linked to multiple rows in the child table (phone_number), but not the other way around.

The person_id column in the phone_number table is what we call a Foreign Key. It is an indication that the value in the column is meant to reference the value of another table. In many cases the DBMS will enforce this link and reject phone_number entries that do not provide a person_id that exists in the person table.
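The 1NF design above can be sketched with SQLite (a minimal illustration using Python's sqlite3 module; the sample data is made up, the table and column names follow the diagram):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK link

conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT
);
CREATE TABLE phone_number (
    phone_id  INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person (person_id),
    number    TEXT
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Joe')")
# Joe can have any number of phone numbers -- one row each.
conn.execute("INSERT INTO phone_number VALUES (1, 1, '588-5522')")
conn.execute("INSERT INTO phone_number VALUES (2, 1, '789-85522')")

# Each number is now an individual value the DBMS can search and filter on.
rows = conn.execute(
    "SELECT number FROM phone_number WHERE person_id = 1 ORDER BY phone_id"
).fetchall()
print([r[0] for r in rows])  # ['588-5522', '789-85522']
```

With foreign keys enabled, a phone_number row pointing at a non-existent person_id would be rejected, which is exactly the enforcement described above.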

The Second Normal Form (2NF)
This requires that no field be only partially dependent on any candidate key of the table. A candidate key is not just the PK, but any combination of fields that uniquely identifies a row.

Consider this design:

+---------------+
| prices        |
+---------------+
| price_id (PK) |
| product       |
| shop          |
| unit_price    |
| qty           |
| shop_address  |
| unit_weight   |
+---------------+

The price_id column is the PK there, but because the combined values of the product and shop columns could also act as a composite PK, together they are considered a “candidate key”.

Let's look at a few example entries in that table.

+----------+---------+--------+---------+------+--------------+-------------+
| price_id | product | shop   | u_price | qty  | shop_address | unit_weight |
+----------+---------+--------+---------+------+--------------+-------------+
|        1 | Beer    | Bob's  |    9.50 | 12.0 | Main Road 5  |       15.00 |
|        2 | Pepper  | Bob's  |   19.50 |  2.5 | Side Road 10 |        2.00 |
|        3 | Beer    | Jill's |    3.50 |  6.0 | Main Street 1|        1.50 |
|        4 | Pepper  | Jill's |    8.50 | 30.0 | Main Road 1  |       20.00 |
|        5 | Salt    | Jill's |   27.50 | 3.14 | Main Road 10 |      250.00 |
+----------+---------+--------+---------+------+--------------+-------------+

The problem becomes apparent when we examine the values of the shop_address and unit_weight fields against the above-mentioned candidate key. The shop_address value should be identical in all rows with the same shop value, and the unit_weight should be identical for the same product values. However, this design allows us to specify different shop addresses and unit weights for the same shops and products, which in reality makes no sense.

So, to turn this into a 2NF design, you move these fields out of the table into their own tables, creating FKs to link the new tables to the main sale table.

+-----------+     +--------------+     +--------------+
| shop      |     | prices       |     | product      |
+-----------+     +--------------+     +--------------+
| name (PK) |1-|  | price_id     |  |-1| name (PK)    |
| address   |  |  | product      |*-|  | weight       |
+-----------+  |-*| shop         |     +--------------+
                  | unit_price   |
                  | qty          |
                  +--------------+

This ensures that all prices entries are linked to proper address and weight values.
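The 2NF design above can be sketched with SQLite (a minimal illustration; the sqlite3 module and sample data are my own, names follow the diagram). The shop address and the product weight now live in exactly one place, and every prices row reaches them through its FKs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shop    (name TEXT PRIMARY KEY, address TEXT);
CREATE TABLE product (name TEXT PRIMARY KEY, weight REAL);
CREATE TABLE prices (
    price_id   INTEGER PRIMARY KEY,
    product    TEXT REFERENCES product (name),
    shop       TEXT REFERENCES shop (name),
    unit_price REAL,
    qty        REAL
);
""")

conn.execute("INSERT INTO shop VALUES ('Bob''s', 'Main Road 5')")
conn.execute("INSERT INTO product VALUES ('Beer', 15.00)")
conn.execute("INSERT INTO prices VALUES (1, 'Beer', 'Bob''s', 9.50, 12.0)")

# Two JOINs reassemble the original flat row, but a shop's address
# can no longer disagree with itself between rows.
row = conn.execute("""
    SELECT p.shop, s.address, p.product, pr.weight, p.unit_price
    FROM prices p
    JOIN shop s     ON s.name = p.shop
    JOIN product pr ON pr.name = p.product
""").fetchone()
print(row)  # ("Bob's", 'Main Road 5', 'Beer', 15.0, 9.5)
```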

The Third Normal Form (3NF)
The last of the forms needed for a database to be considered normalized. It requires that columns depend on nothing but the primary key of the table. Basically, any column whose value is determined by something other than the PK — or only partially by the PK — should be moved out of the table.

Let's look at two examples:

First, suppose we add a “country” column to the “person” table we created for the 1NF.

+-----------+------+----------+
| person_id | name | country  |
+-----------+------+----------+
|         1 | Joe  | France   |
|         2 | Anna | England  |
+-----------+------+----------+

This looks fine at first glance, but if you look closer, the name of the country does not belong in this table.

Consider what happens if you are asked to list ALL possible countries (e.g. for a drop-down box on a website). Because a country can only exist as part of a person row, this design only lets us list the countries of people already in the database. A country without a representative in the person table cannot exist in the database at all.

The solution for this is to move the country out of the table, into its own table, and add a Foreign Key to the person table that references the country the person belongs to:

+------------+     +------------+
| person     |     | country    |
+------------+     +------------+
| person_id  |  |-1| country_id |
| name       |  |  | name       |
| country_id |*-|  +------------+
+------------+

Now you can list all the countries in the country table, and just link the persons to their respective countries.

Second, to address another common 3NF conformance issue, suppose we wanted to store more than just the country name for each person. If we added the city and address info as well, it might look something like this:

+-----------+------+----------+--------+------------------+
| person_id | name | country  | city   | address          |
+-----------+------+----------+--------+------------------+
|         1 | Joe  | France   | Paris  | Eiffel blwd. 101 |
|         2 | Anna | England  | London | English road 302 |
+-----------+------+----------+--------+------------------+

This is basically the same problem we had with the previous example, but we have added two more columns, neither of which belongs in this table either. So, let's try to apply the same solution as last time, but this time move all three location items into a location table, and link the location to the person:

+-------------+     +-------------+
| person      |     | location    |
+-------------+     +-------------+
| person_id   |  |-1| location_id |
| name        |  |  | country     |
| location_id |*-|  | city        |
+-------------+     | address     |
                    +-------------+

This is better, but can you spot the problem?

As you remember, to reach 3NF no field may depend on anything but the PK of the table. But look at the location table: “address” depends not only on the PK, but on the “city” field as well (if the city changes, so must the address). The same applies to the city field; it depends on the country field.

This means that, in order to reach 3NF, you need to move country, city and address all into their own tables, each linked to each other. The address table should be linked to the person, the city table linked to the address table, and finally the country table linked to the city table.

+-------------+     +------------+
| person      |     | address    |
+-------------+     +------------+
| person_id   |  |-1| address_id |
| address_id  |*-|  | city_id    |*-|
| name        |     | name       |  |
+-------------+     +------------+  |
                                    |
+------------+     +------------+   |
| country    |     | city       |   |
+------------+     +------------+   |
| country_id |1-|  | city_id    |1--|
| name       |  |-*| country_id |
+------------+     | name       |
                   +------------+

Now this design is in 3NF, and could be considered “normalized”.
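The full 3NF design can be sketched with SQLite (a minimal illustration; the sqlite3 module and sample data are my own, and I use an address_id FK in person where the diagram's naming varies). Note how reassembling the original flat row now takes three JOINs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE city (
    city_id    INTEGER PRIMARY KEY,
    country_id INTEGER REFERENCES country (country_id),
    name       TEXT
);
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    city_id    INTEGER REFERENCES city (city_id),
    name       TEXT
);
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    address_id INTEGER REFERENCES address (address_id),
    name       TEXT
);
""")

conn.execute("INSERT INTO country VALUES (1, 'France')")
conn.execute("INSERT INTO city VALUES (1, 1, 'Paris')")
conn.execute("INSERT INTO address VALUES (1, 1, 'Eiffel blwd. 101')")
conn.execute("INSERT INTO person VALUES (1, 1, 'Joe')")

# Three JOINs rebuild the flat person/country/city/address row.
row = conn.execute("""
    SELECT p.name, co.name, ci.name, a.name
    FROM person p
    JOIN address a  ON a.address_id  = p.address_id
    JOIN city ci    ON ci.city_id    = a.city_id
    JOIN country co ON co.country_id = ci.country_id
""").fetchone()
print(row)  # ('Joe', 'France', 'Paris', 'Eiffel blwd. 101')
```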

Epilogue
It should be mentioned that in some cases, putting your tables into 3NF can have a negative effect on the performance and usability of the table — as in the last example. Before normalizing, you could query all the data from a single table with a simple SELECT. After normalizing, fetching the exact same result requires three JOINs.

My point is this: normalization is a design goal. It may not always be the practical solution; in certain situations it may in fact be impractical to design fully normalized tables. If performance and resource conservation matter a great deal more to you than data integrity, you may well be better off aiming for a lower NF level. I would advise you to ALWAYS make sure your tables are in 1NF, but beyond that you must judge the situation for yourself.

I can only say that in 99% of cases, you are better off at least trying for 3NF. It may spare you some otherwise wasted effort later on.

In any case, I hope this has been helpful.

How to Connect to Oracle When You Forgot Your Password

Reset Oracle Password

 
Step 1: Open the SQL command line and enter the following command:

SQL> conn sys/manager as sysdba;
Connected.

Step 2: To reset a user's password (here SYS, but the same works for any other user), run the following statement:

SQL> alter user sys identified by manager;
User altered.

Or

Your password file should be under <orahome>\database\PWD<SID>.ora.

Delete it and run the Oracle password utility from the command prompt:

C:\Oracle\ora92\database> ORAPWD file=PWD<SID>.ora password={password} entries={however many}

The {password} is your new SYS password. After you log in as SYS you can change it and create new passwords for SYSTEM.

What is Metadata ?

Data about data. Metadata describes how, when and by whom a particular set of data was collected, and how the data is formatted. Metadata is essential for understanding information stored in data warehouses and has become increasingly important in XML-based Web applications.

Metadata is data that describes other data. Meta is a prefix that in most information technology usages means “an underlying definition or description.”

Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier. For example, author, date created, date modified and file size are examples of very basic document metadata. Having the ability to filter through that metadata makes it much easier for someone to locate a specific document.

In addition to document files, metadata is used for images, videos, spreadsheets and web pages. The use of metadata on web pages can be very important. Metadata for web pages contains descriptions of the page’s contents, as well as keywords linked to the content. These are usually expressed in the form of metatags. The metadata containing the web page’s description and summary is often displayed in search results by search engines, making its accuracy and detail very important since it can determine whether a user decides to visit the site or not. Metatags are often evaluated by search engines to help decide a web page’s relevance, and were used as the key factor in determining position in a search until the late 1990s. The rise of search engine optimization (SEO) towards the end of the 1990s led to many websites “keyword stuffing” their metadata to trick search engines, making their websites seem more relevant than others. Since then search engines have reduced their reliance on metatags, though they are still factored in when indexing pages. Many search engines also try to stop web pages from gaming their system by regularly changing their ranking criteria, with Google being notorious for frequently changing its closely guarded ranking algorithms.

Metadata can be created manually, or by automated information processing. Manual creation tends to be more accurate, allowing the user to input any information they feel is relevant or needed to help describe the file. Automated metadata creation can be much more elementary, usually only displaying information such as file size, file extension, when the file was created and who created the file.
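The elementary, automated kind of metadata described above can be sketched in Python (a minimal example of my own; the throwaway file and its name are arbitrary):

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a throwaway file to inspect.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as f:
    f.write(b"hello metadata")
    path = f.name

# os.stat exposes the automatically-maintained metadata:
# size in bytes and modification time.
info = os.stat(path)
print("size:", info.st_size)
print("modified:", datetime.fromtimestamp(info.st_mtime, tz=timezone.utc))
print("extension:", os.path.splitext(path)[1])

os.unlink(path)  # clean up the throwaway file
```

Manual metadata (author, keywords, descriptions) has no such universal API; it lives in format-specific fields, which is why it can be both richer and less consistent.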

Oracle Enterprise Manager (OEM)

Oracle Enterprise Manager is Oracle’s GUI-based database management tool. It is possible to manage the Oracle environment using SQL and PL/SQL, but OEM gives you a console-based framework to help you with almost all of your day-to-day management activities, including tasks such as backup, recovery, export, import and data loading. It also makes it easier to use the Oracle advisors.

There are two versions of OEM: a single-instance version called Database Control, and a system-wide version called Grid Control that lets you manage all your systems. Many DBAs do not like using GUIs, but it makes sense to use the GUI for the easier day-to-day administration tasks like adding users, resizing tablespaces, etc. I think that knowing both SQL/PL*SQL and the GUI will make you a better DBA in the long run.

The benefits of using the OEM

Out-of-the-box Management: Installing and configuring OEM (both Database and Grid Control) is a breeze. If you use the DBCA to install Oracle, OEM is installed for you automatically, and configuring it is a trivial task.
Web-based Management: OEM can be configured to use either HTTP or HTTPS, so security can be tightened if required. Being web based, all you need to access OEM is a web browser.
Real-Time Monitoring: Normally you have to write scripts or use a third-party tool to perform real-time monitoring, which costs additional money or time; OEM performs real-time monitoring right from the off. The hang-analysis feature new in 10.2 helps during a database hang, when SQL scripts can't get information back to you quickly: by attaching directly to the SGA it bypasses SQL entirely and helps you out when normal SQL would be too slow.
Complete Environment Monitoring: A poorly performing database could be the result of a problem anywhere in the application stack (web server, app server, etc.). OEM can monitor the performance of all the components in the stack, not just the Oracle database. It can monitor:

  • The entire Platform
  • End user experience
  • Systems and web applications availability
  • The extended network components
  • Business transactions
  • Historical data analysis and change tracking
Application Performance Monitoring: Oracle uses alert and notification systems to monitor the database. These alerts can identify problems quickly, enabling a pro-active approach, and thresholds can be easily adjusted within OEM. It can:

  • Monitor the performance and availability of the system
  • Indicate outages and bottlenecks throughout the application system
  • Analyze the root cause of performance problems with drill-downs
  • Minimize application downtime through the use of efficient problem-resolution tools
Scalable Management Capability: OEM is highly scalable; the only requirement is that the management agent be started on any new server.
Consolidated Database Management: The OEM home page contains everything you need for a quick overview of your environment, which means any problems can be identified very quickly; a few more clicks and you have advanced performance charts available.
Integrated System Management: OEM uses Smart Plug-Ins (SPIs) to integrate with system-wide monitoring tools such as HP OpenView.

OEM Database Control

You can automatically install OEM when installing Oracle via the DBCA; just select the “use Database Control for database management” option.

Manually configure OEM:
Unix:    export ORACLE_SID=D01
Windows: set ORACLE_SID=D01

emca -config dbcontrol db

Start dbconsole emctl start dbconsole
Stop dbconsole emctl stop dbconsole
dbconsole status emctl status dbconsole
Start management agent emctl start agent
Stop management agent emctl stop agent
Management agent status emctl status agent

Main OEM Page

For details on the main OEM page, how to navigate it and what is available on it, please refer to the Oracle web site.

OEM Grid Control

OEM Grid Control can manage a number of systems across the environment, which makes for easier management. Grid Control has a number of components:

  • Grid Control – the main OEM web-based interface for monitoring the enterprise
  • Management Agent – monitors the host and sends information to the OEM management Service
  • Management Service – This is the middle tier (OMS) and stores the data collected in a repository, which is used by the Grid Control

Note that the grid control product is a licensed product which you can obtain from Oracle.

If possible it is best to install Grid Control on a server of its own; the number of servers you wish to monitor determines the performance and disk space required for that server.

There are a number of commands that can be used to control the Oracle Management Services (OMS)

Status # emctl status oms
Start # emctl start oms
Stop # emctl stop oms

To install Grid Control follow below

Grid Control Installation
  1. Make sure that the Oracle CD is in the CD Drive and run the following

    ./runInstaller

  2. At the Welcome screen click Next
  3. Accept the default Operating System group Name, click Next
  4. You now need to run a script called orainstRoot.sh as user root in another terminal session.

    # cd $ORACLE_BASE/oraInventory
    # ./orainstRoot.sh

  5. Now click continue back in the Installer window
  6. Now choose the directory for the OEM files and click Next
  7. If using a new server select the option that creates a new Oracle database, click Next
  8. It will now perform prerequisite check, click Next
  9. Choose a password for SYSMAN and click Next
  10. Select passwords for SYS, SYSTEM and DBSNMP, click Next
  11. You can alter the Metalink and Proxy information if so desired, click Next
  12. Choose the SID and the Global database name for the database, click Next
  13. Specify the location for all the database files, click Next
  14. Review the components and then click Next
  15. You will now watch while Grid Control is installed, you will be asked to run root.sh as user root
  16. A window will be displayed showing the various configuration assistants, click OK
  17. Run the orainstRoot.sh script again as user root, then click Next
  18. This screen is important as it details all port numbers and links to the web page (make note)
  19. Click Exit to end the installation

To install the OEM Management agent only

Management agent Installation   1-7. Follow the preceding first seven steps, then:

  1. In the select a product window, choose the additional management agent installation type and click Next
  2. Specify the hostname and click Next
  3. Click Next after reading the security page
  4. Review the summary and click Next to start the install
  5. Click Exit to finish the install
  6. In Grid Control click on the Targets tab; you will see all the discovered hosts

What is a Kindle?

A Kindle is a small hand-held electronic device for reading books, developed by online retailer Amazon. Rather as you load an iPod or MP3 player with music, you download books (via wireless technology) on to a Kindle and read them on it. The latest model is the Paperwhite, which has a touchscreen, and is available from December 2012.

Launched in early 2012 was the Kindle Fire, and latterly the Kindle Fire HD, with more functionality than previous Kindles: similar to a tablet device and based on the Android operating system. It has a colour screen and offers features such as the ability to stream video and music, while continuing to serve as an e-reader.

You buy these books as Kindle versions of ‘e-books’ from the Amazon website. In addition, you can also download PDFs of any sort of document, which makes this an ideal way of reading for work and study.

Even though the device is extremely lightweight and portable, you can keep a huge number of books on the Kindle. It has a six-inch screen to read from, plus a small keyboard that lets you perform internet searches and other relatively simple web-oriented activities.

Most people use a wifi connection to download books on to their Kindle. However, newer, more expensive models come with 3G mobile phone technology, which makes it possible to download books no matter where you are. With both versions, you can also download books to a PC or Mac and then transfer them to the Kindle via a USB cable.

Key benefits and features of a Kindle:

  • Stores up to 1,400 books.
  • Special high-contrast screen allows you to read even in bright sunshine with no glare.
  • Clear text and fonts, and a sharp display – what you see resembles a book page.
  • Adjustable text size.
  • Built-in wifi or 3G connection.
  • Battery life of at least a month if wifi is turned off.
  • Page-turning function so you feel as if you’re reading a real book.
  • Integrates with Facebook and Twitter and allows searching on Wikipedia.
  • Never heats up like a laptop.
  • Books can be categorised or stored as collections.
  • Automatic archive function: you can delete books and download them another time.
  • Screen can be rotated for better viewing of pictures, diagrams, maps, etc.
  • Also able to read newspapers, magazines and web pages.
  • In-built keyboard enables searching within a book, a library or online.
  • You can add your own annotations to the text.
  • Automatic bookmark.
  • Built-in Oxford English Dictionary.
  • Password protection.

Move datafile to NEW Location

SQL> select * from v$version;

BANNER

——————————————————————————–

Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 – 64bit Production

PL/SQL Release 11.2.0.2.0 – Production

CORE    11.2.0.2.0      Production

TNS for Linux: Version 11.2.0.2.0 – Production

NLSRTL Version 11.2.0.2.0 – Production

SQL>

SQL> select name from v$datafile;

NAME

——————————————————————————–

/opt/oracle/data/DEV3/system01.dbf

/opt/oracle/data/DEV3/sysaux01.dbf

/opt/oracle/data/TEST01.dbf

Here my TEST tablespace datafile is not in the DEV3 location, so I want to move it into the DEV3 directory.

Here I’m testing in my DEV database, but make sure no one is using this tablespace if you do this in a production database.

1)      Make the corresponding Tablespace OFFLINE

        SQL> ALTER tablespace TEST offline;

Tablespace altered.

2)      Update the data dictionary pointing to new location using below command:

         SQL> ALTER DATABASE RENAME FILE '/opt/oracle/data/TEST01.dbf' TO '/opt/oracle/data/DEV3/TEST01.dbf';

Database altered.

3)      Now, recover the datafile from new location:

         SQL> RECOVER DATAFILE '/opt/oracle/data/DEV3/TEST01.dbf';

Media recovery complete.

4)      Bring back the TEST tablespace ONLINE

         SQL>  ALTER tablespace TEST online;

Tablespace altered.

SQL> select name from v$datafile;

NAME

——————————————————————————–

/opt/oracle/data/DEV3/system01.dbf

/opt/oracle/data/DEV3/sysaux01.dbf

/opt/oracle/data/DEV3/TEST01.dbf

Now my TEST datafile has moved to the new location.

Add database to 11g OEM Grid control

Here I have a 11g Grid control running on Linux x86_64.

To add a database in the Grid control, Open OEM grid control page –> targets –> databases

Click on ADD button, then the below screenshot will appear

Select the hostname from the list shown and click continue.

NOTE: The hostnames are displayed only when the server is configured with the Grid server; if not, configure it first.

It will take some time to find and display the targets and the listeners found on that host (server).

Select the target you want to add and the listener associated with it. Then click on configure button for the selected target and you’ll get below screen.

Give ‘dbsnmp’ password and fill all the fields correctly then click next and ok.

Done, now you'll see your target (database) in the OEM Grid database list. Enjoy!

SPOOL command

What is SPOOL?
The SPOOL command in SQL*Plus writes query output to an operating-system file; this is how you transfer data from Oracle tables into flat files.

How to SPOOL from Oracle in CSV format?

SET ECHO OFF
SET HEADING OFF
SET DEFINE OFF
SET FEEDBACK OFF
SET VERIFY OFF
SET SERVEROUTPUT ON
SET PAGESIZE 5000
SET LINESIZE 120

SQL> SPOOL c:\file.csv     (Windows)
SQL> SELECT empname||','||empid||','||empsal FROM employee;
SQL> SPOOL OFF

SET DEFINE ON
SET FEEDBACK ON
SET HEADING ON
SET VERIFY ON

Example: I recently wrote a spool script to set MAXEXTENTS to UNLIMITED for all tables and indexes, because many of them had a NULL max-extents value.

SET ECHO OFF
SET HEADING OFF
SET DEFINE OFF
SET FEEDBACK OFF
SET VERIFY OFF
SET SERVEROUTPUT ON
SET PAGESIZE 5000
SET LINESIZE 120

SQL> SPOOL extent.sql

SQL> SELECT 'alter '||object_type||' '||object_name||' storage (maxextents unlimited);'
     FROM dba_objects WHERE object_type IN ('TABLE','INDEX') AND owner = 'XXX';

SQL> SPOOL OFF

SQL> @extent.sql                       (to execute the generated script)

If you don't specify an extension after the file name (e.g. extent instead of extent.sql), SQL*Plus adds its default extension, .lst, so the output file becomes extent.lst.

If we have only a few tables in the database, instead of writing a spool script we can change them manually one after another:

SQL> ALTER TABLE tab_name STORAGE (MAXEXTENTS UNLIMITED);

Table altered.

Or

SQL> ALTER INDEX ind_name STORAGE (MAXEXTENTS UNLIMITED);

Index altered.

With a single dynamic SQL script, as above, we can make the same change for all the objects at once.

NOTE:

In Linux, the spool file is written to the directory from which you started SQL*Plus.

In Windows, the output file is created at the path you specified in the SPOOL command.

High Water Mark (HWM)

Oracle uses the high water mark (HWM) to identify the highest amount of space ever used by a particular segment. It acts as the boundary between used and unused space. As data grows through row inserts and updates, the segment's HWM rises accordingly. But when you delete rows, the HWM does not come down automatically; extra steps are needed to lower it.

The high water mark is set at the beginning of the segment when the table is created. Deleting rows never resets it, even if every row in the table is deleted.

To reduce the high water mark:
a. Take export of table and re-import back
b. Truncate the table (only if you want to remove data/rows completely)

Suppose we create an empty table; the high water mark sits at the beginning of the table segment:

|<------------------ unused space ------------------>|
^
HWM

When you insert rows into the table, the high water mark is bumped up step by step by the server process performing the inserts:

| used data/rows |<--------- unused space --------->|
                 ^
                 HWM

Now suppose we insert 10,000 rows into the table and later delete 5,000 of them:

| used data | empty blocks |<---- unused space ---->|
                           ^
                           HWM

Full table scan

As seen above, deleting data does not move the HWM. The main disadvantage is that Oracle always reads every block up to the high water mark during a full table scan. You may have noticed that a COUNT(*) on an "empty" table can still take time to return 0 rows; the delay comes from the HWM sitting at a high position.

NOTE: Whenever the optimizer chooses a full table scan, it scans all blocks below the HWM. This degrades performance and slows down the full table scan. To avoid this, we need to shrink the table to reset the HWM.

So, how do we set the high water mark to a lower position?

One simple way to reset the HWM is to truncate the table. Let us see how truncate resets it.

|<------------- no data in the segment ------------->|
^
HWM

After truncating the data, the HWM is reset to the beginning of the segment.

HWMBLOCKS = "blocks containing data" + "empty blocks"

HWM % = ((HWMBLOCKS - "blocks containing data") / HWMBLOCKS) * 100

You can also use DBMS_SPACE.UNUSED_SPACE procedure to determine HWM.

High Watermark = Total blocks – Unused blocks – 1

SQL> select blocks from dba_segments where owner=upper('HR') and segment_name = upper('JOBS');

SQL> analyze table hr.jobs estimate statistics;

SQL> select empty_blocks from dba_tables where owner=upper('HR') and table_name = upper('jobs');
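Putting the formula and the two queries together, here is a hedged sketch of a single query for the HR.JOBS example (it assumes ANALYZE has already populated EMPTY_BLOCKS in DBA_TABLES):

```sql
-- Approximate number of blocks below the HWM for HR.JOBS:
-- total allocated blocks minus never-used blocks minus the segment header.
SELECT s.blocks                       AS total_blocks,
       t.empty_blocks                 AS empty_blocks,
       s.blocks - t.empty_blocks - 1  AS blocks_below_hwm
FROM   dba_segments s
JOIN   dba_tables   t
       ON  t.owner      = s.owner
       AND t.table_name = s.segment_name
WHERE  s.owner        = 'HR'
AND    s.segment_name = 'JOBS';
```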

Ways to reset the HWM

Option 1
ALTER TABLE tab_name MOVE TABLESPACE ts_name;

This option requires rebuilding the indexes, since they become invalid/unusable after the move, and users cannot use the application or reports while the indexes are being rebuilt. Also make sure you stop GoldenGate if this table is in an extract process; otherwise it will ABEND because of the unusable indexes.
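A minimal sketch of option 1; tab_name, ind_name and new_ts are placeholders for your own table, index and tablespace names:

```sql
-- Moving the table rewrites its segment and resets the HWM,
-- but leaves every index on it UNUSABLE.
ALTER TABLE tab_name MOVE TABLESPACE new_ts;

-- Rebuild each index afterwards to make it usable again.
ALTER INDEX ind_name REBUILD;
```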

Option 2
  1. Export the data
  2. Truncate the table
  3. Import the data back
  4. Analyze the table

Option 3
  1. Copy the table data
  2. Truncate the original table
  3. Insert the data back
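Option 3 can be sketched as follows; tab_name and tab_copy are placeholders, and the copy-and-reload should be done in a maintenance window so no DML is lost in between:

```sql
-- 1. Copy the data aside.
CREATE TABLE tab_copy AS SELECT * FROM tab_name;

-- 2. Truncate the original table; this resets the HWM.
TRUNCATE TABLE tab_name;

-- 3. Insert the data back and drop the working copy.
INSERT INTO tab_name SELECT * FROM tab_copy;
COMMIT;
DROP TABLE tab_copy;
```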

Option 4
  Use the DBMS_REDEFINITION package to copy the table (redefining the table structure)

People are often confused about resetting the HWM with the ALTER TABLE ... DEALLOCATE UNUSED clause. This clause only frees unused space above the high water mark; it cannot lower the HWM.

To determine the exact number of blocks that contain data i.e. space used by table below the high water mark, query rowid and get the first and last used block from the rowid.

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) "used blocks" from table_name;

Starting with Oracle 10gR1 we can use a new feature for adjusting the high water mark, called segment shrinking. It is only possible for segments that use ASSM, in other words, segments located in tablespaces that use Automatic Segment Space Management.
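A minimal segment-shrink sketch, reusing the HR.JOBS table from the earlier example (an ASSM tablespace is assumed, and row movement must be enabled first):

```sql
-- Shrink the segment and lower the HWM in place (10g+, ASSM only).
ALTER TABLE hr.jobs ENABLE ROW MOVEMENT;
ALTER TABLE hr.jobs SHRINK SPACE;
```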

How to View High Water Mark – Step-by-Step Instructions (Doc ID 262353.1)

Finding current SCN of a database

There are two ways to get the current SCN (System Change Number) of an Oracle 10g or 11g database:

the DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER function and the CURRENT_SCN column of V$DATABASE

Method 1: Using the DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER function

SQL>   SELECT TO_CHAR(dbms_flashback.get_system_change_number) FROM dual;

GET_SYSTEM_CHANGE_NUMBER

----------------------------------------

11626841778005

Method 2: Using the CURRENT_SCN column of V$DATABASE

SQL>   SELECT TO_CHAR(CURRENT_SCN) FROM V$DATABASE;

CURRENT_SCN

--------------------

11626841778008