Monday, March 7, 2011

what is an Enterprise Data Model (EDM)?

An enterprise data model (EDM) contains entities, attributes, and relationships that represent the information needs of the entire enterprise.

What is ERwin Data Modeler?

ERwin is a graphical data modeling tool that lets a modeler turn information requirements and business rules into logical and physical data models.

Define data warehouse?

A data warehouse is a decisional database system.

It is designed to support the decision makers in the organization in ways a transactional processing system is ill-equipped to handle, such as the strategic-level goals and visions of an organization.

what is oracle warehouse builder?

The Oracle Warehouse Builder (OWB) is a tool provided by Oracle that can be used at every stage of the implementation of a data warehouse right from the initial design and creation of the table structure to ETL and data-quality auditing.

Wednesday, March 2, 2011

Permanent Journals work along with backup to make it useful?

True

Permanent Journal tables work along with the archive options.

When the DBA archives or backs up a Permanent Journal, its current
contents are deleted.

Do we create a journal / journal table when we create a table?

False.

A journal (journal table) is created when the database or user is
created. We can also create one later using the MODIFY statement of
the database or user.

At table creation time we just specify
1. whether journaling is enabled or disabled for that table
2. which journal table is to be used
3. which type of journaling is to be used (Before / After / Dual Before / Dual After)

what is a permanent journal and explain different types?

A Permanent Journal captures an image of any row that changes because of an INSERT, UPDATE, or DELETE. It does not double the data or PERM space of a table; it duplicates only the rows that change.

There are four image options for the Permanent Journal:

· Before Journal

· After Journal

· Dual Before Journal

· Dual After Journal

Create a table with Fallback, Before Journal and Dual After Journal features?

create table student ,
fallback,
before journal,
dual after journal
(
sno integer,
sname varchar(30),
sfee decimal(10,2)
)
unique primary index(sno);

what are the different image options for Permanent Journal?

There are four image options for the Permanent Journal:

· Before Journal

· After Journal

· Dual Before Journal

· Dual After Journal

Permanent Journals are automatically purged in teradata. True / False?

False.

The Permanent Journal must be manually purged from time to time.

Which objects require Perm space in Teradata?

Tables and stored procedures require Perm space.

Views, macros, and triggers don't require Perm space.

Do Permanent Journals and secondary indexes require Perm space?

Ans: Yes

useful URLs?

http://www.coffingdw.com/Teradata_Basics/teradata_basics.htm#chapter_9___advanced_topics_that_you_will_be_tested_on/teradata_cabinets_nodes_vprocs_and_disks.htm

Thursday, February 10, 2011

Can you list some metrics used in the Physical model / ELDM?

Join Access Frequency

Value Access Frequency

Change Rating

Distinct Rows

Max Rows NULL

Max Rows per Value

How do you choose a Primary Index for a table in Teradata?

A great Primary Index will have:

A Value Access frequency that is high
A Join Access frequency that is high
Reasonable distribution (ideally unique or close to it)
A change rating below 2

What is ELDM (Extended Logical Data Modeling)?

ELDM is very important in Teradata data modeling. It comes after logical modeling
and before physical modeling of the database.

1. ELDM needs extra care; otherwise wrong inputs will go into physical modeling.
2. The ELDM becomes our final input into the Physical Data Model.
3. In developing any Teradata physical database, we go through
Logical Design
ELDM
Physical Design
4. The Logical Data Model is the input to the Extended Logical Data Model.
The Extended Logical Data Model is the input to the Physical Data Model.

5. Data demographics are very important. We get them by questioning the logical data modelers.

6. Logical Data Modeling:
The LDM is a logical representation of the tables that reside in the data warehouse database.
Once the tables are defined, related, and normalized, the Logical Data Model is complete. Teradata recommends third normal form tables, although we can also stop at 2NF.

The completed Logical Data Model is used as the basis for developing the Extended Logical Data Model (ELDM).

7. The ELDM includes information about physical column access and data demographics for each table: data distribution, sizing, and access. It also serves as input to the implementation of the Physical Data Model.

The two major components of an ELDM are the following:

COLUMN ACCESS in the WHERE clause or JOIN clause.

DATA DEMOGRAPHICS.

8. Physical Database Design is the portion where we choose our Primary Indexes and Secondary Indexes.

Wednesday, February 9, 2011

When might you get error 10065 WSA E HostUnreach?

This is mainly because of the firewall on your computer,

and we may see this message:
C:\Documents and Settings\suresh>pdestate -a
PDE state is RUN/STARTED.
DBS state is 7: System is operational without PEs - Sessions are not allowed

After turning off the firewall, if you still have the issue then we may need to
bring the PE ONLINE using MULTITOOL.

After logging on to Multitool, set the PE state using Tools --> Vproc Manager.

Vproc Manager opens. Select the tab Set Vproc State,

then set the offline PE to online. Now we see:

C:\Documents and Settings\suresh>pdestate -a
PDE state is RUN/STARTED.
DBS state is 4: Logons are enabled - Users are logged on

How do you compress tables in Teradata?

We specify the COMPRESS keyword on a column in the CREATE TABLE
statement.

We can't compress the following:

1. Primary Index columns
2. NULLs in a NOT NULL column (there are none to compress; listed values can still be compressed)

what is a Queue Table?

A queue table is a persistent database table with the properties of an asynchronous
first-in-first-out (FIFO) queue

The queue table is different from a standard base table in that a queue table always contains a user-defined Queue Insertion Timestamp (QITS) as the first column of the table.

Can you expand BYNET?

BYNET: Banyan Network (high-speed interconnect)

What are EBCDIC and FIPS?

EBCDIC: Extended Binary Coded Decimal Interchange Code
FIPS: Federal Information Processing Standards

what is TDGSS in teradata?

TDGSS : Teradata Generic Security Services

what is RSS in teradata?

The Resource Sampling Subsystem (RSS) collects resource usage
data in Teradata.

what is Normalization?

Normalization is the process of reducing a complex data structure into a simple, stable one. Generally this process involves removing redundant attributes, keys, and relationships from the conceptual data model.

How is Teradata actively available?

Teradata provides “single system” availability through its support for
large cliques,
hot standby nodes,
and fallback.

What is TASM in Teradata?

TASM: Teradata Active System Management
Teradata ASM is a portfolio of products that enables the real-time system management required for delivering Active Enterprise Intelligence.

Teradata ASM assists the database administrator in analyzing and establishing workloads and resource allocation to meet business needs. Teradata ASM facilitates monitoring workload requests to ensure that resources are used efficiently and that dynamic workloads are prioritized automatically.

Teradata ASM also provides state-of-the-art techniques to visualize the current operational environment and to analyze long-term trends. Teradata ASM enables database administrators to set SLGs, to monitor adherence to them, and to take any necessary steps to reallocate resources to meet business objectives.

What do the 6 A's that Teradata supports mean?

Active Load
Active Access
Active Events
Active Workload Management
Active Enterprise Integration
Active Availability

what tools can be used for Active Load in teradata?

ETL Tools can use queue tables and triggers, and use FastLoad, MultiLoad and TPump utilities

Is it true that Teradata Warehouse provides both Strategic Intelligence and Operational Intelligence?

Yes.

Teradata Warehouse provides both Strategic Intelligence and Operational Intelligence.

1. It supports strategic decision-making.
2. It supports front-line or operational decision-making.

Difference between Strategic Queries and Tactical Queries?

Strategic Queries
Strategic queries represent business questions that are intended to draw strategic advantage from large stores of data.
Strategic queries are often complex queries, involving aggregations and joins across multiple tables in the database. They are sometimes long-running and tend not to have a strict service level expectation.
Strategic queries are sometimes ad hoc. They may require significant database resources to execute and they are often submitted from third-party tools.

Tactical Queries
Tactical queries are short, highly tuned queries that facilitate action-taking or decision making in a time-sensitive environment. They usually come with a clear service level expectation and consume a very small percentage of the overall system resources.
Tactical queries are usually repetitively executed and take advantage of techniques such as request (query plan) caching and session-pooling.

what are the main Characteristics of Active Data Warehouse?

An active data warehouse:
• Provides a single up-to-date view of the enterprise on one platform.
• Represents a logically consistent store of detailed data available for strategic, tactical, and event driven business decision making.
• Relies on timely updates to the critical data - as close to real time as needed.
• Supports short, tactical queries that return in seconds, alongside of traditional decision support.

What are the main characteristics of a data warehouse?

The data in a data warehouse is:
• Subject-oriented
• Integrated
• Usually identified by a timestamp
• Nonvolatile, that is, once loaded it is not updated or removed

what is row?

A row is one instance of all the columns in a table

what is derived table?

A special type of temporary table is the derived table. You can specify a derived table in an SQL SELECT statement. A derived table is obtained from one or more other tables as the result of a subquery. The scope of a derived table is only visible to the level of the SELECT statement calling the subquery.

Using derived tables avoids having to use the CREATE and DROP TABLE statements for
storing retrieved information and assists in coding more sophisticated, complex queries.

what is Volatile table?

If you need a temporary table for a single use only, you can define a volatile table. The definition of a volatile table resides in memory but does not survive across a system restart.

Using volatile tables improves performance even more than using global temporary tables
because the system does not store the definitions of volatile tables in the Data Dictionary.
Privilege checking is not necessary because only the creator can access the volatile table.

what are global temporary tables in teradata?

Global temporary tables are tables that exist only for the duration of the SQL session in which they are used. The contents of these tables are private to the session, and the system automatically drops the table at the end of that session.

However, the system saves the global temporary table definition permanently in the Data Dictionary. The saved definition may be shared by multiple users and sessions with each session getting its own instance of the table.

Is the relational model derived from set theory?

Yes
The relational model for database management is based on concepts derived from the
mathematical theory of sets.

what is cardinality and degree of relation?

The number of rows is the cardinality of the relation, and the number of columns is the degree
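
A tiny Python sketch of the two terms, using a made-up student relation (the column names and rows are only for illustration):

```python
# Hypothetical relation: column names plus a list of rows (tuples).
columns = ("sno", "sname", "sfee")
rows = [
    (1, "suresh", 1500.00),
    (2, "anita", 1750.50),
    (3, "ravi", 1200.00),
    (4, "meena", 1300.00),
]

def cardinality(rows):
    """Number of rows (tuples) in the relation."""
    return len(rows)

def degree(columns):
    """Number of columns (attributes) in the relation."""
    return len(columns)

print("cardinality:", cardinality(rows), "degree:", degree(columns))
```

Adding or deleting rows changes the cardinality; only changing the table definition changes the degree.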

Does Teradata support ADW?

Yes . Teradata supports Active data warehousing

what is RASUI?

RASUI:

Reliability
Availability
Serviceability,
Usability,
Installability


Teradata addresses RASUI.

Friday, February 4, 2011

what is QCD and QCF?

QCD: Query Capture Database

QCF: Query Capture Facility

Teradata VE reads the execution plan, which has been stored in the Query Capture Database (QCD), and converts the plan into a series of icons. To view an execution plan using Teradata VE, the execution plan information must first be captured into the QCD using the Query Capture Facility (QCF)

Difference between INTEGERDATE and ANSIDATE?

INTEGERDATE uses the YY/MM/DD format and
ANSIDATE uses the YYYY-MM-DD format for a date.
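
To see the two display formats side by side, here is a small Python sketch. It only imitates the formatting; the function names are made up for this note. The last helper shows the integer that Teradata stores internally for a DATE, (year - 1900) * 10000 + month * 100 + day.

```python
from datetime import date

def integerdate_format(d):
    """Display a date the INTEGERDATE way: YY/MM/DD."""
    return d.strftime("%y/%m/%d")

def ansidate_format(d):
    """Display a date the ANSIDATE way: YYYY-MM-DD."""
    return d.strftime("%Y-%m-%d")

def internal_integer(d):
    """Integer form of a DATE: (year - 1900) * 10000 + month * 100 + day."""
    return (d.year - 1900) * 10000 + d.month * 100 + d.day

d = date(2011, 2, 4)
print(integerdate_format(d), ansidate_format(d), internal_integer(d))
```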

When mixing AND and OR operators, OR executes first?

False.

When mixing AND with OR in the same WHERE clause, it is important to know that the AND is evaluated first.
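
Python follows the same precedence rule (and binds tighter than or), so the effect can be shown with a small sketch. The function names are made up for illustration; this is not Teradata code.

```python
def without_parentheses(a, b, c):
    # Parsed as: a or (b and c), because AND is evaluated first.
    return a or b and c

def or_forced_first(a, b, c):
    # Parentheses force the OR to be evaluated before the AND.
    return (a or b) and c

# Same inputs, different results depending on evaluation order.
print(without_parentheses(True, False, False))
print(or_forced_first(True, False, False))
```

So when a WHERE clause needs the OR applied first, put it in parentheses.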

Write the NOT EQUAL operator in Teradata?

In the WHERE clause we can use the not-equal operator

<>

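Teradata SQL is Case Sensitive. True / False?

False by default. In Teradata session mode, character comparisons are not case specific. In ANSI session mode they are.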

How do you know the version of your Teradata?

We can use the DBCINFO information. It is a system view stored
in the DBC database.

SELECT * FROM DBC.DBCINFO;

How many columns can be used in Secondary Index creation?

An SI can consist of up to 16 columns until V2R6 and up to 64 columns after that.

Is the total time to complete an SQL request based on the fastest AMP?

No.

The time to process a single user request is always based on the slowest AMP.

Do more columns in a PI (UPI/NUPI) mean more performance?

Not necessarily. There is a downside to using several columns for a PI. The PE needs every data value for each column as input to the hashing calculation to directly access a particular row. If a single column value is missing, a full table scan will result because the row hash cannot be recreated.

How many columns can be used in Primary Index creation?

The Primary Index can consist of up to 16 different columns prior to V2R6 and up to 64 columns from V2R6 onward.

How do I know which database I am currently in?

Use the command:

HELP SESSION or SELECT DATABASE

What is the DATABASE command?

When a user logs on to the Teradata system, he is placed in a default database,
which is specified when the user is created.

He can access objects in this database by name alone.

If he wants to access objects in other databases, then he has
to prefix the object name with the database name.

Instead of doing that, he can use the DATABASE command to change to
a particular database and access all its objects by their names.

Syntax : database databasename;

Ex: database salesdb;

What is COP in Teradata?

Communication Processor (COP) - The Communication Processor is an older term used on the DBC/1012.

It was used to communicate with network-attached hosts (DOS PC / UNIX).

What is TDP in teradata?

The Teradata Director Program(TDP) is used by the mainframe HOST to communicate with the Teradata system.

It manages all traffic between the Call Level Interface (CLI) and the Teradata System. Its functions include session initiation and termination, logging, verification, recovery, and restart, as well as physical input to and output from IFPs and COPs.

write difference between user and database in teradata?

In Teradata, a user is the same as a database with one exception. A user is able to logon to the system and a database cannot

what is the use of TEMP space?

TEMPORARY (TEMP) space :
A database may or may not have TEMP space, however, it is required if Global Temporary Tables are used.

what is common practice in creating views and tables in teradata?

It is a common practice in Teradata to have some databases with PERM space that contain only tables.

Then, other databases contain only views. These view databases require no PERM space and are the only databases that users have privileges to access.

The views in these databases control all access to the real tables in other databases. They insulate the actual tables from user access

PERM space is a pre-allocated space? True / False?

False. PERM space is not pre-allocated. It is the maximum space allowed
to a user or a database.

As a user or database creates objects (tables, procedures, journals, secondary indexes, etc.), it uses PERM space.

Can we create Macros and view in a database or user with no PERM Space?

Yes. We can. Although a database without PERM space cannot store tables, it can store views and macros because they are physically stored in the Data Dictionary (DD) PERM space and require no user storage space.

The Data Dictionary is in a "database" called DBC.

Thursday, February 3, 2011

How to install Teradata Express editions?

Teradata Express edition is available in two forms:

1. One installs directly on Windows (XP, Vista).
2. The second is a virtual machine (SUSE Linux) which runs on VMware Player.
Here the host machine can be Windows or Linux; the guest OS is SUSE Linux.
But your computer's processor must support Intel VT (Virtualization Technology).
My laptop has an Intel Core 2 Duo T6400 64-bit processor. I am not
able to run the SUSE Linux Teradata because my CPU does not support
Intel VT.

The Teradata Express Edition for Windows works only on a 32-bit operating system.
It does not matter whether your CPU is 32-bit or 64-bit; your OS must be
32-bit.

I have a 64-bit laptop. With 32-bit Windows Vista Home Premium installed,
I was able to install and run Teradata Express Edition 12. But when
I installed the 64-bit Windows Vista OS, I was not able to install it.

So I created a Windows XP virtual machine.

Wednesday, February 2, 2011

What is an OLTP system?

An Online Transaction Processing (OLTP) system is a system whose main
purpose is to capture and store business transactions.

Ex: Oracle 9i/10g/11g, IBM DB2, and Microsoft SQL Server RDBMSs are used as
OLTP systems.

All of the company's business transactions go into this database.

What is a data warehouse?

A data warehouse is a system that retrieves and consolidates data periodically
from the source systems into a dimensional or normalized data store.

It usually keeps years of history and is queried for business intelligence or
other analytical activities.

What is DDS in a data warehouse project?

DDS means Dimensional Data Store. This is the database into which ETL loads
data from the source systems.

Ex: Teradata is used for DDS

Here the data is in a dimensional model.

What are the Teradata purpose-built platform servers?

Teradata manufactures various servers for different purposes;
customers can buy based on their requirements.

Teradata Data Mart Appliance: 540S, 540P

Teradata Extreme Data Appliance: 1550, 1600, 1650

Teradata Data Warehouse Appliance: 2500, 2550, 2650, etc.

Teradata Active Enterprise Data Warehouse Appliance: 5500 (C/H/E), 5550 (C/H), 5555

and

Teradata Extreme Performance Appliance: 4555 (with solid state drives (SSD))

What are the supported operating systems for Teradata?

Windows NT Server / Windows Server 2003

SUSE Linux

MP-RAS Unix

Can you list the backup and restore software used with teradata?

we can use

Symantec : NetBackup

BakBone / Quest Software : NetVault

Tuesday, February 1, 2011

How does Teradata do load balancing?

We don't have to do anything. Teradata automatically balances sessions across all nodes to evenly distribute work across the entire parallel configuration. Users connect to the system as a whole rather than a specific node, and the system uses a balancing algorithm to assign their sessions to a node. Balancing requires no effort from users or system administrators.

How do clients reach the Teradata system?

Each Teradata node has an IP address.

Client machines use the hosts file (on Windows, under the windows/system32/drivers/etc directory), where we
list the Teradata nodes using a specific syntax:

nodeipaddress tdpidcop{1,2,3...etc}

Ex:

172.64.127.51 teradatacop1
172.64.127.52 teradatacop2
172.64.127.53 teradatacop3

The client then tries the entries one by one. It may reach any one node,
and the software in that node assigns a PE from any node after considering
load balancing.
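
If there are many nodes, the hosts entries can be generated instead of typed. Below is a minimal Python sketch following the tdpid + cop + number pattern described above, with one shared tdpid for all nodes. The function name and the tdpid "teradata" are only examples.

```python
def cop_entries(tdpid, node_ips):
    """Build hosts-file lines of the form '<ip> <tdpid>cop<N>', numbering nodes from 1."""
    return [f"{ip} {tdpid}cop{n}" for n, ip in enumerate(node_ips, start=1)]

# Example: three node addresses sharing the tdpid "teradata".
for line in cop_entries("teradata", ["172.64.127.51", "172.64.127.52", "172.64.127.53"]):
    print(line)
```

The client then logs on using the tdpid alone, and the cop1, cop2, ... entries are what get resolved.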

Expand TDPID?

Teradata Director Program Id

what is FALLBACK in teradata?

FALLBACK is a table protection feature used in case an AMP fails. You can use FALLBACK on all tables, some tables or no tables

FALLBACK tables use twice as much disk space as NON-FALLBACK tables.

How many AMPs can be in a cluster?

The minimum number of AMPs per cluster is two, and the maximum is 16.

In most systems, AMPs are clustered in groups of four, because the chances of losing two AMPs out of four are quite low. If one AMP is lost, the other three share the extra work.

The next most popular clustering scheme is a group of three.

What is a cluster in Teradata?

A cluster is a group of AMPs in Teradata.

It is useful whenever an AMP fails.

Using a cluster it is okay to lose ONE AMP.

what is RAID?

RAID :"Redundant Array of Independent Disks (RAID)".

It protects against a disk failure.

There are many levels of RAID in the data storage industry.

The most common level, and one that is used by Teradata, is RAID-1, also called MIRRORING.

With RAID-1, each primary disk has a mirror image, or an exact copy of all its data on another disk. The contents of both disks are identical.

What is a clique in Teradata?

A clique is a group of nodes in Teradata. A clique is formed by connecting
one node's disks to the other nodes via cables,

so when one node goes down, its AMPs and PEs migrate
to another node but still have access to their own disks.

Clique : a group of nodes

Can you expand DBC?

DBC: Data Base Computer

The DBC/1012 Data Base Computer was the system initially introduced by Teradata.

What is the best physical data model for Teradata?

Most EDWs use 3NF in Teradata, i.e. on the corporate-wide database.

It gives a chance to run all types of queries.

If you want to use a star schema or snowflake schema, we can create
data marts from the Teradata CDW.

Explain star schema , FACTS and Dimension tables?

A "Star-Schema" model is comprised of a fact table and a number of dimension tables.

The fact table is a table with a multi-part key. Each element of the key is, itself a foreign key, to a single dimension table. The remaining fields in the fact table are known as facts, and are numeric, continuously valued, and additive. Facts can be thought of as measurements taken at the intersection of all of the dimensions.

Dimension attributes are mostly textual, and are almost always the source of constraints and report breaks.

This model enhances performance on known queries, or in other words, queries users run repeatedly day after day.

Friday, January 28, 2011

What is PDE (Parallel Database Extensions)?

A software interface layer on top of the operating
system that enables the database to operate in a
parallel environment.

Can you list different modules in Parsing Engine?

• Session controller
• Parser
• Optimizer
• Step Generator
• Dispatcher

What is the UPSERT command in Teradata?

In Teradata, we can combine both UPDATE and INSERT statements into
a single statement, which we call UPSERT.

What is the size of BYTEINT, SMALLINT, INTEGER?

BYTEINT - 1 byte - 8 bits ---> -128 to 127

SMALLINT - 2 bytes - 16 bits ---> -32768 to 32767

INTEGER - 4 bytes - 32 bits ---> -2,147,483,648 to 2,147,483,647
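
These ranges follow directly from the byte sizes, since a signed n-bit integer runs from -2^(n-1) to 2^(n-1) - 1. A quick Python sketch of the arithmetic (the function name is made up for this note):

```python
def signed_range(num_bytes):
    """Return (lowest, highest) value of a signed integer stored in num_bytes bytes."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

for name, size in (("BYTEINT", 1), ("SMALLINT", 2), ("INTEGER", 4)):
    low, high = signed_range(size)
    print(f"{name}: {size} byte(s), {low} to {high}")
```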

How to see teradata version?

SELECT * FROM DBC.DBCINFO;

Is there any difference between user and database in teradata?

A database and a user are almost the same in Teradata because both can have Perm and Spool space, which allows them to hold objects. The only difference between a user and a database is that a user has a password and can log on and run queries.

what are disadvantages of PPI?

A two-byte partition number is added to the ROW-ID, which is then called a ROW KEY. The two bytes per row add to the Perm space of a table.

Joins to non-partitioned tables can take longer and become more complicated for Teradata to perform.

Basic select queries utilizing the Primary Index can take longer if the partitioning column is not also mentioned in the WHERE clause of the query.

You can't have a Unique Primary Index (UPI) if the partitioning column is not at least part of the Primary Index. You must therefore create a Unique Secondary Index to maintain uniqueness.

what are advantages of PPI?

Range queries don't have to utilize a Full Table Scan.

Deletions of entire partitions are lightning fast.

PPI provides an excellent solution instead of using secondary indexes

Tables that hold yearly information don't have to be split into 12 smaller tables to avoid Full Table Scans (FTS). This can make modeling and querying easier.

Fastload and Multiload work with PPI tables, but not with all Secondary Indexes

Can you apply a read lock on a table where a write lock is already applied?

No.

At what levels can Teradata apply a LOCK?

Teradata can apply four types of locks at three levels:

1. database
2. table
3. row hash

List different types of LOCKS in teradata?

Teradata can apply four types of LOCKS

1. EXCLUSIVE LOCK
2. WRITE LOCK
3. READ LOCK
4. ACCESS LOCK

What is RANGE_N?

RANGE_N, like CASE_N, can be used to define partitions.

RANGE_N can only be applied to INTEGER or DATE fields.

ex: create table employee(
empno integer,
empname varchar(40),
empsal decimal(10,2),
empjoindate DATE
)
UNIQUE PRIMARY INDEX(empno,empjoindate)
PARTITION BY RANGE_N (
empjoindate between date '2005-01-01' and date '2006-12-31' each interval '1' month,
empjoindate between date '2007-01-01' and date '2008-12-31' each interval '7' day
);
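
To get a feel for how many partitions the two ranges in this example produce, here is a rough Python sketch of the arithmetic. It assumes each range starts on the first day of a month (for the monthly count) and that a final, shorter 7-day range still counts as its own partition. The function names are made up for this note.

```python
from datetime import date

def count_month_ranges(start, end):
    """Number of 1-month ranges covering start..end (start assumed to be the 1st of a month)."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

def count_day_ranges(start, end, days):
    """Number of ranges of `days` days covering start..end; a short last range counts too."""
    total_days = (end - start).days + 1
    return -(-total_days // days)  # ceiling division

monthly = count_month_ranges(date(2005, 1, 1), date(2006, 12, 31))
weekly = count_day_ranges(date(2007, 1, 1), date(2008, 12, 31), 7)
print(monthly, weekly, monthly + weekly)
```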

How many partitions in the given SQL?

Partition by CASE_N
( empSalary < 10000,
empSalary < 20000,
empSalary < 300000,
empSalary < 4000000,
NO CASE OR UNKNOWN)

Ans: 5 Partitions

How many partitions in the given SQL?

Assume the SQL

Partition by CASE_N
( empSalary < 10000,
empSalary < 20000,
empSalary < 300000,
empSalary < 4000000,
NO CASE, UNKNOWN)

Teradata creates 6 partitions.

If the value is not NULL and none of the first 4 cases is satisfied, the row goes
to the "NO CASE" partition,

and NULL values go to the UNKNOWN partition.
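
The counting rule and the row placement can be sketched in Python like this. It is only a toy model of the behaviour described above, not Teradata code; the function names are made up, and None stands in for NULL.

```python
def count_partitions(num_conditions, option=None):
    """Partitions = number of CASE_N conditions + extra partitions for the option used."""
    extra = {None: 0, "NO CASE": 1, "UNKNOWN": 1,
             "NO CASE OR UNKNOWN": 1, "NO CASE, UNKNOWN": 2}
    return num_conditions + extra[option]

def assign_partition(emp_salary, limits=(10000, 20000, 300000, 4000000)):
    """Place a salary for the 'NO CASE, UNKNOWN' variant; None models a NULL salary."""
    if emp_salary is None:
        return "UNKNOWN"
    for number, limit in enumerate(limits, start=1):
        if emp_salary < limit:  # first condition that is true wins
            return number
    return "NO CASE"

print(count_partitions(4, "NO CASE OR UNKNOWN"), count_partitions(4, "NO CASE, UNKNOWN"))
print(assign_partition(15000), assign_partition(5000000), assign_partition(None))
```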

Can you alter the NO RANGE and UNKNOWN partitions on a table?

Yes, if the table is empty. We can alter the NO RANGE and UNKNOWN partitions on empty tables.

Can this SQL be okay now?

create table employee(
empno integer,
ename varchar(20),
empsal decimal(10,2)
)
unique primary index(empno,empsal)
partition by case_n(
empsal<10000,
empsal<20000,
empsal<30000,
NO CASE,
UNKNOWN
);

YES. Now the partition is created using empsal,
which is part of the UPI.

Can this SQL be okay?

create table employee(
empno integer,
ename varchar(20),
empsal decimal(10,2)
)
unique primary index(empno)
partition by case_n(
empsal<10000,
empsal<20000,
empsal<30000,
NO CASE,
UNKNOWN
);

NO. Because you created a unique primary index on the empno
column alone, the partition must be created from one of the columns used in the UPI.

Can we use CASE_N to define partitions?

Yes. A CASE statement is generally used to handle different cases: when a case is true, its value is used in the SQL statement. The same idea can be applied to creating partitions.

For example, rows with empsal < 10000 go to one partition, rows below 20000 to the next, and rows below 30000 to the third:
create table employee(
empno integer,
ename varchar(20),
empsal decimal(10,2)
)
primary index(empno)
partition by case_n(
empsal<10000,
empsal<20000,
empsal<30000,
NO CASE,
UNKNOWN
);

What is the disadvantage of PPIs?

If we retrieve data based on a column which is not
the partitioning column, then the AMP has to search every partition.

The AMP can't apply a single binary search on ROWID, because rows are ordered
by partition first.

Ex: Assume an employee table where the primary index is eno and the partition is created using
deptno.

select * from employee where eno=1000;

The above query goes to one AMP. But within that AMP it can't apply
one binary search (on row hash) to find the row quickly, because the data is ordered
by deptno first.

So we have to be careful.

To handle these situations,

1. we can create a secondary index on that column (if it seems to be a good one)

2. include deptno in the query as well

Ex: select * from employee where eno=1000 and deptno=20;
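
The difference between the two queries can be pictured with a toy Python model of one AMP. The data layout and names here are invented for illustration: rows are grouped by partition (deptno) and sorted by row hash inside each partition.

```python
from bisect import bisect_left

# Hypothetical rows on one AMP of a PPI table: {partition (deptno): sorted row hashes}
amp_rows = {10: [3, 8, 15], 20: [1, 8, 22], 30: [5, 9, 40]}

def find(row_hash, partition=None):
    """Return (partitions probed, partitions where the row hash was found)."""
    targets = [partition] if partition is not None else sorted(amp_rows)
    hits = []
    for p in targets:
        hashes = amp_rows[p]
        i = bisect_left(hashes, row_hash)  # binary search inside one partition
        if i < len(hashes) and hashes[i] == row_hash:
            hits.append(p)
    return len(targets), hits

print(find(22))                # no partition given: every partition is probed
print(find(22, partition=20))  # partition given: only one partition is probed
```

With thousands of partitions, the first form is what makes a PI lookup on a PPI table slower.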

What is the use of a partition?

If you create a PPI on a table, then the data on each AMP is ordered by the partitioning column.

For example, if we partition on deptno, then all dept 10 records are in one place on the AMP,
all dept 20 records in another place, etc.

I.e. it creates partitions.

When you query data on deptno, only the corresponding partition is read.

Must the partition column and the Primary Index be the same?

It depends on whether the Primary Index is unique.

We can create a Non-Unique Primary Index (NUPI) on one column and choose
another column for the partition.

Ex: create an employee table with the primary index on employee number and
the partition on deptno.

If the table has a Unique Primary Index, then you should choose one of the
columns used in the UPI as the partition column.

I.e. if we create a UPI on empno, we can't choose deptno for the partition.


Ex 1: This query is okay
create table employee(eno integer,ename varchar(20),deptno integer)
primary index(eno)
partition by deptno;


Ex 2: This query is not okay
create table employee(eno integer,ename varchar(20),deptno integer)
unique primary index(eno)
partition by deptno;

What is a Partitioned Primary Index (PPI)?

The Partitioned Primary Index (PPI) is a concept introduced in the V2R5 release of Teradata.

Users can define a PPI on a table that has a UPI or NUPI. A PPI distributes rows to AMPs the same way as a primary index, but each AMP orders its data by the partitioning column. With a plain Primary Index, AMPs order data by ROWID. With a PPI, AMPs order data by partition and then by ROWID.

This is very useful for range queries.

For example, users often use an order date range to see orders from the orders table.

Ex : select * from orders where order_date between '01/01/2009' and '12/31/2009';

For these kinds of queries, if we create a PPI on order_date then performance will be
good.

Can a user who is assigned no perm space create any object?

A user with no perm space can't create any tables in his user space. But he can create macros, views, and triggers,

because these objects are stored in the Data Dictionary (DBC) space.

What is the difference between a database and a user in Teradata?

Unlike Oracle, a database and a user are the same in Teradata, except that
a user has a password to log in and a database does not.

Both a database and a user can have perm space and can store
objects.

This is completely different from an Oracle database.

What is the use of spool space?

Spool space is used to store intermediate and final results while AMPs
are running SQL queries.

When does a user query fail because of spool space?

A query may fail because of spool space in two cases:

1. When there is no unused perm space, i.e. the complete system is full

2. When the user query requires more spool space than the user's spool space limit

List different types of spaces in Teradata?

There are three important spaces:

Perm space : Permanent space for storing objects (users, tables, permanent journals, secondary indexes, etc.). Perm space always specifies the upper limit of the space that can be used by a database or user.

Spool space : Space required to run queries. Spool is calculated on a per-user basis. It specifies the upper limit of space a user can use to run his SQL queries. It comes from the unused permanent space of the entire system. As long as there is unused perm space in the system, spool space is available.
Each user has a limit on how much spool space he can use for his queries.
It is also calculated on a per-AMP basis.


Temp space : For Global Temporary Tables while running queries

Who owns the entire space in Teradata when the system is initially delivered?

DBC is the user who owns the entire space.

Thursday, January 27, 2011

How many rows will be returned in case of Primary Index access?

Primary Index access is always a one AMP operation.

In case of a UPI, you may get zero or one row

In case of a NUPI, you may get zero or many rows

A table has a UPI on emp no. I wrote a query that filters the emp table on deptno. How many AMPs get involved?

Even though the table has a Primary Index, my SQL query is written on
dept no, on which there is no index. So it's a full table scan, and all AMPs are involved

How many AMPs are involved in Primary Index access?

It's always one AMP. It doesn't matter whether the table has a UPI or a NUPI.

Accessing data using the Primary Index is always a one-AMP operation

How many AMPs will store a table's data if the Primary Index is created on Gender? Assume only two genders, M or F.

All table data goes to at most two AMPs.

Male data goes to one AMP and female data to another AMP (or to the same AMP if both values hash to it)
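
A small Python sketch shows the skew. It assumes a made-up 8-AMP system and uses MD5 as a stand-in for Teradata's hash function (which is not MD5); the table data is invented for illustration.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # hypothetical system size

def amp_for(pi_value):
    """Pick an AMP from a hash of the primary index value (MD5 is only a stand-in)."""
    digest = hashlib.md5(str(pi_value).encode()).digest()
    row_hash = int.from_bytes(digest[:4], "big")  # keep 32 bits
    return row_hash % NUM_AMPS

# Hypothetical employee table: (emp_no, gender)
rows = [(emp_no, "M" if emp_no % 2 else "F") for emp_no in range(1, 1001)]

# Primary Index on gender: only two distinct values, so at most two AMPs get rows.
by_gender = Counter(amp_for(gender) for _, gender in rows)

# Primary Index on emp_no: unique values spread over all AMPs.
by_emp_no = Counter(amp_for(emp_no) for emp_no, _ in rows)

print("PI on gender :", dict(by_gender))
print("PI on emp_no :", dict(sorted(by_emp_no.items())))
```

With the PI on gender, six or more of the eight AMPs sit idle while one or two do all the work.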

Does Teradata double check the data once it gets the row hash?

Yes. Once the AMP finds a row with the given hash value, it double checks the data for integrity.

Ex: Select * from student where sname='suresh';

The AMP first searches on the row hash from the PE and, once it finds the row, it checks
whether that row has sname 'suresh' or not.

Worried about the hashing? This check is important for integrity, since different values can produce the same row hash.

Does the AMP do a row-by-row search once it has the hash value to retrieve a row?

The AMP receives from the PE the hash value for which data is required.

The AMP applies a binary search to find that row. It never does a
linear search
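
A rough Python sketch of the idea, using the standard bisect module. The rows, hash values, and the lookup function are invented for illustration; the real AMP works on data blocks, not Python lists. After the binary search lands on the hash, the sketch also compares the actual value, which is the double check described earlier.

```python
from bisect import bisect_left

# Rows on one AMP, kept sorted by (row_hash, uniqueness_value).
amp_rows = [
    (1001, 1, "anil"),
    (2040, 1, "suresh"),
    (2040, 2, "ramesh"),   # same row hash as 'suresh', different value
    (3577, 1, "kiran"),
]

def lookup(rows, row_hash, wanted_name):
    """Binary-search to the first row with row_hash, then verify the value."""
    i = bisect_left(rows, (row_hash,))  # O(log n) jump, no linear scan
    found = []
    while i < len(rows) and rows[i][0] == row_hash:
        if rows[i][2] == wanted_name:   # double check the real data
            found.append(rows[i])
        i += 1
    return found

print(lookup(amp_rows, 2040, "suresh"))
print(lookup(amp_rows, 9999, "nobody"))
```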

What is the uniqueness value added by the AMP for a UPI?

It is always 1.

How long is the ROW ID?

64 bits.

The AMP receives the 32-bit row hash and the row data from the PE. The AMP then appends a 32-bit uniqueness value to the row hash to make the 64-bit ROW ID.

The ROW ID is used to sort rows in the AMP.

First the AMP checks whether there is already a row with the same hash value. If not, it appends a 32-bit 1 (0000 0000 0000 0000 0000 0000 0000 0001) to the row hash to make the 64-bit ROW ID. If one more row comes with the same hash value, it appends 2 (0000 0000 0000 0000 0000 0000 0000 0010) to the row hash, and so on.

So how many rows can be inserted with the same value in the column(s) on which the Primary Index is created? (2^32).... After that? I don't know
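
The row hash plus uniqueness value arithmetic can be sketched in Python like this. The class name AmpSketch and the sample hash value are made up; this only shows the bit layout described above, not how an AMP is really implemented.

```python
MASK_32 = 0xFFFFFFFF

def make_row_id(row_hash, uniqueness):
    """Pack a 32-bit row hash and a 32-bit uniqueness value into one 64-bit ROW ID."""
    return ((row_hash & MASK_32) << 32) | (uniqueness & MASK_32)

class AmpSketch:
    """Hands out uniqueness values 1, 2, 3... per row hash (illustrative only)."""

    def __init__(self):
        self.last_uniqueness = {}  # row hash -> last uniqueness value used

    def insert(self, row_hash):
        uniqueness = self.last_uniqueness.get(row_hash, 0) + 1
        self.last_uniqueness[row_hash] = uniqueness
        return make_row_id(row_hash, uniqueness)

amp = AmpSketch()
first = amp.insert(0xABCD1234)   # first row with this hash
second = amp.insert(0xABCD1234)  # second row with the same hash
print(f"{first:016X}")   # ABCD123400000001
print(f"{second:016X}")  # ABCD123400000002
```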

How long is the row hash?

32 bits. The PE generates the 32-bit hash by passing the primary index value to a hash formula.
Of these 32 bits, the first 16 bits are used to find the AMP number from the hash map bucket.

Then the PE sends the 32-bit hash value and the row to the AMP
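
As a sketch of the lookup, here is the same flow in Python. Everything specific is an assumption: zlib.crc32 stands in for Teradata's hash formula, the system has 4 AMPs, and the hash map simply assigns buckets to AMPs round-robin.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size

# Hash map: one entry per 16-bit bucket, each entry holding an AMP number.
HASH_MAP = [bucket % NUM_AMPS for bucket in range(2 ** 16)]

def row_hash(pi_value):
    """32-bit hash of the primary index value (CRC32 is only a stand-in)."""
    return zlib.crc32(str(pi_value).encode()) & 0xFFFFFFFF

def bucket_of(h):
    """The first (high-order) 16 bits of the 32-bit row hash."""
    return h >> 16

def amp_of(h):
    """Look the bucket up in the hash map to get the AMP number."""
    return HASH_MAP[bucket_of(h)]

h = row_hash("suresh")
print(f"row hash {h:08X} -> bucket {bucket_of(h):04X} -> AMP {amp_of(h)}")
```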

How many columns can be used in a Primary Index?

A Primary Index can be created on a single column or on a maximum of 64 columns in V2R5 and V2R6

Can a table have two Primary Indexes?

No. Every table can have only one Primary Index. It can be either
a Unique Primary Index or a Non-Unique Primary Index,

i.e. UPI or NUPI

Where is the Primary Index important?

The Primary Index is involved in three situations

1. Data distribution

2. Finding the fastest path to retrieve data (very important)

3. Joins (very, very important)

Is choosing the Primary Index column important?

The success of a Teradata warehouse starts with choosing the correct column
for the primary index.

Try to choose a column which has unique values, so that data is distributed evenly among
all AMPs. Otherwise skew comes into the picture. The primary index is also useful for getting a straight path to retrieve data

Can you create a table with no Primary Index in Teradata?

No, it's not possible. If you don't specify a primary index explicitly, Teradata creates one for you.

Teradata uses this primary index to distribute data among AMPs

How many BYNETs are in a Teradata system?

Well, BYNET is both hardware and software. The hardware BYNET connects every
PE and AMP in the system, and it can be scaled at any time.

Every Teradata system has two BYNETs, so one serves as backup in case the other fails.

However, both BYNETs can be used at the same time by AMPs or PEs

Can an AMP connect to two disks?

No. One AMP always connects to one virtual disk

Who reads and writes data in Teradata?

The AMP is the hard worker. Each AMP reads and writes data on its own assigned
disks.

How many sessions can a PE handle?

A parsing engine can handle up to 120 user sessions.

So if a Teradata system has 2 PEs, it can handle 240 sessions at a time.

Does your company have more than 240 users who need Teradata connections
at the same time?

Can you explain how the PE and AMPs communicate?

When a user connects to the Teradata database, he opens a session with a parsing engine (PE).

Thereafter, when the user submits a query:

1. First the PE takes the query, checks the syntax, and verifies the user's access rights
2. If everything looks okay, the PE prepares an action plan for the AMPs:
which AMP should respond, which row ID to read, etc.
3. Then the PE sends the action plan to the AMP over the BYNET.
4. Then the corresponding AMP takes the action plan, reads the data, and sends it back to the PE
5. Then the PE sends the data to the user
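
The five steps can be mimicked with a toy Python model. All class names, user names, and the CRC32 hash are assumptions for illustration; there is no real parser, optimizer, or BYNET here, just a method call standing in for the message to the AMP.

```python
import zlib

class Amp:
    """Toy AMP: stores rows keyed by row hash on its 'own disk' (a dict)."""
    def __init__(self):
        self.rows = {}

    def write(self, row_hash, row):
        self.rows.setdefault(row_hash, []).append(row)

    def read(self, row_hash):
        return self.rows.get(row_hash, [])

class ParsingEngine:
    """Toy PE: checks access, builds a one-AMP plan, forwards it to the AMP."""
    def __init__(self, amps, allowed_users):
        self.amps = amps
        self.allowed_users = set(allowed_users)

    def _plan(self, pi_value):
        # Step 2: decide which AMP responds and which row hash to read.
        row_hash = zlib.crc32(str(pi_value).encode())
        return row_hash % len(self.amps), row_hash

    def insert(self, pi_value, row):
        amp_no, row_hash = self._plan(pi_value)
        self.amps[amp_no].write(row_hash, row)

    def select(self, user, pi_value):
        # Step 1: verify access rights (syntax checking is skipped here).
        if user not in self.allowed_users:
            raise PermissionError(f"{user} has no access")
        amp_no, row_hash = self._plan(pi_value)
        # Steps 3 and 4: send the plan to the AMP, which reads and returns rows.
        rows = self.amps[amp_no].read(row_hash)
        # Step 5: hand the rows back to the user.
        return rows

amps = [Amp() for _ in range(4)]
pe = ParsingEngine(amps, allowed_users=["alice"])
pe.insert(7, {"emp_no": 7, "name": "suresh"})
print(pe.select("alice", 7))
```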

Does DBC have any history?

Every Teradata system, when initially shipped to the client, has one user called "DBC" who
is the super DBA and owns all space in the Teradata system.

Teradata had its first success on the DBC/1012 computer, which was developed and manufactured
by Teradata.

Most databases were born to be OLTP. Teradata was born to be parallel.

Do Teradata warehouse users use star or snowflake schemas?

Interesting: because Teradata supports high-speed parallelism and Teradata recommends
3NF tables, most warehouses use third normal form tables instead of star or snowflake schemas.

Then how do OLAP tools get star schemas, which are the heart of them?

For OLAP tools, people create star schemas using views.

How did Teradata get its name?

The first developers' aim for this database was to support terabytes (10^12 bytes) of data, so they named their database Teradata.

They installed their first production copy at Wells Fargo

What is a VPROC in Teradata?

VPROC means virtual processor. It is just a process in the Teradata system that imitates
a processor by taking instructions from other processes.

Teradata has two types of VPROCs

1. PE (Parsing Engine), written in C.
2. AMP (Access Module Processor)

Sometimes the PDE (Parallel Database Extensions) process is also called a VPROC.

What is Teradata?

Teradata is an RDBMS like Oracle or DB2, but built for parallelism and
support for terabytes of data. It was mainly developed for data warehousing. Parallel processing is the heart of Teradata, which comes with a shared-nothing architecture.

Teradata loads data in parallel, exports in parallel, and reads in parallel.

Reads, writes, and everything else are parallel.

Wednesday, January 26, 2011

Tuesday, January 25, 2011

Can you name two macros in Teradata?

SHOW and HELP are two macros

What are the default login details after installing Teradata?

For BTEQ:

TDP ID : localtd
user name: tduser
password: tduser

ex: .logon localtd/tduser,tduser; Press enter

From the Teradata Administrator console,

we can log in using tduser or tdadmin.

tduser and tdadmin are DSNs created in the ODBC connection area

Useful URLs

Teradata Documentation

http://www.info.teradata.com/

http://www.info.teradata.com/templates/eSrchResults.cfm?prodline=&txtpid=&txtrelno=&txtttlkywrd=TTU13.10&rdsort=Title&srtord=Asc&nm=Teradata+Tools+and+Utilities+13.10

http://teradata.uark.edu/doc2.html


http://www.coffingdw.com/

http://banglahouse.tripod.com/tutorial/intro2td.htm

http://www.scribd.com/doc/7266269/Teradata-Best-Practices-Using-a-711


http://www.coffingdw.com/Teradata_Basics/teradata_basics.htm#chapter_9___advanced_topics_that_you_will_be_tested_on/teradata_cabinets_nodes_vprocs_and_disks.htm


http://www.teradatatips.info/

http://teradataquestions.com/

Thursday, January 20, 2011

what is Full table scan (FTS) ?

A Full Table Scan (FTS) is a query that reads every row of a table

How many columns can be used in secondary index?

16?

How many columns can be used in primary Index?

PRIMARY INDEXES may be defined on one column, or on a set of columns viewed as a composite unit. Up to 16 columns may be defined as a Primary Index in older releases; V2R5 and V2R6 allow up to 64.

How many secondary indexes can be created on a table?

We can create up to 32 secondary indexes

How many primary indexes can be created on a table?

Only One

Can you create a secondary index after the table is created?

Yes. We can create secondary indexes (USI / NUSI) at any point in time

Can you create a Primary Index after a table is created?

No, it's not possible. A primary index must be
defined at table creation time. If you do not define one,
the system creates a default one.

We cannot edit the primary index once the table is created. To change a
UPI/NUPI you need to recreate the table

In case of a NUSI, does the AMP have a local subtable?

Yes. In case of a NUSI, each AMP has a subtable
which holds index information only for the rows on that same AMP.

But in case of a USI, the index value of each row is hashed to decide which AMP
stores its entry in the secondary index subtable

Teradata Join Indexes are equivalent to what concept in Oracle?

Join Indexes in Teradata are similar to materialized views in Oracle

List different types of indexes in Teradata?

Primary Indexes: Unique Primary Index(UPI) , Non Unique Primary Index( NUPI)

Secondary Index : Unique Secondary Index(USI), Non Unique secondary Index(NUSI)

Join Index


1. Each table may only have one Primary Index, but every table must have a Primary Index defined.
2. PRIMARY INDEXES may be defined on one column, or on a set of columns viewed as a composite unit. Up to 16 columns may be defined as a Primary Index in older releases; V2R5 and V2R6 allow up to 64.
3. A table can have 32 secondary indexes.
4. Once created, the primary index cannot be dropped or modified; the index must be changed by recreating the table.
5. Unlike the primary index, a secondary index can be added or dropped without recreating the table.

What are the different types of space in Teradata?

There are three types of space with Teradata. They are:
  • Perm Space,

  • Spool Space, and

  • Temp Space

"Perm space" defines the upper limit of space that a database or user can use to hold tables, secondary index sub-tables, and permanent journals

Spool space defines the upper limit of space that a user has to run a query. When a user runs a query, AMPs build the answer set in spool space. Once the query is done, the spool space is released. If the query exceeds the spool space's upper limit, the query aborts. Then, the user is out of spool space.

Temp space defines the upper limit that a user or database can have to hold Global Temporary tables.

What's the difference between a database and a user?

Remember either a database or a user can own space

Unlike other database products, Teradata sees little difference between a user and a database. Both need space to contain or own data. In fact, the only real difference is that a user has a password and he or she can log-on and submit SQL requests.

Both a database and a user can own perm space; therefore both can actually own tables.

what is DBC in System?

Every Teradata system that was ever built has a user called "DBC." The acronym is derived from the first Teradata machine called the "DBC/1012". DBC stands for Database Computer, and 1012 stands for 10 to the 12th power – or a Terabyte. There is no user with greater privileges than the DBC.

How many tables can you join in a single query in Teradata?

Teradata can join up to 64 tables in a single query.

What is the difference between a PRIMARY KEY and a PRIMARY INDEX?

A Primary Key is a logical term used to label column(s) that enforce the uniqueness of each row in a table. PKs determine relationships among tables. A Primary Index is a physical term used to label column(s) that are used to store and locate rows of data

Does every table have a primary index in Teradata?

PI is so important to Teradata functionality that every table in the database is required to have one.
The main uses of the PI are
  1. The Primary Index WILL DETERMINE which rows go to which AMPs; and

  2. The Primary Index is ALWAYS the FASTEST RETRIEVAL method.

If the user doesn't define a PRIMARY INDEX when creating a table, the system will automatically choose one by default. Once it is defined, the PI column cannot be dropped or changed. The table would need to be re-created in order to change the PI.

What is BYNET in teradata?

The BYNET is an internal Teradata network over which the PEs and the AMPs communicate

What forms the relationship between the tables in a relational database?

A key that is common to each table forms it. A "Foreign Key (FK)" is a key in a table that is a Primary Key (PK) in another table. The PK and FK relationship allows the two tables to relate to one another.

Tuesday, January 18, 2011

Can you list some vendors in data mining?

SAS, SPSS, IBM Intelligent Miner, Microsoft SQL Server data mining, Oracle, Angoss, KXEN

What is data mining?

Data mining is about analyzing data and finding hidden patterns using automatic or semiautomatic methods.

Data mining is a key member of the Business Intelligence product family