DOEACC Society 2006 DOEACC C Level CE3 - Data Warehousing & Mining - Question Paper

Friday, 14 June 2013 02:00

CE3-R3: DATA WAREHOUSING AND MINING
NOTE:
Time: 3 Hours Total Marks: 100
1. Define and explain the following basic data mining tasks.
a) Classification
b) Clustering
c) Prediction
d) Link Analysis
e) Time Series Analysis
f) OLAP
g) Knowledge Discovery
(7x4)
2.
a) Discuss the life cycle of data warehouse development.
b) Describe the benefits and drawbacks of a source-driven architecture for gathering data at
a data warehouse, as compared to a destination-driven architecture.
(9+9)
3.
a) Define a decision tree. Write and explain the decision tree construction algorithm with a
suitable example.
b) Define and explain the Bayesian classification scheme.
c) Detail the improvements made by either C4.5 or CART in the basic decision tree
algorithm.
(6+6+6)
4.
a) The three kinds of concept hierarchies are: schema hierarchies, set-grouping hierarchies
and rule-based hierarchies. Briefly describe each kind of hierarchy (giving a suitable
example).
b) Suppose that a data warehouse consists of the three dimensions time, doctor and
patient, and the two measures count and charge, where charge is the fee that a doctor
charges a patient. Draw a schema diagram for the data warehouse using a snowflake
schema.
c) How can rules be extracted from a decision tree?
(6+6+6)
5.
a) Suppose half of all the transactions in a clothes shop are for the purchase of jeans, and one
third of all transactions in the shop are for the purchase of T-shirts. Suppose also that half of
the transactions that are for the purchase of jeans are also for the purchase of T-shirts. Write
down all the (nontrivial) association rules you can deduce from the above information, giving
the support and confidence of each rule.
CE3-R3 Page one of two July, 2006
Note 1: Answer question 1 and any FOUR questions from 2 to 7.
Note 2: Parts of the same question should be answered together and in the same
sequence.
b) Compare the advantages and disadvantages of (i) K-means and (ii) K-medoids for
clustering. Explain a major challenge common to both the K-means and K-medoids
algorithms.
c) Why is the NBC algorithm called naive? Discuss.
(6+6+6)
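The arithmetic behind question 5(a) can be checked with a short calculation. The sketch below is only an illustration, not a model answer: it assumes the usual definitions (support of a rule is the fraction of all transactions containing both items, confidence is support divided by the fraction containing the antecedent), and the variable names are made up for this example.

```python
from fractions import Fraction

# Figures stated in question 5(a)
p_jeans = Fraction(1, 2)               # fraction of transactions containing jeans
p_tshirt = Fraction(1, 3)              # fraction of transactions containing T-shirts
conf_jeans_to_tshirt = Fraction(1, 2)  # half of the jeans transactions include T-shirts

# Support of {jeans, T-shirts}: fraction of all transactions containing both
support_both = p_jeans * conf_jeans_to_tshirt

# Confidence of the reverse rule: support of both / fraction containing T-shirts
conf_tshirt_to_jeans = support_both / p_tshirt

rules = {
    "jeans => T-shirts": (support_both, conf_jeans_to_tshirt),
    "T-shirts => jeans": (support_both, conf_tshirt_to_jeans),
}
for rule, (support, confidence) in rules.items():
    print(f"{rule}: support={support}, confidence={confidence}")
```

Exact fractions are used instead of floats so that the values can be compared directly with a hand calculation.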
6.
a) Why is tree pruning useful in tree induction? What is the drawback of using a separate set
of samples to evaluate pruning?
b) Compare the advantages and disadvantages of eager classification versus lazy
classification. Classify the following techniques into eager and lazy classification:
K-nearest neighbor, decision tree, Bayesian, neural network, case-based reasoning.
c) A data warehouse consists of the four dimensions date, spectator, location and game, and
the two measures count and charge, where charge is the fare that a spectator pays
when watching a game on a given date. Spectators may be students, adults or seniors,
with each category having its own charge rate. Draw a star schema for the data
warehouse.
(6+6+6)
7. Write short notes on:
a) Data Mining Query Language
b) Iceberg queries
c) Mining Spatial Databases
(3x6)

