Uttar Pradesh Technical University (UPTU) 2007 M.C.A Data Mining & Warehousing - Question Paper
Printed Pages : 3 MCA - 404(5)
PAPER ID : 1479
(Following Paper ID and Roll No. to be filled in your Answer Book)
Roll No.
(SEM. IV) EXAMINATION, 2006-07 DATA MINING & WAREHOUSING
Time : 3 Hours] [Total Marks : 100
Note : Attempt all questions.
1 Attempt any two : 2x10=20
(a) (i) Define a data warehouse.
(ii) How is a data warehouse different from a database? How are they similar?
(b) (i) What are the different phases of knowledge discovery from databases?
(ii) What does data mining attempt to facilitate?
(c) What are schemas? Discuss the various schemas used in a data warehouse.
2 Attempt any two : 2x10=20
(a) What is OLAP? Explain various OLAP operations used on a data cube.
(b) Apply the Apriori algorithm to the following data set and find the maximal frequent item set. Item sets having support of 20% or more are frequent item sets.
Show two association rules that have a confidence of 70% or greater.
Trans ID | Items
10 | 11, 12, 14
20 | 11, 17
30 | 17, 15
40 | 11, 12, 15
50 | 16, 14
60 | 16
70 | 16, 17
80 | 11, 12, 13, 14
90 | 13, 15
00 | 11, 12
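A minimal sketch of how part (b) could be worked, offered as an illustration and not as a model answer. It assumes the item labels are exactly as printed in the table, that the ten rows are the whole data set (so 20% support means at least 2 transactions), and that the helper names `support`, `apriori` and `rules` are illustrative only.

```python
from itertools import combinations

# Rows of the transaction table, in printed order; item labels copied as printed.
transactions = [
    {11, 12, 14},
    {11, 17},
    {17, 15},
    {11, 12, 15},
    {16, 14},
    {16},
    {16, 17},
    {11, 12, 13, 14},
    {13, 15},
    {11, 12},
]
MIN_SUPPORT = 0.20      # threshold stated in the question
MIN_CONFIDENCE = 0.70   # threshold stated in the question


def support(itemset):
    """Fraction of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori():
    """Return {frozenset: support} for every frequent itemset."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        level = [c for c in candidates if support(c) >= MIN_SUPPORT]
    return frequent


def rules(frequent):
    """Return (antecedent, consequent, confidence) for rules meeting MIN_CONFIDENCE."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[lhs]
                if confidence >= MIN_CONFIDENCE:
                    found.append((set(lhs), set(itemset - lhs), confidence))
    return found


if __name__ == "__main__":
    frequent = apriori()
    print("Largest frequent itemset:", set(max(frequent, key=len)))
    for lhs, rhs, confidence in rules(frequent):
        print(lhs, "->", rhs, f"confidence = {confidence:.0%}")
```

Under these assumptions, the largest frequent item set is {11, 12, 14} with 20% support, and rules such as 12 -> 11 (confidence 100%) and 11 -> 12 (confidence 80%) meet the 70% threshold.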
(c) Define association rule mining. How does market basket analysis form association rules? Discuss the basic concepts.
3 (a) What are classification rules and how are decision trees related to them ?
(b) What is data classification? How does it differ from prediction ?
(c) Describe the ID3 algorithm for decision tree construction. Why is it unsuitable for data mining applications?
(d) How can hypothesis testing and refinement tasks be done in data mining using genetic algorithms?
(e) Describe neural network techniques for data mining. What are the main difficulties in using these techniques ?
(f) What is Bayesian classification ? How does it classify the input data ?
4 Attempt any two parts : 2x10=20
(a) How does clustering differ from classification?
(b) What are supervised and unsupervised learning ? Why is clustering known as unsupervised learning ?
(c) Describe genetic algorithms as a data mining technique. What are the main difficulties in using these techniques ?
5 Attempt any two parts : 2x10=20
(a) What is the backpropagation neural network topology? How is it used in classification?
(b) Compare hierarchical and non-hierarchical clustering algorithms. Explain the advantages and disadvantages of each over the other.
(c) Write the differences between nearest neighbour data mining techniques and clustering.
V-1479] 3 [ 3165 ]