Jawaharlal Nehru Technological University Kakinada, 2009: IV B.Tech I Semester Regular Examinations, DATA WAREHOUSING AND DATA MINING (Computer Science & Engineering)

Friday, 09 August 2013 10:15


 

Code No: M0502

Set No. 1

 

IV B.Tech I Semester Regular Examinations, November 2009

DATA WAREHOUSING AND DATA MINING

(Computer Science & Engineering)

Time: 3 hours

Max Marks: 80

Answer any FIVE Questions

All Questions carry equal marks

 

 

 

1. (a) Explain data mining as a step in the process of knowledge discovery.

(b) Differentiate between operational database systems and data warehousing. [8+8]

 

2. (a) Briefly discuss the forms of Data preprocessing with neat diagram.

(b) Briefly discuss the parametric and non-parametric methods of numerosity reduction. [8+8]

 

3. Explain the syntax for the following data mining primitives:

 

(a) Task-relevant data

(b) The kind of knowledge to be mined

(c) Interestingness measures

(d) Presentation and visualization of discovered patterns. [16]

 

4. (a) How can we perform attribute relevance analysis for concept description? Explain.

(b) Briefly explain the presentation of class comparison descriptions. [8+8]

 

5. Compare and contrast the differences between mining single-dimensional Boolean

association rules and multilevel association rules for transactional databases. [16]

 

6. (a) Why is naive Bayesian classification called naive? Briefly outline the major ideas of naive Bayesian classification.

(b) Define regression. Briefly explain linear, non-linear and multiple regression. [8+8]

 

7. (a) Use a diagram to illustrate how, for a constant MinPts value, density-based clusters with respect to a higher density (i.e., a lower value for ε, the neighborhood radius) are completely contained in density-connected sets obtained with respect to a lower density.

(b) Give an example of how specific clustering methods may be integrated, for example, where one clustering algorithm is used as a preprocessing step for another. [8+8]

 

 

8. (a) Explain similarity search in multimedia data.

(b) Explain similarity search in time-series analysis.

(c) What is meant by authoritative Web pages? Explain mining the Web's link structures to identify authoritative Web pages. [5+6+5]

 

 

 

 

 

 

 

 

 

Code No: M0502

Set No. 2

 

IV B.Tech I Semester Regular Examinations, November 2009

DATA WAREHOUSING AND DATA MINING

(Computer Science & Engineering)

Time: 3 hours

Max Marks: 80

Answer any FIVE Questions

All Questions carry equal marks

 

 

1. (a) Explain the efficient computation of data cubes.

 

(b) Discuss the efficient processing of OLAP queries. [8+8]

 

2. Briefly discuss the discretization and concept hierarchy techniques. [16]

 

3. Explain the syntax for the following data mining primitives:

 

(a) Task-relevant data

(b) The kind of knowledge to be mined

(c) Interestingness measures

(d) Presentation and visualization of discovered patterns.[16]

 

4. (a) Differentiate between attribute generalization threshold control and generalized relation threshold control.

(b) Differentiate between predictive and descriptive data mining. [8+8]

 

5. Propose a method for mining hybrid-dimension association rules (multidimensional

association rules with repeating predicates) and explain with an example. [16]

 

6. (a) Why is naive Bayesian classification called naive? Briefly outline the major ideas of naive Bayesian classification.

(b) Define regression. Briefly explain linear, non-linear and multiple regression. [8+8]

 

7. The following table contains the attributes name, gender, trait-1, trait-2, trait-3,

and trait-4, where name is an object-id, gender is a symmetric attribute, and the

remaining trait attributes are asymmetric, describing personal traits of individuals

who desire a penpal. Suppose that a service exists that attempts to find pairs of

compatible penpals.

 

Name      gender  trait-1  trait-2  trait-3  trait-4
Kevan     M       N        P        P        N
Caroline  F       N        P        P        N
Erik      M       P        N        N        P
...       ...     ...      ...      ...      ...

For asymmetric attribute values, let the value P be set to 1 and the value N be set

to 0. Suppose that the distance between objects (potential penpals) is computed

based only on the asymmetric variables.

 

(a) Show the contingency matrix for each pair given Kevan, Caroline, and Erik.

(b) Compute the simple matching coefficient for each pair.

(c) Compute the Jaccard coefficient for each pair.

(d) Who do you suggest would make the best pair of penpals? Which pair of individuals would be the least compatible? [4+4+4+4]
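
The counting behind parts (a) to (c) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, only the four asymmetric trait attributes are used (as the question specifies), and both coefficients are computed here as similarities. Some textbooks define them as distances instead, in which case the value is one minus the figure returned below.

```python
from itertools import combinations

# Asymmetric trait values from the table, with P mapped to 1 and N to 0.
traits = {
    "Kevan":    [0, 1, 1, 0],
    "Caroline": [0, 1, 1, 0],
    "Erik":     [1, 0, 0, 1],
}

def contingency(x, y):
    """Return (q, r, s, t): counts of 1-1, 1-0, 0-1 and 0-0 attribute pairs."""
    q = sum(a == 1 and b == 1 for a, b in zip(x, y))
    r = sum(a == 1 and b == 0 for a, b in zip(x, y))
    s = sum(a == 0 and b == 1 for a, b in zip(x, y))
    t = sum(a == 0 and b == 0 for a, b in zip(x, y))
    return q, r, s, t

def simple_matching(x, y):
    """Similarity form: all matches (positive and negative) over all attributes."""
    q, r, s, t = contingency(x, y)
    return (q + t) / (q + r + s + t)

def jaccard(x, y):
    """Similarity form: negative matches (t) are ignored."""
    q, r, s, t = contingency(x, y)
    # Assumption: treat a pair with no positive value at all as similarity 0.
    return q / (q + r + s) if (q + r + s) else 0.0

if __name__ == "__main__":
    for a, b in combinations(traits, 2):
        print(a, b, contingency(traits[a], traits[b]),
              simple_matching(traits[a], traits[b]),
              jaccard(traits[a], traits[b]))
```

The Jaccard form is the one suited to asymmetric attributes, because two people both lacking a trait says little about their compatibility.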

 

8. (a) What is a multimedia database? Explain mining multimedia databases.

(b) What is a time-series database? What is a sequence database? Explain mining

time-series and sequence data.[8+8]

 

 

 

 

 

 

Code No: M0502

Set No. 3

 

IV B.Tech I Semester Regular Examinations, November 2009

DATA WAREHOUSING AND DATA MINING

(Computer Science & Engineering)

Time: 3 hours

Max Marks: 80

Answer any FIVE Questions

All Questions carry equal marks

 

 

 

1. (a) Explain data mining as a step in the process of knowledge discovery.

(b) Differentiate between operational database systems and data warehousing. [8+8]

 

2. Explain various data reduction techniques. [16]

 

3. The four major types of concept hierarchies are: schema hierarchies, set-grouping

hierarchies, operation-derived hierarchies, and rule-based hierarchies.

 

(a) Briefly define each type of hierarchy.

(b) For each hierarchy type, provide an example.[16]

 

4. (a) Differentiate between attribute generalization threshold control and generalized relation threshold control.

(b) Differentiate between predictive and descriptive data mining. [8+8]

 

5. (a) Explain constraint-based association mining.

(b) Give an example of association rule mining. Classify association rules. [8+8]

 

6. (a) Given a decision tree, you have the option of (i) converting the decision tree

to rules and then pruning the resulting rules, or (ii) pruning the decision tree

and then converting the pruned tree to rules. What advantages does the former

option have over the latter? Explain.

(b) Can any ideas from association rule mining be applied to classification? Explain. [8+8]

 

7. Explain the following:

 

 

(a) DBSCAN

(b) OPTICS

(c) DENCLUE

(d) BIRCH. [4+4+4+4]

 

 

8. (a) What is a spatial data warehouse? What are the different types of dimensions in a spatial data cube? What are the different types of measures in a spatial data cube?

(b) What is keyword-based association analysis? How can automated document classification be performed?

 

(c) Briefly discuss mining the World Wide Web. [2+2+2+2+2+6]

 

 

 

 

 

 

Code No: M0502

Set No. 4

 

IV B.Tech I Semester Regular Examinations, November 2009

DATA WAREHOUSING AND DATA MINING

(Computer Science & Engineering)

Time: 3 hours

Max Marks: 80

Answer any FIVE Questions

All Questions carry equal marks

 

 

 

1. (a) Describe three challenges to data mining regarding data mining methodology

and user interaction issues.

(b) Explain indexing of OLAP data. [8+8]

 

2. Explain various data reduction techniques.[16]

 

3. (a) Discuss the various forms of visualizing the discovered patterns.

(b) Discuss the task-relevant data specification. [8+8]

 

4. Suppose that the data for analysis include the attribute age. The age values for

the data tuples are (in increasing order):

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.

 

 

(a) What is the mean of the data?

(b) What is the median?

(c) What is the mode of the data? Comment on the data's modality.

(d) What is the midrange of the data?

(e) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?

(f) Give the five-number summary of the data.

(g) Show a box plot of the data.

(h) How is the quantile-quantile plot different from a quantile plot? [16]
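
The summary statistics named in parts (a) to (f) can be computed with Python's standard `statistics` module, as in the minimal sketch below. The variable names are illustrative, and the quartile values depend on the interpolation convention: the "inclusive" method is an assumption here, and other conventions give slightly different Q1 and Q3, which is why the question asks for them only roughly.

```python
import statistics

# Age values exactly as listed in the question (27 tuples, sorted).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

mean = statistics.mean(ages)
median = statistics.median(ages)

# multimode returns every value tied for the highest frequency,
# so its length indicates the modality (unimodal, bimodal, ...).
modes = statistics.multimode(ages)

# Midrange: average of the smallest and largest values.
midrange = (min(ages) + max(ages)) / 2

# Quartiles; method="inclusive" is an assumed convention, not the only one.
q1, _, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(ages), q1, median, q3, max(ages))

if __name__ == "__main__":
    print("mean:", round(mean, 2))
    print("median:", median)
    print("modes:", modes)
    print("midrange:", midrange)
    print("five-number summary:", five_number)
```

The five-number summary is also the set of values a box plot draws, so part (g) follows directly from `five_number`.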

 

5. Sequential patterns can be mined in methods similar to the mining of association rules. Design an efficient algorithm to mine multilevel sequential patterns from a transaction database. An example of such a pattern is the following: "A customer who buys a PC will buy Microsoft software within three months," on which one may drill down to find a more refined version of the pattern, such as "A customer who buys a Pentium PC will buy Microsoft Office within three months." [16]

 

6. (a) What is classification? What is prediction? Describe issues regarding classification and prediction.

(b) Explain Bayesian belief networks. How is a Bayesian belief network trained? [8+8]

 

7. (a) Write algorithms for k-Means and k-Medoids. Explain.

 

 

(b) Discuss density-based methods. [8+8]

 

8. (a) Explain the classification and prediction analysis of multimedia data.

(b) What are basic measures for text retrieval? What methods are there for

information retrieval?

(c) What is meant by authoritative Web pages? Explain mining the Web's link structures to identify authoritative Web pages. [4+6+6]

 

 

 

