Jaypee Institute of Information Technology (JIIT) 2008 B.E Information Technology Test 1 : Data Mining - Question Paper
Jaypee University of Information Technology
Waknaghat
B.Tech., VI Sem (I.T.) Test-1, February 2008
Maximum Time: 1 Hour Maximum Marks: 20
Course code: 07B62CU04 Course Title: Data Mining Course Credit: 4
Note: Attempt ALL questions.
Q.1 [3+3]
(a) List three commonly used statistical measures for the characterization of data dispersion, and discuss how they can be computed efficiently in large databases.
(b) Propose an algorithm, in pseudocode, for the automatic generation of a concept hierarchy for numeric data based on the equiwidth partitioning rule.
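Illustrative sketch for Q.1(b): one possible reading of the equiwidth rule is to start from the full range [min, max] and repeatedly split every interval into a fixed number of equal-width sub-intervals, one hierarchy level per split. The function name `equiwidth_hierarchy` and the parameters `fanout` and `depth` are assumptions made for this example; they do not come from the question paper.

```python
def equiwidth_hierarchy(values, fanout=2, depth=2):
    """Build a concept hierarchy by repeated equal-width splitting.

    Returns a list of levels. Level 0 holds the single interval
    (min, max); each deeper level splits every interval of the level
    above into `fanout` sub-intervals of equal width.
    `fanout` and `depth` are hypothetical tuning parameters.
    """
    low, high = min(values), max(values)
    levels = [[(low, high)]]
    for _ in range(depth):
        next_level = []
        for lo, hi in levels[-1]:
            width = (hi - lo) / fanout
            for i in range(fanout):
                left = lo + i * width
                # Reuse hi for the last bin to avoid floating-point drift.
                right = hi if i == fanout - 1 else lo + (i + 1) * width
                next_level.append((left, right))
        levels.append(next_level)
    return levels


if __name__ == "__main__":
    for level, intervals in enumerate(equiwidth_hierarchy([0, 5, 20, 40])):
        print(level, intervals)
```

Each value can then be mapped to the interval that contains it at any level, with the top level as the most general concept.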
Q.2 [2+2+2]
(a) In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling them.
(b) A popular data warehouse implementation is to construct a multidimensional database, known as a data cube. Unfortunately, this may often generate a huge, yet very sparse multidimensional matrix. Present an example illustrating such a huge and sparse data cube.
(c) Describe how a box plot can give information about whether the value of an attribute is symmetrically distributed.
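Illustrative sketch for Q.2(c): a box plot is drawn from the five-number summary, so symmetry can be judged by comparing the two halves of the box and the two whiskers. The helper names below are hypothetical, the quartiles use the "inclusive" method of Python's `statistics.quantiles`, and the whiskers are assumed to extend to the minimum and maximum (no outlier rule is applied).

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def box_plot_skew(values):
    """Compare the upper and lower sides of the box plot.

    Returns (box_diff, whisker_diff). Values near zero suggest a
    symmetric distribution; positive values mean the upper side is
    longer, which suggests a right (positive) skew.
    """
    low, q1, median, q3, high = five_number_summary(values)
    box_diff = (q3 - median) - (median - q1)
    whisker_diff = (high - q3) - (q1 - low)
    return box_diff, whisker_diff


if __name__ == "__main__":
    print(box_plot_skew([1, 2, 3, 4, 5, 6, 7, 8, 9]))
    print(box_plot_skew([1, 2, 3, 4, 5, 6, 7, 8, 50]))
```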
Q.3 [2+3+3]
Suppose that a data warehouse consists of the three dimensions time, doctor and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit.
(a) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
(b) Draw a schema diagram for the above data warehouse using one of the schema classes listed in (a).
(c) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2007?
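Illustrative sketch for Q.3(c): one common way to reach the requested result is to roll up time from day to year, slice on year = 2007, and roll up patient to "all", leaving charge summed per doctor. The snippet below mimics that sequence on a tiny in-memory cuboid. All cell values, doctor names, and patient IDs are made-up sample data, and `total_fee_per_doctor` is a hypothetical helper, not part of any OLAP product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical base-cuboid cells: (day, doctor, patient) -> charge.
base_cuboid = {
    (date(2007, 1, 5), "Dr. A", "P1"): 100.0,
    (date(2007, 3, 9), "Dr. A", "P2"): 150.0,
    (date(2007, 6, 1), "Dr. B", "P1"): 200.0,
    (date(2006, 12, 31), "Dr. B", "P3"): 999.0,  # outside 2007, sliced away
}


def total_fee_per_doctor(cuboid, year):
    """Sum the charge per doctor for one year.

    Mirrors three OLAP steps: roll up time (day -> year),
    slice on the given year, and roll up patient to 'all'.
    """
    totals = defaultdict(float)
    for (day, doctor, _patient), charge in cuboid.items():
        if day.year == year:          # roll up day -> year, then slice
            totals[doctor] += charge  # aggregate over every patient
    return dict(totals)


if __name__ == "__main__":
    print(total_fee_per_doctor(base_cuboid, 2007))
```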