DATA MINING & DATA WAREHOUSING
MCA II Year, II Semester
UNIT - I
1. What is a data warehouse? Explain all the OLAP operations
for a multidimensional data model with an
example. (Feb 12)
2. i. Describe the concept of frequent sets, Confidence and
support. (Feb 12)
ii. Describe the working of the Pincer Search Algorithm.
3. i. Explain the Data Warehouse Bus Architecture. (Aug 11)
ii. Describe the main activities associated with the various
design steps of data warehousing.
4. What is Dimensional Modelling? How is it different from ER
Modelling? (Aug 11)
5. i. What is OLAP? (July 11)
ii. What is Incremental Mining? Discuss.
6. i. Explain data mining related issues and future trends.
(July 11)
ii. Discuss the various FIM algorithms.
7. Differentiate between the following:
a. Data warehousing and data mining
b. Classification and clustering
c. Classification and regression
d. Data mining and statistics
8. What is incremental pattern discovery? How is it done?
9. What is frequent pattern mining? What is it useful for?
10. What is the relationship between data compression and
data mining?
11. What are the difficulties associated with each of the
two phases of association rule mining?
12. Prove that the partition algorithm is correct.
13. What is the intuition behind conviction as a measure of
interestingness of association rules?
14. Define negative border. How is it used in the sampling
algorithm?
UNIT - II
1. i. What is Decision Tree? What are the three phases of
construction of a decision tree? Describe the importance
of each of the phases. (Feb 12)
ii. What are the advantages and disadvantages of decision
tree approach over the other approaches of data
mining?
2. i. What is clustering? What are the different Clustering
Techniques? (Feb 12)
ii. Explain Hierarchical Clustering?
3. i. Explain the technical architecture overview of a data
warehouse. Illustrate with a figure. (Aug 11)
ii. Describe the three desktop tool architecture approaches.
4. What is metadata? Explain the different types of metadata.
Give an example of active metadata. (Aug 11)
5. i. Briefly discuss classification and its
applications. (July 11)
ii. Explain the optimal classification algorithm.
6. i. Discuss the measurement of similarity. (July 11)
ii. Write notes on density-based and grid-based methods for clustering.
7. Describe the basic notions of classification, training
dataset, test dataset, and accuracy to a non-technical
friend or relative.
8. Why is cross validation useful in evaluating a
classifier?
9. In what ways can a classifier handle missing data?
10. Briefly describe the Naïve Bayes and k-NN approaches to
classification.
11. Define the problem of clustering. How does it differ
from classification?
12. Give two example applications of clustering that you
have come across in your day-to-day life.
13. What are precision and recall? How do they differ from
accuracy?
14. Provide two example applications of classification.
15. Explain regression.
16. Explain the clustering algorithms.
17. Explain grid-based methods.
18. Explain outlier detection.
19. What is a Decision Tree?
20. What are the advantages and disadvantages of the decision
tree approach over other approaches of data
mining?
UNIT - III
1. i. What are the issues and challenges of data mining?
(Feb 12)
ii. What are the 5 external trends affecting data mining?
2. i. Explain the information flow mechanism in a DWH. (Feb 12)
ii. Explain the life cycle of data.
3. Describe the process of developing a physical data model.
(Aug 11)
4. i. What is aggregation? How is it used in a DWH
environment? (Aug 11)
ii. Discuss various data cleaning methods with example.
5. i. Write short notes on application of data mining. (July
11)
ii. Discuss the demand for strategic information and the life
cycle of data.
6. i. Describe the information flow mechanism. (July 11)
ii. Briefly explain metadata, classes of data and
data warehousing.
7. List the advantages, disadvantages and issues in data
mining.
8. Write a short note on various trends that affect the data
mining technology.
9. Explain the term information crisis.
10. Differentiate between operational systems and
informational systems.
11. A data warehouse is an environment, not a product. Comment.
12. Write a short note on benefits of data warehousing?
13. Write a short note on the importance of metadata in a
data warehouse.
14. What do you understand by the term data granularity?
Give some advantages and disadvantages of
keeping detailed data in the data warehouse.
15. Explain the difference between different types of source
data.
16. A data warehouse cuts across several applications in the
source systems. Comment.
17. What are the issues and challenges of data mining?
18. Explain the information flow in a DWH.
19. Explain the life cycle of data.
20. Discuss the demand for strategic information and data
warehousing.
UNIT - IV
1. i. What are the characteristics of data warehouse
architecture? (Feb 12)
ii. Explain data mart issues and the issues in building data
marts.
2. Explain the star, snowflake and fact constellation schemas and give
their pros and cons. (Feb 12)
3. i. Discuss FP-Tree growth algorithm for discovering
association rules? (Aug 11)
ii. Give a short example to show that items in a strong
association rule may actually be negatively correlated.
4. What is clustering? Briefly describe the various clustering
approaches. Give one example of each.
(Aug 11)
5. i. Explain data mart issues and the building of data
marts. (July 11)
ii. Describe the architecture of a Data Warehouse.
6. i. Briefly explain the characteristics of a dimension table.
(July 11)
ii. Discuss fact tables and cyclicity of data.
7. Explain the architecture of a data warehouse.
8. Compare two-tier and three-tier architecture of a data
warehouse.
9. Explain how a data warehouse differs from data mart.
10. Compare top-down approach with bottom-up approach of
building data marts. Explain the practical
approach.
11. Explain dimensional modeling.
12. Write a short note on star schema. Explain it with a
relevant example. Mention some of its advantages and
shortcomings.
13. i. List the features of a fact table and a dimension
table.
ii. Write a short note on the factless fact table. Draw a star
schema representing a factless fact table of a patient
visiting a hospital.
14. Differentiate between fully-additive measures,
semi-additive measures and non-additive measures with
example.
15. Explain the snowflake schema with its advantages and
disadvantages. Also make a comparison between
the star schema and snowflake schema.
16. What are aggregate fact tables? Write a short note on
snapshot and transaction tables.
UNIT - V
1. i. What is a Fact Table? What are the characteristics of
a Fact Table? (Feb 2012)
ii. Explain the Factless Table with an example. (Feb 12)
2. Explain the ETL process. (Feb 12)
3. i. What are the advantages and disadvantages of Decision
Tree Classification? (Aug 11)
ii. Explain best split, splitting indices and splitting
criteria for decision tree classification. Illustrate with an
example.
4. i. What are neural networks? Explain unsupervised
learning. (Aug 11)
ii. How do you handle spatial and non-spatial data, while
carrying out any mining task?
5. i. Write short notes on performance enhancement using
DW schema. (July 11)
ii. Describe the ETL Process.
6. i. Discuss the multidimensional analysis, functions and
applications of OLAP. (July 11)
ii. Explain the tools, products and data design of OLAP in
data warehousing.
7. List some activities that are a part of the ETL process.
List the steps followed in the ETL process.
8. Write a short note on the different data extraction
techniques.
9. Explain the merits of having quality data in your data
warehouse. Discuss some of the side effects of
having poor quality data.
10. Write a short note on data quality tools. How will you
categorize these tools? Explain.
11. Define the terms: Initial load, incremental load, full
refresh and update.
12. List the functions that form a part of data
transformation process.
13. Write a short note on multidimensional analysis.
14. Define an OLAP system. Also write down some of its
essential characteristics.
15. Write a short note on OLAP tools and products.
16. Explain the issues in OLAP administration.
17. Briefly explain the different models of OLAP.
Differentiate between ROLAP and MOLAP.