Saturday, June 23, 2012

DATA MINING & DATA WAREHOUSING 
MCA II Year, II Semester

UNIT - I
1. What is a data warehouse? Explain all the OLAP operations for a multidimensional data model with an
example. (Feb 12)
2. i. Describe the concept of frequent sets, Confidence and support. (Feb 12)
ii. Describe the working of the Pincer Search Algorithm.
3. i. Explain the Data Warehouse Bus Architecture. (Aug 11)
ii. Describe the main activities associated with the various design steps of data warehousing.
4. What is dimensional modelling? How is it different from ER modelling? (Aug 11)
5. i. What is OLAP? (July 11)
ii. What is incremental mining? Discuss.
6. i. Explain data mining related issues and future trends. (July 11)
ii. Discuss the various FIM algorithms.
7. Differentiate between the following:
a. Data warehousing and Data mining b. Classification and Clustering
c. Classification and regression. d. Data mining and statistics.
8. What is incremental pattern discovery? How is it done?
9. What is frequent pattern mining? What is it useful for?
10. What is the relationship between data compression and data mining?
11. What are the difficulties associated with each of the two phases of association rule mining?
12. Prove that the partition algorithm is correct.
13. What is the intuition behind conviction as a measure of interestingness of association rules?
14. Define negative border. How is it used in the sampling algorithm?
UNIT - II
1. i. What is a decision tree? What are the three phases of construction of a decision tree? Describe the importance
of each of the phases. (Feb 12)
ii. What are the advantages and disadvantages of the decision tree approach over the other approaches to data
mining?
2. i. What is clustering? What are the different clustering techniques? (Feb 12)
ii. Explain hierarchical clustering.
3. i. Explain the technical architecture overview of a data warehouse. Illustrate with a figure. (Aug 11)
ii. Describe the three desktop tool architecture approaches.
4. What is metadata? Explain the different types of metadata. Give an example of active metadata. (Aug 11)
5. i. Briefly discuss classification and its applications. (July 11)
ii. Explain the optimal classification algorithm.
6. i. Discuss the measurement of similarity. (July 11)
ii. Write notes on density-based and grid-based methods for clustering.
7. Describe the basic notions of classification, training dataset, test dataset, and accuracy to a non-technical
friend or relative.
8. Why is cross validation useful in evaluating a classifier?
9. In what ways can a classifier handle missing data?
10. Briefly describe the Naïve Bayes and k-NN approaches to classification.
11. Define the problem of clustering. How does it differ from classification?
12. Give two example applications of clustering that you have come across in your day-to-day life.
13. What are precision and recall? How do they differ from accuracy?
14. Provide two example applications of classification.
15. Explain regression.
16. Explain the clustering algorithms.
17. Explain grid-based methods.
18. Explain outlier detection.
19. What is a decision tree?
20. What are the advantages and disadvantages of the decision tree approach over other approaches to data
mining?
UNIT - III
1. i. What are the issues and challenges of data mining? (Feb 12)
ii. What are the 5 external trends affecting data mining?
2. i. Explain the information flow mechanism in a DWH. (Feb 12)
ii. Explain the life cycle of data.
3. Describe the process of developing a physical data model. (Aug 11)
4. i. What is aggregation? How is it used in a DWH environment? (Aug 11)
ii. Discuss various data cleaning methods with examples.
5. i. Write short notes on the applications of data mining. (July 11)
ii. Discuss the demand for strategic information and the life cycle of data.
6. i. Describe the information flow mechanism. (July 11)
ii. Briefly explain metadata, classes of data and data warehousing.
7. List the advantages, disadvantages and issues in data mining.
8. Write a short note on various trends that affect the data mining technology.
9. Explain the term information crisis.
10. Differentiate between operational systems and informational system.
11. Data warehouse is an environment not a product. Comment.
12. Write a short note on benefits of data warehousing?
13. Write a short note on the importance of metadata in a data warehouse.
14. What do you understand by the term data granularity? Give some advantages and disadvantages of
keeping detailed data in the data warehouse.
15. Explain the difference between different types of source data.
16. A data warehouse cuts across several applications in the source systems. Comment.
17. What are the issues and challenges of data mining?
18. Explain the information flow in a DWH.
19. Explain the life cycle of data.
20. Discuss the demand for strategic information and data warehousing.
UNIT - IV
1. i. What are the characteristics of data warehouse architecture? (Feb 12)
ii. Explain data mart issues and the issues in building data marts.
2. Explain the star, snowflake and fact constellation schemas and give their pros and cons. (Feb 12)
3. i. Discuss the FP-Tree growth algorithm for discovering association rules. (Aug 11)
ii. Give a short example to show that items in a strong association rule may actually be negatively correlated.
4. What is clustering? Briefly describe the various clustering-based approaches. Give one example for each.
(Aug 11)
5. i. Explain data mart issues and building data marts. (July 11)
ii. Describe the architecture of a data warehouse.
6. i. Briefly explain the characteristics of a dimension table. (July 11)
ii. Discuss fact tables and cyclicity of data.
7. Explain the architecture of a data warehouse.
8. Compare two-tier and three-tier architecture of a data warehouse.
9. Explain how a data warehouse differs from data mart.
10. Compare top-down approach with bottom-up approach of building data marts. Explain the practical
approach.
11. Explain dimensional modeling.
12. Write a short note on star schema. Explain it with a relevant example. Mention some of its advantages and
shortcomings.
13. i. List the features of a fact table and a dimension table.
ii. Write a short note on the factless fact table. Draw a star schema representing a factless fact table of a patient
visiting a hospital.
14. Differentiate between fully-additive measures, semi-additive measures and non-additive measures with
example.
15. Explain the snowflake schema with its advantages and disadvantages. Also make a comparison between
the star schema and snowflake schema.
16. What are aggregate fact tables? Write a short note on snapshot and transaction tables.
UNIT - V
1. i. What is a Fact Table? What are the characteristics of a Fact Table? (Feb 2012)
ii. Explain the Factless Table with an example. (Feb 12)
2. Explain the ETL process. (Feb 12)
3. i. What are the advantages and disadvantages of Decision Tree Classification? (Aug 11)
ii. Explain best split, splitting indices and splitting criteria for decision tree classification. Illustrate with an
example.
4. i. What are neural networks? Explain unsupervised learning. (Aug 11)
ii. How do you handle spatial and non-spatial data while carrying out any mining task?
5. i. Write short notes on performance enhancement using the DW schema. (July 11)
ii. Describe the ETL process.
6. i. Discuss the multidimensional analysis, functions and applications of OLAP. (July 11)
ii. Explain the tools, products and data design of OLAP in data warehousing.
7. List some activities that are a part of the ETL process. List the steps followed in the ETL process.
8. Write a short note on the different data extraction techniques.
9. Explain the merits of having quality data in your data warehouse. Discuss some of the side effects of
having poor quality data.
10. Write a short note on data quality tools. How will you categorize these tools? Explain.
11. Define the terms: Initial load, incremental load, full refresh and update.
12. List the functions that form a part of data transformation process.
13. Write a short note on multidimensional analysis.
14. Define an OLAP system. Also write down some of its essential characteristics.
15. Write a short note on OLAP tools and products.
16. Explain the issues in OLAP administration.
17. Briefly explain the different models of OLAP. Differentiate between ROLAP and MOLAP.
