Data Warehousing:
Data warehousing is the process of combining data from multiple, usually varied, sources into one comprehensive and easily manipulated database. Common ways of accessing a data warehouse include queries, analysis, and reporting. Because data warehousing produces a single database in the end, the number of sources is limited only by the volume the system can handle. The final result is homogeneous data, which is easier to manipulate.
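As a rough illustration of turning varied sources into homogeneous data, the sketch below maps two source formats onto one shared record layout. The source names (a CRM export and a billing export), the field names, and the sample values are all hypothetical and chosen only for the example.

```python
# Hypothetical exports from two source systems with different field names and types.
crm_rows = [{"CustomerID": "17", "FullName": " Ada Lovelace "}]
billing_rows = [
    {"cust": 17, "amount_usd": "120.50"},
    {"cust": 42, "amount_usd": "9.99"},
]


def normalize_crm(row):
    """Map a CRM record onto the shared warehouse layout."""
    return {
        "customer_id": int(row["CustomerID"]),
        "name": row["FullName"].strip(),
        "amount": None,
        "source": "crm",
    }


def normalize_billing(row):
    """Map a billing record onto the shared warehouse layout."""
    return {
        "customer_id": int(row["cust"]),
        "name": None,
        "amount": float(row["amount_usd"]),
        "source": "billing",
    }


# Every row now has the same keys and value types, whatever its origin.
warehouse_rows = [normalize_crm(r) for r in crm_rows] + [
    normalize_billing(r) for r in billing_rows
]
```

Once every record shares one layout, queries and reports no longer need to know which system a row came from.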
A data warehouse has two primary components: databases and hardware. A data warehouse contains multiple databases and data tables used to store information. These tables are related to each other through common information, or keys. The size of a data warehouse is limited by the storage capacity of the hardware.
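To make the idea of tables related through keys concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are assumptions made for the example; a real warehouse would normally run on a dedicated database server.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO sales VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# The shared customer_id key relates each sale to its customer.
totals = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Here `customer_id` is the common information that links the two tables, so a report on sales per customer needs only one join.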
The hardware required for a data warehouse includes a server, hard drives, and processors. In most organizations, the data warehouse is accessible over the shared network or intranet. A data architect is usually responsible for setting up the database structure and managing the process of updating data from the original sources.
A data warehouse is a repository of an organization's electronically stored data. It is designed to facilitate reporting and analysis.
- The purpose of a data warehouse is to store data consistently across the organization and to make the organization's information accessible.
- It is an adaptive and resilient source of information. When new data is added to the data warehouse, the existing data and technologies are not disrupted. The design of the separate data marts that make up the data warehouse must be distributed and incremental. Anything else is a compromise.
- The data warehouse not only controls the access to the data, but gives its owners great visibility into the uses and abuses of the data, even after it has left the data warehouse.
- The data warehouse is the foundation for decision-making.
Difference Between Data Warehousing And Business Intelligence:
Data warehousing and business intelligence are two terms that are a common source of confusion, both inside and outside of the information technology (IT) industry.
Data warehousing refers to the technology used to create a repository of data. Business intelligence refers to the tools and applications used to analyze and interpret that data.
Both data warehousing and business intelligence have grown substantially and are forecast to keep growing.
Different Types of Data Warehouse Design:
There are two main types of data warehouse design: top-down and bottom-up. The two designs have their own advantages and disadvantages.
A bottom-up design is easier and cheaper to implement, but it is less complete, and the correlations between data are more sporadic.
In a top-down design, connections between data are obvious and well-established, but the data may be out of date, and the system is costly to implement.
Data marts are the central element of data warehouse design. A data mart is a collection of data based around a single concept. Each data mart is a unique and complete subset of the data. Each of these collections is fully correlated internally and often has connections to other data marts.
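One simple way to picture a data mart as a subset built around a single concept is a database view over a larger table. The sketch below uses Python's built-in sqlite3 module; the `facts` table, the `sales_mart` view, and the sample rows are hypothetical, and real data marts are often separate physical stores rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One wide table holding data about several subjects (illustrative only).
CREATE TABLE facts (subject TEXT, item TEXT, value REAL);
INSERT INTO facts VALUES
    ('sales', 'q1_revenue', 1200.0),
    ('sales', 'q2_revenue', 1350.0),
    ('hr',    'headcount',  48.0);

-- The data mart: only the rows that belong to the single concept 'sales'.
CREATE VIEW sales_mart AS
    SELECT item, value FROM facts WHERE subject = 'sales';
""")

sales_rows = conn.execute(
    "SELECT item, value FROM sales_mart ORDER BY item"
).fetchall()
```

Users of `sales_mart` see a complete picture of one subject without having to know about the rest of the warehouse.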
The way data marts are handled is the main difference between the two styles of data warehouse design. In the top-down design, data marts occur naturally as data is put into the system. In the bottom-up design, data marts are made directly and connected together to form the warehouse. While this may seem like a minor difference, it makes for a very different design.
The top-down method was the original data warehouse design. Using this method, all of the information the organization holds is put into the system. Each broad subject has its own general area within the databases. As the data is used, connections appear between correlated data points, and data marts emerge. In addition, any data in the system stays there permanently: even if it is superseded or trivialized by later information, it remains in the system as a record of past events.
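The rule that superseded data stays in the system can be sketched as an append-only history, where a new value is added as a new version instead of overwriting the old one. The `PriceRecord` type, the helper functions, and the sample prices below are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PriceRecord:
    product: str
    price: float
    valid_from: date


# Append-only store: records are added but never updated or deleted.
history = []


def record_price(product, price, valid_from):
    """Add a new version; earlier versions remain as a record of the past."""
    history.append(PriceRecord(product, price, valid_from))


def current_price(product):
    """Return the price from the most recent version of the product."""
    versions = [r for r in history if r.product == product]
    return max(versions, key=lambda r: r.valid_from).price


record_price("widget", 9.99, date(2020, 1, 1))
record_price("widget", 12.49, date(2023, 6, 1))  # supersedes, does not replace
```

After both calls, `current_price("widget")` reflects the later record, while the 2020 record is still available for historical reporting.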
The bottom-up method of data warehouse design works from the opposite direction. A company puts in information as a standalone data mart. As time goes on, other data sets are added to the system, either as their own data mart or as part of one that already exists. When two data marts are considered connected enough, they merge together into a single unit.
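The merging step can be sketched as follows. The text does not say how "connected enough" is judged, so this example assumes one possible measure: the share of record keys two marts have in common, compared against an arbitrary threshold. The function names, the threshold, and the sample marts are all hypothetical.

```python
def overlap(mart_a, mart_b):
    """Share of keys the two marts have in common (Jaccard index)."""
    keys_a, keys_b = set(mart_a), set(mart_b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def merge_if_connected(mart_a, mart_b, threshold=0.5):
    """Return one merged mart if the overlap reaches the threshold, else None."""
    if overlap(mart_a, mart_b) < threshold:
        return None
    merged = {}
    for key in set(mart_a) | set(mart_b):
        # Combine the attributes each mart holds for the same key.
        merged[key] = {**mart_a.get(key, {}), **mart_b.get(key, {})}
    return merged


# Two standalone marts keyed by a customer number (illustrative data).
sales_mart = {101: {"revenue": 500.0}, 102: {"revenue": 120.0}, 103: {"revenue": 80.0}}
support_mart = {101: {"tickets": 3}, 102: {"tickets": 0}, 104: {"tickets": 7}}

combined = merge_if_connected(sales_mart, support_mart)
```

In this sample the two marts share two of four distinct keys, which meets the assumed threshold of 0.5, so they merge into a single unit; marts with no keys in common would stay separate.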
The two data warehouse designs each have their own strong and weak points. The top-down method is a huge project even for smaller data sets, which makes it the more expensive of the two in terms of money and manpower. If the data warehouse is finished and maintained, however, it is a vast collection containing everything that the company knows.
The bottom-up process is much faster and cheaper, but since the data is entered only as needed, the database will never actually be complete. In addition, correlations between data marts are only as strong as their usage makes them. If a strong correlation exists but no user notices it, the data stays unconnected.
Data Warehouse Architecture: