Sunday 29 May 2011

Week Seven B.I.T

Week Seven - Databases and Data Warehouses




1. List, describe, and provide an example of each of the five characteristics of high quality information.

High quality information is essential to any decision making process within an organization. The qualities of value information include:

Completeness: if data is incomplete then it's useless since it wouldn't represent an accurate whole image of reality.

Consistency: crutial part of the database approach as same data will be shared through the different departments.

Accuracy: preventing data entry errors with strictly entering accurate data.

Uniqueness: data has to be unique and pass the data integrity test.
Timeliness: data has to be updated to consistently be reliable

2. Define the relationship between a database and a database management system.

A special data processing system, or part of a data processing system, which aids in the storage, manipulation, reporting, management, and control of data. Abbreviated DBMS

System for quick search and retrieval of information from a database. The DBMS determines how data are stored and retrieved. It must address problems such as security, accuracy, consistency among different records, response time, and memory requirements. These issues are most significant for database systems on computer networks. Ever-higher processing speeds are required for efficient database management. Relational DBMSs, in which data are organized into a series of tables ("relations") that are easily reorganized for accessing data in different ways, are the most widely used today.

A collection of interrelated data together with a set of programs to access the data, also called database system, or simply database. The primary goal of such a system is to provide an environment that is both convenient and efficient to use in retrieving and storing information.
A database management system (DBMS) is designed to manage a large body of information. Data management involves both defining structures for storing information and providing mechanisms for manipulating the information. In addition, the database system must provide for the safety of the stored information, despite system crashes or attempts at unauthorized access. If data are to be shared among several users, the system must avoid possible anomalous results due to multiple users concurrently accessing the same data.
Examples of the use of database systems include airline reservation systems, company payroll and employee information systems, banking systems, credit card processing systems, and sales and order tracking systems.

3. Describe the advantages an organisation can gain by using a database
A "Business Intelligence" database is specifically designed to enhance management's ability to understand their business operations and make good business decisions. It is imperative to have an in depth knowledge about factors such as your customers, competitors, business partners, economic environment, and internal operations to make effective and good quality business decisions. 

so a business must : 

* Avoid Errors
* Visualize data

* Manipulate data


1. Avoid Errors


There's only a single storage location for any piece of information, so:-

- data updates are simple with no need to change same info in 15 different files

- info is always up-to-date - there's no chance of old data remaining in some forgotten files

Error check data as they're entered

data are validated on entry, to filter impossible values
data can be cross-checked against existing information (no more pregnant males!) 

2. visualize data:

Reveal new feature of data:



customized database can reveal new areas for investigation by presenting large quantities of data in an intuitive form.
Often patterns only become apparent by 'flicking through' the data. With a customised database this is possible in new ways: data from many different sources can be brought together and displayed quickly and easily.


3. Manipulation of data

combine different datasets easily and efficiently

With a database you can combine different data easily. No more error-prone cut-and-paste to force the data into a different format. Do away with all the intermediate Excel spreadsheets that are often generated when manipulating data. (And which you often daren't delete just in case they contain important information.)

Data manipulation is performed using Queries. These use Structured Query Language, SQL, to combine, update and manage data, but often the user need not know any SQL to perform complex tasks. In Microsoft Access, for example, the visual Query Builder provides an intuitive interface that makes most queries simple.
The following video is of effective data base use within a successful business.

4. Define the fundamental concepts of the relational database model.
The relational paradigm introduced the concept of “related by [logical] key” as opposed to “related by physical record id”, and carried over the heavy reliance on indexes for searching for the keys. It was one major step forward in databases, with great advantages, but much slower than the preceding alternatives. Therefore, it is not the data being related (they were before the relational paradigm) that identifies a relational database, it is whether the data is related by logical key. It was really the difference between physical and logical, and developers no longer need to concern themselves with the physical key (record id). The record or row id still exists, but it is entirely hidden.

By the definitions set out by those great minds, for a “database” (normalised or not) to be relational:
1 the rows in a Parent tables must have one column which is the logical Primary Key (unique, not null)
2 the rows in a Child tables must have a Foreign Key to the Parent table, which is the PK in the Parent, which forms the logical relation between the Parent (rows, table) and the Child (rows, table).
This allows the tables to be rebuilt (without changing the logical keys), without affecting the integrity of the relations.

From your descriptions, the contents of your Progress system, by definition, as per the title of this thread and your initial post, is neither a database nor relational.

There are further levels of defining relational and databases, for which there are standards (eg. IDEF1X, which is used by the US Military and DoD), which some may say require compliance to, before the object can be defined as “relational” or “database”. I am limiting myself (1) & (2) to the absolute minimum standards set out by Codd & Date, which is now (twenty plus years on) well established.


5. Describe the benefits of a data-driven website.

A data-driven website is a site that can easily and quickly be updated by its managers to display requested information to the website user in the most effective way.  Conversely, a static website offers its users information that is rarely updated, but a data-driven website will constantly be updated with more recent and accurate information
The benefits of a data-driven website are numerous.  The first  benefit is that changing the content of the website can be done without knowledge or expertise. Managing the website can be done with little training.  The website administrator does not need to know HTML or programming in order to make any changes and updating a data-driven websites only takes a couple of clicks all in a few seconds.  The second benefit is the level of speed when the website manager makes changes.  When hosting a data-driven website, changing the content is done almost in real-time.  Thirdly, data-driven websites inherently have a great deal of scalability.  To this end, expanding a website is very simple which leaves plenty of room for growth

6. Describe the roles and purposes of data warehouses and data marts in an organisation


Enterprise Data Warehouse: The enterprise data warehouse is typically a large organization-wide database repository that crosses over every business function and includes data from every organizational unit, division, and department.  In essence, an enterprise data warehouse is a substantially large repository of historical and current transaction data of an entire organization. 
Data Mart: A data mart is a collection of subject areas organized for decision support based on the needs of a given department or office.  Many times, a data mart will serve as the reporting and analytical solution for a particular department within an organization, such as accounting, sales, customer service, and/or marketing.  
There are two kinds of data marts that exist — dependent and independent: 
• A dependent data mart is one whose source is another data warehouse, and all dependent data marts within an organization are typically fed by the same source — the enterprise data warehouse.
• An independent data mart is one whose source is directly from transactional systems, legacy applications, or external data feeds