
The data dictionary provides what?


School: Georgia State University
Department: Science
Course: Introduction to Information Systems
Professor: Jim Senn
Term: Fall 2016
Name: CIS 2010 Chapter5
Description: Chapter 5 notes
Uploaded: 09/23/2016

Chapter 5. Data and Knowledge Management

Managing Data

The difficulties of managing data: 

1) The amount of data increases exponentially with time. Data are scattered throughout organizations and are collected by many individuals using various methods and devices.

2) Data are generated from multiple sources:

∙ Internal (company documents);

∙ Personal (opinions, experiences);

∙ External (government reports)

Data also come from the Web in the form of clickstream data, which visitors and customers produce when they visit a Web site and click on hyperlinks (for example, while making a web purchase).

3) New sources of data (blogs, podcasts) are constantly being developed, and the data these technologies generate must be managed. Data also degrade over time (for example, customers change their names or addresses).

4) Data are subject to data rot, which refers primarily to problems with the media on which the data are stored.

5) Data are easily jeopardized.

6) Organizations have developed information systems for specific business processes (such as transaction processing).

7) Other factors complicate data management:

∙ Federal regulations;

∙ Companies are drowning in data.

Data Governance 

Data governance is an approach to managing information across an entire organization. It involves policies designed to ensure that data are handled in a certain, well-defined fashion.

Master data management – a process that spans all organizational business processes and applications. It provides the ability to store, maintain, and exchange master data.

Master data – a set of core data (such as customer, product, and vendor) that spans the enterprise information systems. Master data are applied to multiple transactions and are used to categorize, aggregate, and evaluate the transaction data.

Transaction data – data generated and captured by operational systems that describe the business's activities, or transactions.

Database Approach 

Traditionally, each application required its own data, which were organized in a data file. Data file – a collection of logically related records. The file contains all of the data records the application requires.

Database systems minimize:

∙ Redundancy: the same data are stored in multiple locations;

∙ Isolation: applications cannot access data associated with other applications;

∙ Inconsistency: various copies of the data don't agree.

Database systems maximize:

∙ Security: databases have extremely high security measures (to minimize and deter attacks) to decrease the risk of losing data;

∙ Integrity: data meet certain constraints (for example, no alphabetic characters in an SSN field);

∙ Independence: applications and data are not linked to each other.
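The integrity constraint mentioned above (no alphabetic characters in an SSN) can be sketched as a small validation check. This is only an illustration, not something from the course material: the function name `is_valid_ssn` and the accepted formats (123-45-6789 or 123456789) are assumptions.

```python
import re

# Assumed formats: nine digits, with optional dashes after the 3rd and 5th digit.
SSN_PATTERN = re.compile(r"[0-9]{3}-?[0-9]{2}-?[0-9]{4}")


def is_valid_ssn(value: str) -> bool:
    """Return True only if value matches the assumed SSN format.

    Any alphabetic character makes the match fail, which is the
    integrity constraint described in the notes.
    """
    return SSN_PATTERN.fullmatch(value) is not None


print(is_valid_ssn("123-45-6789"))  # True
print(is_valid_ssn("12A-45-6789"))  # False
```

In a real database the same rule would normally be declared once in the database itself, so that every application sharing the data is held to it.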

Data Hierarchy 

Bit (binary digit) – the smallest unit of data a computer can process (a 0 or a 1).

Byte (a group of 8 bits) – represents a single character: a letter, number, or symbol.

Field – a logical grouping of characters into a word or a small group of words.

Record – a logical grouping of related fields (for example, the courses taken and the date).

Data file – a logical grouping of related records (a table).

Database – a logical grouping of related files.
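The hierarchy can be pictured with ordinary Python values. The sketch below is only an analogy under assumed names: the student IDs, field names, and dates are made up, and a real DBMS stores data very differently.

```python
# Bit and byte: the character "A" stored as one byte (8 bits).
char = "A"
byte_value = char.encode("ascii")[0]   # the byte's numeric value, 65
bits = format(byte_value, "08b")       # its 8 bits as text

# Field: a logical grouping of characters (each value below).
# Record: a logical grouping of related fields.
record = {"student_id": "S001", "course": "CIS 2010", "date": "2016-09-23"}

# Data file (table): a logical grouping of related records.
data_file = [
    record,
    {"student_id": "S002", "course": "CIS 2010", "date": "2016-09-23"},
]

# Database: a logical grouping of related files.
database = {"enrollments": data_file}

print(bits)                          # 01000001
print(len(database["enrollments"]))  # 2
```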

Database Management System

DBMS – a set of programs that provide users with tools to create and manage a database. A DBMS provides the mechanisms for maintaining the integrity of stored data, managing security and user access, and recovering information if the system fails.

The relational database model – based on the concept of two-dimensional tables. A relational database is usually not one big table (a flat file) that contains all records and attributes; instead, it is designed as a number of related tables.

Data model – a diagram that represents the entities in the database and their relationships. It is the basis for designing an effective database.

Entity – a person, place, thing, or event about which information is maintained.

Instance – an instance of an entity refers to each row in a relational table, which is a unique representation of the entity.

Attribute – each characteristic or quality of a particular entity.

Primary key – every record in the database must contain at least one field that uniquely identifies that record, so that it can be retrieved, updated, and sorted.

Secondary key – another field that has some identifying information but doesn't identify the record with complete accuracy (for example, a student's major).

Foreign key – a field (or group of fields) in one table that uniquely identifies a row of another table. It is used to establish and enforce a link between two tables.
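The three kinds of keys can be seen in a small table definition. The following is a minimal sketch using Python's built-in `sqlite3` module; the table names, column names, and sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- primary key: uniquely identifies a record
        name       TEXT NOT NULL,
        major      TEXT                   -- secondary key: identifying, but not unique
    )
""")
conn.execute("""
    CREATE TABLE parking_permit (
        permit_id  INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student(student_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Ana', 'CIS')")
conn.execute("INSERT INTO student VALUES (2, 'Ben', 'CIS')")
conn.execute("INSERT INTO parking_permit VALUES (100, 1)")


def try_insert(sql: str) -> bool:
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second student with primary key 1 is rejected.
duplicate_key_ok = try_insert("INSERT INTO student VALUES (1, 'Cy', 'Math')")
# A permit pointing at a student that does not exist is rejected.
dangling_fk_ok = try_insert("INSERT INTO parking_permit VALUES (101, 99)")
print(duplicate_key_ok, dangling_fk_ok)  # False False
```

Two students share the major 'CIS', which is why a secondary key cannot identify a record with complete accuracy.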

Big Data 

Big Data – a collection of data so large and complex that it is difficult to manage using traditional database management systems. Big Data is about predictions, which come from applying math to huge quantities of data to infer probabilities.

Defining Big Data

1) The technology research firm Gartner: Big Data – diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.

2) The Big Data Institute: Big Data exhibit variety and include structured, unstructured, and semi-structured data.

By 2015, over 98% of the stored information in the world was digital and less than 2% was nondigital.

∙ Big Data are generated at high velocity with an uncertain pattern.

∙ They don't fit neatly into traditional, structured, relational databases.

Big Data consists of:

1) Traditional enterprise data (Web store transactions);

2) Machine-generated/sensor data (manufacturing data);

3) Social data (customer feedback, comments, social media);

4) Images captured by billions of devices located around the world.

Characteristics of Big Data: 

∙ Volume

∙ Velocity: the rate at which data flow into an organization is rapidly increasing;

∙ Variety: traditional data formats tend to be structured and relatively well described, and they change slowly (financial market data). In contrast, Big Data formats change rapidly.

Issues with Big Data 

∙ Big Data can come from untrusted sources, both internal and external to the organization (e-mail, call center notes);

∙ Big Data is dirty: it can contain inaccurate, incomplete, duplicate, or erroneous data (for example, misspelled words);

∙ Big Data changes, especially in data streams: organizations must be aware that data quality in an analysis can change, or the data themselves can change, because the conditions under which the data are captured can change.

Managing Big Data

Big Data makes it possible to do many things that were previously impossible (for example, helping to prevent disease).

1) First, organizations integrate information silos into a database environment and develop data warehouses for decision making.

2) Next, they turn to the business of information management: making sense of their proliferating data.

Many organizations employ NoSQL ("not only SQL", where SQL stands for Structured Query Language) databases to process Big Data. These databases can manipulate structured as well as unstructured data, and inconsistent or missing data.

Putting Data to Use 

∙ Making Big Data available – giving relevant stakeholders access can help organizations gain value (for example, open data in the public sector). The data can be used to create new businesses and solve complex problems.

∙ Enabling organizations to conduct experiments – for example, offering different “looks” of a Web page.

∙ Microsegmentation of customers – dividing them into groups that share one or more characteristics.

∙ Creating new business models – for example, using sensors to collect data on vehicle usage and improve driving.

∙ Organizations can analyze more data – they don’t have to rely as much on sampling.
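Microsegmentation, as described in the list above, amounts to grouping customers by the characteristics they share. Below is a minimal sketch; the customer records, the field names `age_group` and `region`, and the function name `microsegment` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records; the field names are assumptions.
customers = [
    {"name": "Ana", "age_group": "18-24", "region": "South"},
    {"name": "Ben", "age_group": "25-34", "region": "South"},
    {"name": "Cy", "age_group": "18-24", "region": "South"},
    {"name": "Dee", "age_group": "18-24", "region": "West"},
]


def microsegment(records, keys):
    """Group customer names by the values they share on the given characteristics."""
    segments = defaultdict(list)
    for rec in records:
        segment_id = tuple(rec[k] for k in keys)
        segments[segment_id].append(rec["name"])
    return dict(segments)


segments = microsegment(customers, ["age_group", "region"])
for segment_id, names in segments.items():
    print(segment_id, names)
```

Adding more characteristics to `keys` produces smaller, more specific segments.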

Big Data in the functional areas of the organization

1) Human resources: Big Data recognizes that people bring different skills to the table and that there is no one-size-fits-all person for any job.

2) Product development: Big Data can capture customer preferences and put that information to work in designing new products.

3) Operations: for example, sensors that capture a truck’s speed and location.

4) Marketing: using data to better understand customers and to target marketing efforts more directly.

5) Government operations: for example, recording water levels in rivers to help prevent flooding.

Data warehouses and data marts 

Data warehouse – a repository of historical data that are organized by subject to support decision makers in the organization.

Data mart – a low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or an individual department.

Basic characteristics of data warehouses and data marts:

1. Organized by business dimension or subject (customer, vendor). Business dimension – a data subject, such as product, geographic area, or time period, that represents an edge of the data cube.

2. Use online analytical processing (OLAP). Operational databases, by contrast, use online transaction processing (OLTP), in which business transactions are processed online as soon as they occur; the objectives are speed and efficiency. OLAP, which supports decision makers, involves the analysis of accumulated data by end users.

3. Integrated: data are collected from multiple systems and then integrated around subjects.

4. Time variant: warehouses and marts maintain historical data (time is a variable). They store years of data.

5. Nonvolatile: users cannot change or update the data.

6. Multidimensional structure: a common representation is the data cube.
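A data cube can be pictured as one cell per combination of business dimensions. The sketch below uses a plain Python dictionary to stand in for a cube with three assumed dimensions (product, region, year); the sales figures and the helper name `slice_total` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, year, amount).
sales = [
    ("nuts", "East", 2015, 50),
    ("nuts", "West", 2015, 60),
    ("bolts", "East", 2015, 40),
    ("nuts", "East", 2016, 70),
    ("bolts", "West", 2016, 90),
]

# The cube: one cell per (product, region, year) combination.
cube = defaultdict(int)
for product, region, year, amount in sales:
    cube[(product, region, year)] += amount


def slice_total(product=None, region=None, year=None):
    """Sum the cells that match the given dimension values (None means all)."""
    return sum(
        amount
        for (p, r, y), amount in cube.items()
        if product in (None, p) and region in (None, r) and year in (None, y)
    )


print(slice_total(product="nuts"))            # 180
print(slice_total(region="East", year=2015))  # 90
print(slice_total())                          # 310
```

Fixing one dimension gives a slice of the cube; fixing none gives the grand total.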

A generic DW environment 

1) Source systems that provide data to the warehouse or mart. There is typically some “organizational pain” that motivates a firm to develop its BI (business intelligence) capabilities;

2) Data-integration technology and processes that prepare the data for use: extract the data, transform them, and then load them into a data mart or warehouse (ETL, also called data integration);

3) Different architectures for storing data, such as a central enterprise data warehouse (the data are stored in the warehouse, are accessed by all users, and represent the single version of the truth);

4) Different tools and apps for the variety of users;

5) Metadata, data-quality, and governance processes that ensure that the warehouse or mart meets its purposes. Metadata – data about the data.
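The extract, transform, load (ETL) step in item 2 can be sketched in a few lines. Everything here is an assumption made for illustration: the two source systems, their field names, and the choice to integrate around the customer subject.

```python
# Hypothetical source-system rows with inconsistent formats.
orders_system = [{"cust": " ana ", "total": "19.50"}, {"cust": "BEN", "total": "5"}]
billing_system = [{"customer_name": "Ana", "amount_cents": 1200}]


def extract():
    """Extract: pull the raw rows from each source system."""
    return orders_system, billing_system


def transform(orders, billing):
    """Transform: convert both sources to one format (name, amount in dollars)."""
    rows = [(o["cust"].strip().title(), float(o["total"])) for o in orders]
    rows += [(b["customer_name"].strip().title(), b["amount_cents"] / 100) for b in billing]
    return rows


def load(rows, warehouse):
    """Load: integrate the rows around the customer subject."""
    for name, amount in rows:
        warehouse[name] = warehouse.get(name, 0.0) + amount
    return warehouse


warehouse = load(transform(*extract()), {})
print(warehouse)  # {'Ana': 31.5, 'Ben': 5.0}
```

The transform step is where differences between source systems (spelling, capitalization, units) are reconciled before the data reach the warehouse.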

Limitations of data warehouses:

∙ Can be very expensive to build and maintain;

∙ Incorporating data from obsolete mainframe systems can be difficult and expensive.

∙ People in one department might be reluctant to share data with other departments.

Knowledge Management

Knowledge management – a process that helps organizations manipulate important knowledge that comprises part of the organization’s memory.

Knowledge (intellectual capital) – information that is contextual, relevant, and useful. It can be used to solve a problem.

Explicit knowledge – deals with more objective, rational, and technical knowledge. It consists of policies, procedural guides, and reports. It is the knowledge that has been codified in a form that can be distributed to others or transformed into a process or a strategy.

Tacit knowledge – the cumulative store of subjective or experiential learning. It consists of an organization’s experiences, insights, expertise, and culture. It is imprecise and costly to transfer, highly personal, and difficult to formalize or codify.

Knowledge management systems (KMSs) – refer to the use of modern information technologies to systematize, enhance, and expedite intrafirm and interfirm knowledge management. They help organizations make the most productive use of their knowledge.

Benefit – KMSs make best practices (the most effective and efficient ways of doing things) available to a wide range of employees.

The KMS Cycle: 

1) Create knowledge;

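2) Capture: new knowledge must be identified as valuable and represented in a reasonable way;

3) Refine: new knowledge must be placed in context so that it is actionable;

4) Store: useful knowledge must be stored in a reasonable format;

5) Manage: knowledge must be kept current;

6) Disseminate: knowledge must be made available in a useful format.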

Fundamentals of Relational database operations 

Query languages 

SQL (Structured Query Language) – the most popular query language used for interacting with a database. It allows users to perform complicated searches by using relatively simple statements or key words:

 SELECT- to choose desired attribute;

 FROM- to specify the table to be used;

 WHERE- to specify conditions to apply in the query
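The three key words above combine into a single query. Here is a minimal sketch using Python's built-in `sqlite3` module; the `student` table, its columns, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, major TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ana", "CIS", 3.8), ("Ben", "Finance", 3.1), ("Cy", "CIS", 2.9)],
)

rows = conn.execute(
    """
    SELECT name, gpa        -- choose the desired attributes
    FROM student            -- specify the table to be used
    WHERE major = 'CIS'     -- specify the conditions to apply
      AND gpa >= 3.0
    """
).fetchall()
print(rows)  # [('Ana', 3.8)]
```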

QBE (query by example) – the user fills out a grid or template (form) to construct a sample or a description of the data desired.

Entity- Relationship Modeling (ER) 

ER modeling – an ER diagram consists of entities, attributes, and relationships; business rules are used to identify them properly. ER modeling allows designers to communicate with users throughout the organization to ensure that all entities and the relationships among them are represented.

Business rules – precise descriptions of policies, procedures, or principles in any organization that stores and uses data to generate information.

The data dictionary – provides information on each attribute, such as its name; whether it is a key, part of a key, or a non-key attribute; the type of data expected; and valid values.
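A data dictionary can be represented as a lookup from attribute name to its properties. The sketch below is one hypothetical way to do that in Python; the STUDENT attributes, the valid-value rules, and the `validate` helper are assumptions, not a standard format.

```python
# Hypothetical data dictionary for a STUDENT entity: for each attribute,
# its key role, the type of data expected, and a rule for valid values.
data_dictionary = {
    "student_id": {"key": "primary", "type": int, "valid": lambda v: v > 0},
    "name": {"key": "non-key", "type": str, "valid": lambda v: len(v) > 0},
    "year": {"key": "non-key", "type": int, "valid": lambda v: 1 <= v <= 4},
}


def validate(record):
    """Return the list of attributes that violate the data dictionary."""
    errors = []
    for attribute, rule in data_dictionary.items():
        value = record.get(attribute)
        if not isinstance(value, rule["type"]) or not rule["valid"](value):
            errors.append(attribute)
    return errors


print(validate({"student_id": 7, "name": "Ana", "year": 2}))  # []
print(validate({"student_id": 7, "name": "Ana", "year": 9}))  # ['year']
```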

Relationships illustrate an association between entities.

Degree of a relationship – the number of entities associated with a relationship.

A unary relationship – an association is maintained within a single entity.

A binary relationship – two entities are associated.

A ternary relationship – three entities are associated.

Connectivity – the relationship classification.

Cardinality – the maximum number of times an instance of one entity can be associated with an instance in the related entity.

∙ Connectivity and cardinality are established by the business rules of a relationship.

Cardinality symbols: 

∙ Mandatory single

∙ Optional single  

∙ Mandatory many

∙ Optional many

Entities have attributes, or properties, that describe the entity’s  characteristics.  

Three types of binary relationships: 

One-to-one (1:1) – a single-entity instance of one type is related to a single-entity instance of another type (for example, student–parking permit).

One-to-many (1:M) – represented by the class–professor relationship.

Many-to-many (M:M) – represented by the student–class relationship. Most database management systems do not support many-to-many relationships directly; therefore, a junction (bridge) table is used so that there are two one-to-many relationships.
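The junction-table idea can be sketched for the student–class relationship. This example uses Python's built-in `sqlite3` module; the table name `enrollment`, the column names, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class   (class_id   INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: each row links one student to one class.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        class_id   INTEGER REFERENCES class(class_id),
        PRIMARY KEY (student_id, class_id)
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO class   VALUES (10, 'CIS 2010'), (20, 'Accounting');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many classes ...
classes_for_ana = [r[0] for r in conn.execute(
    "SELECT class_id FROM enrollment WHERE student_id = 1 ORDER BY class_id")]
# ... and one class, many students.
students_in_cis = [r[0] for r in conn.execute(
    "SELECT student_id FROM enrollment WHERE class_id = 10 ORDER BY student_id")]
print(classes_for_ana)  # [10, 20]
print(students_in_cis)  # [1, 2]
```

The `enrollment` table sits on the "many" side of two one-to-many relationships: student to enrollment, and class to enrollment.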

Normalization and Joins 

Normalization- a method for analyzing and reducing a relational database to its most streamlined form to ensure minimum redundancy, maximum data integrity, and optimal processing performance.

Functional dependencies – a means of expressing that the value of one particular attribute is associated with a specific single value of another attribute.
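A functional dependency can be tested against sample rows: it holds if every value of the first attribute always appears with the same value of the second. Below is a minimal sketch; the helper name `holds` and the rows are hypothetical.

```python
def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows.
rows = [
    {"student_id": 1, "name": "Ana", "major": "CIS"},
    {"student_id": 2, "name": "Ben", "major": "CIS"},
    {"student_id": 1, "name": "Ana", "major": "CIS"},
]
print(holds(rows, "student_id", "name"))  # True
print(holds(rows, "major", "name"))       # False
```

Sample rows can only disprove a dependency. Whether it truly holds is decided by the business rules, not by the data on hand.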

A join operation combines records from two or more tables in a database to obtain information that is located in different tables.
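A join can be sketched with two small tables linked by a foreign key. The example uses Python's built-in `sqlite3` module; the `professor` and `class` tables and their rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class (
        class_id     INTEGER PRIMARY KEY,
        title        TEXT,
        professor_id INTEGER REFERENCES professor(professor_id)
    );
    INSERT INTO professor VALUES (1, 'Prof. A'), (2, 'Prof. B');
    INSERT INTO class VALUES (10, 'Databases', 1), (20, 'Networks', 1), (30, 'Ethics', 2);
""")

# Class titles live in one table and professor names in another;
# the join matches rows on the shared professor_id.
rows = conn.execute("""
    SELECT class.title, professor.name
    FROM class
    JOIN professor ON class.professor_id = professor.professor_id
    ORDER BY class.class_id
""").fetchall()
print(rows)
# [('Databases', 'Prof. A'), ('Networks', 'Prof. A'), ('Ethics', 'Prof. B')]
```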
