CIS 2010 Chapter5
This 8 page Class Notes was uploaded by Daria Trikolenko on Thursday September 22, 2016. The Class Notes belongs to CIS 2010 at Georgia State University taught by Jim Senn in Fall 2016. Since its upload, it has received 92 views.
Date Created: 09/22/16
Chapter 5. Data and Knowledge Management

Managing Data

The difficulties of managing data:
1) The amount of data increases exponentially with time. Data are scattered throughout organizations and are collected by many individuals using various methods and devices.
2) Data are generated from multiple sources: internal (company documents), personal (opinions, experiences), and external (government reports). Data also come from the Web as clickstream data, which visitors and customers produce when they visit a website and click on hyperlinks (for example, during a Web purchase).
3) New sources of data (blogs, podcasts) are constantly being developed, and the data these technologies generate must be managed. Data also degrade over time (customers change names or addresses).
4) Data are subject to data rot, which refers primarily to problems with the media on which the data are stored.
5) Data are easily jeopardized.
6) Organizations have developed information systems for specific business processes (such as transaction processing).
7) Federal regulations complicate data management further, and companies are drowning in data.

Data Governance

Data governance: an approach to managing information across an entire organization. It consists of policies designed to ensure that data are handled in a certain, well-defined fashion.
Master data management: a process that spans all of an organization's business processes and applications. It gives the organization the ability to store, maintain, and exchange its master data.
Master data: a set of core data (customer, product, vendor) that spans the enterprise information systems. Master data are applied to multiple transactions and are used to categorize, aggregate, and evaluate the transaction data.
Transaction data: data generated and captured by operational systems that describe the business's activities, or transactions.

Database Approach

In the older file-based approach, each application required its own data, which were organized in a data file.
Data file: a collection of logically related records. The file contains all of the data records the application requires.
Database systems minimize:
Redundancy: the same data stored in multiple locations.
Isolation: applications cannot access data associated with other applications.
Inconsistency: various copies of the data do not agree.

Database systems maximize:
Security: databases have extremely high security measures (to minimize and deter attacks) that decrease the risk of losing data.
Integrity: data meet certain constraints (for example, no alphabetic characters in a Social Security number).
Independence: applications and data are not linked to each other.

Data Hierarchy

Bit (binary digit): the smallest unit of data a computer can process; a bit is either a 0 or a 1.
Byte: a group of 8 bits that represents a single character (a letter, number, or symbol).
Field: a logical grouping of characters into a word or a small group of words.
Record: a logical grouping of related fields (the courses taken, the date).
Data file (table): a logical grouping of related records.
Database: a logical grouping of related files.

Database Management Systems

Database management system (DBMS): a set of programs that provide users with tools to create and manage a database. A DBMS also provides the mechanisms for maintaining the integrity of stored data, managing security and user access, and recovering information if the system fails.
The relational database model: based on the concept of two-dimensional tables. A relational database is usually not one big table (a flat file) that contains all of the records and attributes; it is a set of related tables.
Data model: a diagram that represents the entities in the database and their relationships. Designing an effective database starts with the data model.
Entity: a person, place, thing, or event about which information is maintained.
Instance: each row in a relational table, which is a unique representation of the entity.
Attribute: each characteristic or quality of a particular entity.
Primary key: every record in the database must contain at least one field that uniquely identifies that record, so that it can be retrieved, updated, and sorted.
Secondary key: another field that has some identifying information but does not identify the record with complete accuracy (for example, a student's major).
Foreign key: a field (or group of fields) in one table that uniquely identifies a row of another table. It is used to establish and enforce a link between two tables.

Big Data

Big Data: a collection of data so large and complex that it is difficult to manage using traditional database management systems. Big Data is about predictions, which come from applying math to huge quantities of data to infer probabilities.

Defining Big Data

1) The technology research firm Gartner: Big Data are diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.
2) The Big Data Institute: Big Data exhibit variety; include structured, unstructured, and semi-structured data; are generated at high velocity with an uncertain pattern; and do not fit neatly into traditional, structured, relational databases.
By 2015, more than 98 percent of the world's stored information was digital and less than 2 percent was non-digital.

Big Data consists of:
1) Traditional enterprise data (Web store transactions);
2) Machine-generated or sensor data (manufacturing data);
3) Social data (customer feedback, comments, social media);
4) Images captured by billions of devices located around the world.

Characteristics of Big Data:
Volume: the sheer amount of data.
Velocity: the rate at which data flow into an organization is rapidly increasing.
Variety: traditional data formats tend to be structured and relatively well described, and they change slowly (financial market data). Big Data formats, in contrast, are varied and change rapidly.

Issues with Big Data

Big Data can come from untrusted sources, both internal and external to the organization (e-mail, call center notes).
Big Data is dirty: it contains inaccurate, incomplete, duplicate, or erroneous data (misspelled words).
Big Data changes, especially in data streams: organizations must be aware that data quality in an analysis can change, or the data themselves can change, because the conditions under which the data are captured can change.
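The "dirty data" issue can be made concrete with a small check. The sketch below is only an illustration: the records, the field names, and the rule that a valid e-mail must contain "@" are all assumptions, not part of the course material. It flags rows that are duplicate, incomplete, or erroneous.

```python
# Hypothetical customer records; every value is made up for illustration.
records = [
    {"id": 1, "name": "Ana Lopez", "email": "ana@example.com"},
    {"id": 2, "name": "Ana Lopez", "email": "ana@example.com"},      # duplicate of id 1
    {"id": 3, "name": "Ben Chu", "email": ""},                       # incomplete
    {"id": 4, "name": "Cara Diaz", "email": "cara(at)example.com"},  # erroneous
]


def find_dirty(rows):
    """Return the ids of incomplete, erroneous, and duplicate rows."""
    seen = set()
    issues = {"duplicate": [], "incomplete": [], "erroneous": []}
    for row in rows:
        key = (row["name"].lower(), row["email"].lower())
        if not row["name"] or not row["email"]:
            issues["incomplete"].append(row["id"])   # a required field is empty
        elif "@" not in row["email"]:
            issues["erroneous"].append(row["id"])    # assumed validity rule
        elif key in seen:
            issues["duplicate"].append(row["id"])    # same name and e-mail seen before
        seen.add(key)
    return issues


report = find_dirty(records)
print(report)  # {'duplicate': [2], 'incomplete': [3], 'erroneous': [4]}
```

Real data-quality tools apply far richer rules; the point here is only that each kind of dirty data needs its own test.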
Managing Big Data

Big Data makes it possible to do many things that were previously impossible (for example, preventing disease).
1) The first step is to integrate information silos into a database environment and develop data warehouses for decision making.
2) Organizations then turn to the business of information management: making sense of their proliferating data.
Many organizations employ NoSQL ("not only SQL", where SQL stands for Structured Query Language) databases to process Big Data. NoSQL databases can manipulate structured as well as unstructured data, and inconsistent or missing data.

Putting Data to Use

Making Big Data available: making Big Data available to relevant stakeholders can help organizations gain value (open data in the public sector). It can be used to create new businesses and solve complex problems.
Enabling organizations to conduct experiments: for example, offering different "looks" of a website page.
Microsegmentation of customers: dividing customers into groups that share one or more characteristics.
Creating new business models: for example, using sensors to collect data on vehicle usage and improve driving.
Organizations can analyze more data: they do not have to rely as much on sampling.

Big Data in the functional areas of the organization:
1) Human resources: Big Data recognizes that people bring different skills to the table and that there is no one-size-fits-all person for any job.
2) Product development: Big Data captures customer preferences and puts that information to work in designing new products.
3) Operations: for example, sensors that capture a truck's speed and location.
4) Marketing: using data to better understand customers and to target marketing efforts more directly.
5) Government operations: for example, recording water levels in rivers to prevent flooding.

Data Warehouses and Data Marts

Data warehouse: a repository of historical data that are organized by subject to support decision makers in the organization.
Data mart: a low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or an individual department.

Characteristics:
1. Organized by business dimension or subject (customer, vendor). A business dimension is a data subject, such as product, geographic area, or time period, that represents the edges of the data cube.
2. Use online analytical processing (OLAP). Operational databases use online transaction processing (OLTP), in which business transactions are processed online as soon as they occur; the goals are speed and efficiency. OLAP supports decision makers and involves the analysis of accumulated data by end users.
3. Integrated: data are collected from multiple systems and then integrated around subjects.
4. Time variant: warehouses and marts maintain historical data (time is a variable). They store years of data.
5. Nonvolatile: users cannot change or update the data.
6. Multidimensional structure: the common representation is the data cube.

A generic data warehouse environment:
1) Source systems that provide data to the warehouse or mart. There is typically some "organizational pain" (a business need) that motivates a firm to develop its BI capabilities.
2) Data-integration technology and processes that prepare the data for use: extract the data, transform them, and then load them into a data mart or warehouse (ETL).
3) Different architectures for storing data: central enterprise data are stored in the warehouse, are accessed by all users, and represent the single version of the truth.
4) Different tools and applications for the variety of users.
5) Metadata, data-quality, and governance processes that ensure that the warehouse or mart meets its purposes. Metadata are data about the data.

Limitations of data warehouses:
They can be very expensive to build and maintain.
Incorporating data from obsolete mainframe systems can be difficult and expensive.
People in one department may be reluctant to share data with other departments.

Knowledge Management

Knowledge management: a process that helps organizations manipulate important knowledge that comprises part of the organization's memory.
Intellectual capital (knowledge): information that is contextual, relevant, and useful. It can be used to solve a problem.
Explicit knowledge: deals with more objective, rational, and technical knowledge. It consists of policies, procedural guides, and reports. It is knowledge that has been codified in a form that can be distributed to others or transformed into a process or a strategy.
Tacit knowledge: the cumulative store of subjective or experiential learning. It consists of an organization's experiences, insights, expertise, and culture. It is imprecise and costly to transfer, highly personal, and difficult to formalize or codify.
Knowledge management systems (KMSs): the use of modern information technologies to systematize, enhance, and expedite intrafirm and interfirm knowledge management. They help the organization make the most productive use of its knowledge. The main benefit is that best practices (the most effective and efficient ways of doing things) become available to a wide range of employees.

The KMS cycle:
1) Create: new knowledge is created.
2) Capture: knowledge must be identified as valuable and represented in a reasonable way.
3) Refine: knowledge is placed in context so that it is actionable.
4) Store: knowledge is stored in a reasonable format.
5) Manage: knowledge must be kept current.
6) Disseminate: knowledge is made available in a useful format.

Fundamentals of Relational Database Operations

Query languages

SQL (Structured Query Language): the most popular query language used for interacting with a database. It allows users to perform complicated searches by using relatively simple statements or key words.
SELECT: chooses the desired attributes.
FROM: specifies the table to be used.
WHERE: specifies the conditions to apply in the query.
QBE (query by example): the user fills out a grid or template (form) to construct a sample or a description of the data desired.

Entity-Relationship (ER) Modeling

An ER diagram consists of entities, attributes, and relationships; designers use business rules to identify them properly.
ER modeling lets database designers communicate with users throughout the organization to ensure that all entities and the relationships among them are represented.
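The SELECT, FROM, and WHERE key words described under Query languages can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the student table, its columns, and its rows are hypothetical, and ORDER BY is added only to make the result order predictable.

```python
import sqlite3

# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table: student_id is the primary key, major is a secondary key.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ana", "CIS"), (2, "Ben", "Finance"), (3, "Cara", "CIS")],
)

# SELECT chooses the attribute, FROM names the table, WHERE sets the condition.
rows = conn.execute(
    "SELECT name FROM student WHERE major = ? ORDER BY name", ("CIS",)
).fetchall()
names = [row[0] for row in rows]
print(names)  # ['Ana', 'Cara']

conn.close()
```

Note that the query on major returns two students, which is why major can serve only as a secondary key and not as a primary key.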
Business rules: precise descriptions of policies, procedures, or principles in any organization that stores and uses data to generate information.
Data dictionary: provides information on each attribute, such as its name; whether it is a key, part of a key, or a non-key attribute; the type of data expected; and its valid values.
Relationships illustrate an association between entities.
Degree of a relationship: the number of entities associated with a relationship.
Unary relationship: an association is maintained within a single entity.
Binary relationship: two entities are associated.
Ternary relationship: three entities are associated.
Connectivity: the relationship classification.
Cardinality: the maximum number of times an instance of an entity can be associated with an instance in the related entity.
Connectivity and cardinality are established by the business rules of a relationship.
Cardinality symbols: mandatory single, optional single, mandatory many, optional many.
Entities have attributes, or properties, that describe the entity's characteristics.

Three types of binary relationships:
One-to-one (1:1): a single-entity instance of one type is related to a single-entity instance of another type (for example, student and parking permit).
One-to-many (1:M): represented by the class-professor relationship.
Many-to-many (M:M): represented by the student-class relationship. A relational database cannot represent an M:M relationship directly, so a junction (bridge) table is used to turn it into two one-to-many relationships.

Normalization and Joins

Normalization: a method for analyzing and reducing a relational database to its most streamlined form to ensure minimum redundancy, maximum data integrity, and optimal processing performance.
Functional dependency: a means of expressing that the value of one particular attribute is associated with a specific single value of another attribute.
Join operation: combines records from two or more tables in a database to obtain information that is located in different tables.
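The junction table and the join operation can be seen together in a short sqlite3 sketch. The table names (student, class, enrollment), the column names, and the sample rows are assumptions chosen to match the student-class example above; they do not come from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# enrollment is the junction (bridge) table: it turns the many-to-many
# student-class relationship into two one-to-many relationships.
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE class (class_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    class_id   INTEGER NOT NULL REFERENCES class(class_id),
    PRIMARY KEY (student_id, class_id)
);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO class VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# The join combines records from all three tables through the foreign keys.
pairs = conn.execute("""
    SELECT student.name, class.title
    FROM enrollment
    JOIN student ON student.student_id = enrollment.student_id
    JOIN class   ON class.class_id = enrollment.class_id
    ORDER BY student.name, class.title
""").fetchall()
print(pairs)  # [('Ana', 'Databases'), ('Ana', 'Networks'), ('Ben', 'Databases')]

conn.close()
```

Each student name and class title is stored once, in its own table; the enrollment rows hold only keys. That is the kind of reduced redundancy normalization aims for, and the join is what reassembles the information when it is needed.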