Week 12 Scavenger Hunt
This 8 page Class Notes was uploaded by Ashby Strauch on Wednesday November 11, 2015. The Class Notes belongs to ISM 3004 at University of Florida taught by Dr. Olson in Fall 2015. Since its upload, it has received 232 views. For similar materials see Computing in Business Environment in Business at University of Florida.
Date Created: 11/11/15
Scavenger Hunt – Week 11 – The Data Asset

Unit 1: Introduction

11.01-Introduction
Why is data an asset?
  o Asset: useful or valuable thing, person or quality
  o Data is useful and valuable
  o Gives managers power that they normally wouldn’t have
  o Can be misused by someone outside the organization
What is the rate of data growth?
  o “planet is awash with data”
  o Moore’s law
  o Technology just keeps getting cheaper
  o Gartner Symposium
      Doubling every 6 months on corporate hard drives
      Data growth is unprecedented and will continue at incredible rates
      Growing demand for data because managers know the power of data
What is “information overload”?
  o Data is disruptive, not the usual way of 10+ years ago, can bring treasure or tumult
      Information overload
      $900 billion cost to economy
By contrast, what is “information abundance” and what are its implications for knowledge workers?
      The world has changed, jobs have changed, people have not caught up to the change
      “Geek UP!”
Dark data
  o In many organizations there is plenty of data; it is just dormant and spread out, so it can’t be turned into anything of value
  o Many firms have been shocked by the amount of work it takes to pull together an infrastructure
      Complex task to make incompatible sources work together and make the data trustworthy
Data-driven, fact-based decision making can change everything!
      Gallaugher: “Firms that are basing decisions on hunches aren’t managing; they’re gambling. And today’s markets have no tolerance for uninformed managerial dice rolling.”
Business Intelligence
  o Competitive advantage
  o Hunches are bad; today’s markets don’t have mercy for that
  o Make decisions using data!
Analytics

11.02-Data, Information and Knowledge
Data
  o Raw facts and figures
  o Tells you NOTHING ALONE
  o Need to take data and turn it into information
  o Data integrity is key: it has to be clean
  o Have to be able to understand your data
      Data schema: the way the data is organized, its structure, the meaning of its various elements
      If you understand the schema you will be able to understand the data
Information
  o More valuable than data
  o Data presented in a context so it can answer a question or support decision making
Knowledge
  o Insight derived from experience and expertise
  o Information plus a manager’s insight/knowledge = best results
Characteristics of Structured Data and Unstructured Data
  o Structured data
      Organized
      Predefined characteristics: rules for dates, numbers, format, etc.
      Schema
  o Unstructured data
      Not organized, no schema
      Text: email, Facebook pages, news stories, etc.
      Binary: images, audio, video

Unit 2: Managing Data

11.03-Database Management Systems
Table
  o Organized collection of data
  o Made up of records and fields
      Record is a row
          Individual observation
      Fields are columns
          An attribute about which there is data for each observation
          Ex: addresses
          Predetermined by the database table’s schema
  o There is waste
      Repeating data in the database
      Ex: have to write the customer address and phone number again in the database
  o Potential for error
      If you can’t associate people together because you entered data wrong, you can’t get information out of the data
      Data needs to be clean and consistent
Relational Database: the real power comes when you can correlate data from multiple tables, linking things together through what they have in common
  o Multiple tables related together through a common element
  o Benefits of using relational databases
  o Key field
      One of the columns in a table
      Data items in that column are unique, never repeat
      Use key fields to create relationships between tables
      Field must be found in two different tables
      Enhances integrity
  o Valid relationship types
      One:One
      One:Many
      Many:Many – Not allowed, no way to clearly
      identify who on the left belongs to whom on the right; duplicates on the two sides can’t be matched (the usual workaround is a linking table that splits it into two One:Many relationships)
  o Views
SQL: Structured Query Language
  o SELECT fields FROM table
  o SELECT * FROM Customers (the * means all fields)
  o SELECT * FROM Customers WHERE State = 'FL'
Leading DBMS: Desktop and Server-based
  o Desktop: Microsoft Access
  o Servers: Oracle, MySQL, Microsoft SQL Server

11.04-The Origin of Data
Data can be internal (from within the company) or external (bought from outside the company)
TPS: Transaction Processing System: captures internally generated data
  o What is a “transaction” and what are its two key characteristics?
      Any business exchange
      Standardized (schema)
      Occurs repeatedly
      Ex: ATM
  o Point of Sale (POS) system
      Retail sales
      Scanning the items you buy produces DATA: can figure out what is selling, see trends, identify cycles
      Cash: anonymous
  o No data about behavior of individual customers
  o How do loyalty cards generate valuable data?
      Membership program
      Company pays you through bonuses for the data you give them
      Now know what was sold to whom
      Opportunities for targeted marketing
  o Websites: transactions
      Use transaction data to find out which parts of the website are hot, how visitors got to that part, etc.
      What products did you look at, add to cart, etc.
      Use that to bring value to you
  o Search engines
      Searches generate transactions
      Knowing what you searched for, along with geographic location, can be valuable and gives a picture of what is going on
Enterprise software: CRM, ERP, SCM
  o CRM
      Every sales call, inquiry from customers, all data
  o ERP
      Paychecks, invoices, payments
      Business transactions brought into a collection of data to seek insights
  o SCM
      Each order for raw materials, finished goods, etc.
  o As we increase enterprise software, the data becomes more valuable
Business operations – examples
  o Health care: patient data
  o Michigan: tags cows at birth for a stream of data
  o Transportation: sensors on airplanes
      Trains have sensors too
Sources of customer-provided data
  o Customer surveys
  o Product registration cards
      Data from customers
  o Contests
      Freebie in exchange for data from the customer
External sources
  o General information
      Weather
      News stories: relevant data
      Public records
  o Data aggregator
      Company that collects data from a wide variety of sources
      The data is the product that they sell
      Acxiom
          Sells data it has collected about Americans
Privacy Regulation (text only)
  o What is the impact of Moore’s Law and the Internet on privacy?
      Some feel that Moore’s Law, the falling cost of storage, and the increasing reach of the Internet have us on the cusp of a privacy train wreck. And that may inevitably lead to more legislation that restricts data-use possibilities. Noting this, strategists and technologists need to be fully aware of the legal environment their systems face and consider how such environments may change in the future. Many industries have strict guidelines on what kind of information can be collected and shared.
  o According to a Carnegie Mellon study, for 87% of Americans, what can be determined if you know their gender, birth date and zip code?
      And while targeting is getting easier, a Carnegie Mellon study showed that it doesn’t take much to find someone with a minimum of data. Simply by knowing gender, birth date, and postal zip code, 87 percent of people in the United States could be pinpointed by name.
  o What is HIPAA? What kind of data does it protect?
      For example, HIPAA (the U.S. Health Insurance Portability and Accountability Act) includes provisions governing data use and privacy among health care providers, insurers, and employers. The financial industry has strict requirements for recording and sharing communications between firm and client (among many other restrictions).
      There are laws limiting the kinds of information that can be gathered on younger Web surfers. And there are several laws operating at the state level as well.

11.05-Riding the Data Tsunami
According to Gartner Research, top CIOs say that data storage growth is the #1 challenge today.
  o What two problems arise from that challenge?
      Handling explosive growth with constrained budgets
      Exploiting all that data: presenting it to managers so they can understand and use it
What is an SSD? How does it address these problems?
  o Solid State Drive
  o Don’t have to wait to read data
  o Faster than magnetic hard drives
      Latency
      Throughput
  o Lower power consumption
      Less heat
  o RAID
      Link many together, share workload
  o Prices dropping
What is Automated Data Tiering? How does it address these problems?
  o Match storage performance to access frequency
  o Current working data: top tier storage and cost, SSDs
  o Recently used data: mid tier storage, hard drives
  o Historical data: bottom tier storage, tape
What is DeDupe? How does it address these problems?
  o Same data repeatedly stored, unstructured data
  o Tame growth in unstructured data
  o Identify where there are duplicates
  o Single storage for any data
  o Multiple instances are “pointers” to a single copy

Unit 3: Problems with Data

11.06-Trouble in Paradise
What are “data silos”? How do they come into being? Why is this a problem?
  o No sharing possible
  o Data collections are completely separated with no possibility of communication between the silos
  o Obsolete systems that are now incompatible
      Missed opportunities for insights to answer questions and make decisions
How do inconsistent data formats impact a business?
  o The same data is stored differently in different databases
  o Ex: a US bank system had difficulty telling whether a person was male or female because 36 sources of data were coming together for analysis
      17 different ways of recording it, ex: M or F, m or f, male or female, etc.
What is operational data?
  o A lot of great data in operational systems
  o Great value in operational data
  o TPS must…
      Respond rapidly
          If it runs slow, customers will leave
      Respond consistently
How does the analysis of operational data compete with customers? What can a company do about this problem?
  o Analysis competes with customers
      If it runs during business hours it can take away from business by making programs run slowly
What is a Data Warehouse? What are its characteristics?
  o Separate data repositories
      Operational
      Reporting and analytics
          Combine data from many sources, cleaning it
          Historical data
          Periodic import from operational systems
  o Collection of databases that supports decision making
      Many sources
      Operational systems: periodic transfer
      Historical data
      Fast queries, exploration
How is a Data Mart different from a Data Warehouse?
  o Specific problem
  o Specific unit
  o Trying to address specific needs
Suggestions
  o Clear set of objectives and goals
  o Get leadership buy-in
  o Establish data governance
      Manage data from creation to retirement
      Multi-faceted
      Need to formalize and document rules

Unit 4: Big Data

11.07-Big Data
What three characteristics are necessary for something to be “Big Data”? (three V’s)
  o Explain what each means
  o “Datafication of Everything”
      Facebook: 500 million data items/minute
          Once those thoughts would have just disappeared
      Things
          Sensors on things
  o A lifetime of data for everything is now being captured
      Activities
  o RFID
  o Sunpass
  o Surveillance cameras
  o Consumer devices like Nike run
      Most data growth is unstructured
  o Big Data: Volume
      Too big to handle with traditional tools
  o Velocity
      Rapid arrival, can’t react fast enough
      Feedback loop: how rapidly we can get it into the system, process it and make decisions based on it
  o Variety
      Too little consistency
      A lot of different formats for big data
      Text, images, sound, video, sensors, etc.
What is Hadoop?
  o Technically…
      Open-source software (OSS) designed to consume any data you want, structured or unstructured
      Can set up thousands of machines to analyze big data
      Highly scalable
      Distributed computing platform
  o Practically…
      Scalable
      Cost-effective
      Flexible
      Fault-tolerant
  o Components
      MapReduce
          Map: process input data in parallel
          Reduce: combine data from Map to create final results
          Go through each chunk of data: each server gets a different set, finds the unique info and passes it to Reduce, which puts it together
      HDFS
          Data is BROKEN into blocks for scalable size
          Blocks distributed across servers
          Replicated copies scattered across the server set
          Replicate data: failure is assumed
      Pig
          Programming environment
          Pig Latin programming language like Java
          Runs on Pig machines
Be familiar with the three examples of big data provided in the lecture. How do you see the Three V’s in each?
  o Predictive policing
      LA
      Took historical crime data and applied earthquake prediction models to it
      Predicted twice as many crimes as crime analysts
      Burglaries down 33%
      Violent crimes down 21%
  o Big Data is COOL
      Tesco grocery chain
      Optimize fridge costs
      In-store fridges provide data
      70 million data points / store / year
      Energy costs down 20 million euros a year
      Proactive maintenance
  o Actions speak louder than words
      Improve therapy diagnoses
      Supporting US military in suicide prevention efforts
      30 measures / second
          Video analysis of facial expressions
          Track gestures via Microsoft Kinect
          Record vocal inflection, tone, silence
      Pattern recognition: signs and types of psychological distress
      Virtual therapist able to probe patients interactively and consistently

Unit 5: Business Intelligence

11.08-Business Intelligence: stuff you and every manager should know
Canned Reports – characteristics? How are they used? Pros/Cons?
  o Predefined format
  o Answer specific questions
  o Easy for users (pro)
  o Inflexible (con)
  o IT overhead (con)
Ad-Hoc Reporting Tools – characteristics? How are they used? Pros/Cons?
  o Users define their own reports
  o Powerful/flexible
  o Demanding of the user
      Potentially steep learning curve
      Business knowledge
      Understanding of data schema
Dashboards – characteristics? How are they used?
  o Graphic view
  o Tells you what is happening within the system
  o Some customization
OLAP – characteristics? How is it used? Pros/Cons?
  o Online analytical processing
  o Huge data
  o Pre-processed and summarized
  o User reports are fast
  o No access to details
Data Mining – what is it? How is it used? What warnings should a manager consider when doing data mining?
  o Enormous historical datasets
  o Identify patterns
  o Build models
  o Predict the future
  o WARNING: need clean data; bogus data can give you false results
      Bogus representatives
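
The clean-data warning can be made concrete with a small sketch. The Python below is illustrative only: the name normalize_gender, the GENDER_CODES mapping, and the sample records are assumptions made up for this example, not something from the lecture. It shows how the inconsistent codes described in 11.06 (M, m, male, etc.) would split one group into many when counting patterns, unless the values are standardized first.

```python
from collections import Counter

# Hypothetical mapping from inconsistent source codes to one standard code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def normalize_gender(raw):
    """Return 'M', 'F', or 'UNKNOWN' for a raw gender value."""
    key = str(raw).strip().lower()  # ignore case and stray spaces
    return GENDER_CODES.get(key, "UNKNOWN")


# Made-up records, as they might arrive from several source systems.
records = ["M", "m", "male", "F", "f", "Female", " F ", "x"]

raw_counts = Counter(records)  # every spelling counts as its own group
clean_counts = Counter(normalize_gender(r) for r in records)

print("distinct raw values:", len(raw_counts))
print("counts after cleaning:", dict(clean_counts))
```

Values the mapping does not recognize are kept as UNKNOWN rather than guessed, so a manager can see how much of the data is still dirty before trusting any pattern found in it.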