Lecture Administration and visualization: Chapter 1 - Introduction to data management and visualization
How big is big data?

Data science: The 4th paradigm for scientific discovery

Big data in 2008

Big data sources
• E-commerce
• Social networks
• Internet of things
• Data-intensive experiments (bioinformatics, quantum physics, etc.)

Data is the new oil

Big data 5V
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them (Wikipedia)

Big data - big value
source: wipro.com

Introduction to data management

What is Data Management
• Data management is the development and execution of architectures, policies, practices and procedures in order to manage the information lifecycle needs of an enterprise in an effective manner

Poor Data Management
• 94% of companies suffering from a catastrophic data loss do not survive: 43% never reopen and 51% close within two years. (University of Texas)
• 7 out of 10 small firms that experience a major data loss go out of business within a year. (DTI/Price Waterhouse Coopers)
• 50% of all tape backups fail to restore. (Gartner)
• 25% of all PC users suffer from data loss each year. (Gartner)

Why Data Management: Foundation to Advance Science
• Data is a valuable asset: it is expensive and time consuming to collect
• Data should be managed to:
  o maximize the effective use and value of data and information assets
  o continually improve the quality, including data accuracy, integrity, integration, timeliness of data capture and presentation, relevance and usefulness
  o ensure appropriate use of data and information
  o facilitate data sharing
  o ensure sustainability and accessibility in the long term for re-use in science

A new image processing technique reveals something not before seen in this Hubble Space Telescope image taken 11 years ago: a faint planet (arrows), the outermost of three discovered with ground-based telescopes last year around the young star HR 8799.
D. Lafrenière et al., Astrophysical Journal Letters

“Planet hidden in Hubble archives”, Science News (Feb. 27, 2009)
“The first thing it tells you is how valuable maintaining long-term archives can be. Here is a major discovery that’s been lurking in the data for about 10 years!” comments Matt Mountain, director of the Space Telescope Science Institute in Baltimore, which operates Hubble.
“The second thing it tells you is having a well calibrated archive is necessary but not sufficient to make breakthroughs; it also takes a very innovative group of people to develop very smart extraction routines that can get rid of all the artifacts to reveal the planet hidden under all that telescope and detector structure.”

Data Management Facilitates Sharing and Re-use…

Where a majority of data end up now…

Imagine if data were more accessible…

Data Life Cycle
Plan → Collect → Assure → Describe → Preserve → Discover → Integrate → Analyze

Planning
• Consider data management before you collect data
  o What kind of data will be collected?
  o Which methods will be used (sensors, samples, etc.)?
  o What data formats/standards are appropriate?
  o How will the data be used?
  o How will you share the data?
  o Will your methods satisfy:
    - Funding requirements
    - Policies for access, sharing, reuse
    - Budget (most of the time this is overlooked!)
• Output
  o Formal document
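
The planning questions above can be kept as a simple checklist that is filled in before data collection starts and then rendered as a document. The following is a minimal sketch in Python, not something taken from the lecture: the class name DataManagementPlan, its field names, the helper functions and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical record holding one answer per planning question."""
    data_types: str = ""             # What kind of data will be collected?
    collection_methods: str = ""     # Which methods will be used?
    formats_and_standards: str = ""  # What data formats/standards are appropriate?
    intended_use: str = ""           # How will the data be used?
    sharing_approach: str = ""       # How will you share the data?
    funder_requirements: str = ""    # Funding requirements
    access_policies: str = ""        # Policies for access, sharing, reuse
    budget: str = ""                 # Budget (often overlooked)


def unanswered(plan):
    """Return the names of the planning questions that are still blank."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


def render(plan):
    """Render the plan as plain text, one line per question."""
    lines = []
    for f in fields(plan):
        value = getattr(plan, f.name).strip() or "TODO"
        lines.append(f"{f.name.replace('_', ' ')}: {value}")
    return "\n".join(lines)


# Made-up example values, for illustration only.
plan = DataManagementPlan(
    data_types="hourly temperature readings",
    collection_methods="field sensors",
    formats_and_standards="CSV with ISO 8601 timestamps",
)

print(unanswered(plan))
print(render(plan))
```

Listing the blank fields makes gaps visible early, in particular the budget entry, which the slide notes is usually overlooked.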