Title | : | 5 Tips for Building a Data Science Platform |
Speaker | : | David Chaiken (CTO, Altiscale, Palo Alto, USA) |
Details | : | Thu, 11 Feb, 2016 9:00 AM @ BSB 361 |
Abstract | : | Data scientists need to be advocates for a self-serve, Hadoop-based environment that is productive, reliable and a joy to use. This talk presents five tips to make your Big Data environment successful, and shows how best-of-breed tools like Spark fit together with the components of the Hadoop ecosystem. These tips are valid whether the environment is built on premises, on top of infrastructure as a service, or deployed as a service. That said, the talk concludes by pointing out that buying the underlying platform as a service is the fastest path to deriving business value from big data. |
Bio-sketch | : | David Chaiken has been building large-scale distributed systems for more than 25 years. He currently works at Altiscale, a start-up in Palo Alto that runs Hadoop as a service for other companies. Before Altiscale, David served as the Chief Architect of Yahoo, where he led teams that developed consumer advertising and media systems with Hadoop at their core. Over his career, David has also built voice search products for consumers, mobile enterprise applications, network management systems, project management software, a large-scale multiprocessor architecture, a tablet computer, and four or so other information appliances. He has managed both hardware and software development teams, but prefers individual contributor roles. David has been hacking since his parents sat him down in front of an IBM card punch. His favorite technologies include the RSA encryption algorithm, the C programming language, the ARM instruction set architecture, the CentOS distribution of Linux, and the build-on-grid-push-to-serving design pattern. In 1994, David earned a Ph.D. in electrical engineering and computer science from MIT. |