Data Analytics with Spark Using Python

Author : Jeffrey Aven
Publisher : Addison-Wesley Professional
Total Pages : 772
Release : 2018-06-18
ISBN-10 : 0134844874
ISBN-13 : 9780134844879

Book Synopsis: Data Analytics with Spark Using Python by Jeffrey Aven

Data Analytics with Spark Using Python was written by Jeffrey Aven and published by Addison-Wesley Professional. It was released on 2018-06-18 and runs 772 pages.

Book excerpt: Solve Data Analytics Problems with Spark, PySpark, and Related Open Source Tools

Spark is at the heart of today’s Big Data revolution, helping data professionals supercharge efficiency and performance in a wide range of data processing and analytics tasks. In this guide, Big Data expert Jeffrey Aven covers all you need to know to leverage Spark, together with its extensions, subprojects, and wider ecosystem. Aven combines a language-agnostic introduction to foundational Spark concepts with extensive programming examples utilizing the popular and intuitive PySpark development environment. This guide’s focus on Python makes it widely accessible to large audiences of data professionals, analysts, and developers, even those with little Hadoop or Spark experience. Aven’s broad coverage ranges from basic to advanced Spark programming, and Spark SQL to machine learning. You’ll learn how to efficiently manage all forms of data with Spark: streaming, structured, semi-structured, and unstructured. Throughout, concise topic overviews quickly get you up to speed, and extensive hands-on exercises prepare you to solve real problems.

Coverage includes:
• Understand Spark’s evolving role in the Big Data and Hadoop ecosystems
• Create Spark clusters using various deployment modes
• Control and optimize the operation of Spark clusters and applications
• Master Spark Core RDD API programming techniques
• Extend, accelerate, and optimize Spark routines with advanced API platform constructs, including shared variables, RDD storage, and partitioning
• Efficiently integrate Spark with both SQL and nonrelational data stores
• Perform stream processing and messaging with Spark Streaming and Apache Kafka
• Implement predictive modeling with SparkR and Spark MLlib


Related Books

Data Analysis with Python and PySpark
Language: en
Pages: 454
Authors: Jonathan Rioux
Categories: Computers
Type: BOOK - Published: 2022-03-22 - Publisher: Simon and Schuster

When it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provi
Frank Kane's Taming Big Data with Apache Spark and Python
Language: en
Pages: 289
Authors: Frank Kane
Categories: Computers
Type: BOOK - Published: 2017-06-30 - Publisher: Packt Publishing Ltd

Frank Kane's hands-on Spark training course, based on his bestselling Taming Big Data with Apache Spark and Python video, now available in a book. Understand an
Advanced Analytics with Spark
Language: en
Pages: 276
Authors: Sandy Ryza
Categories: Computers
Type: BOOK - Published: 2015-04-02 - Publisher: O'Reilly Media, Inc.

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors
Hands-On Big Data Analytics with PySpark
Language: en
Pages: 172
Authors: Rudy Lai
Categories: Computers
Type: BOOK - Published: 2019-03-29 - Publisher: Packt Publishing Ltd

Use PySpark to easily crush messy data at-scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobs Key FeaturesW