
Showing posts with the label "IBM Professional Certificate in Data Engineering"

IBM: Big Data, Hadoop, and Spark Basics

https://www.edx.org/learn/big-data/ibm-big-data-hadoop-and-spark-basics

Learning objectives

Explain the impact of Big Data, including use cases, tools, and processing methods.
Describe Apache Hadoop architecture, ecosystem, practices, and user-related applications, including Hive, HDFS, HBase, Spark, and MapReduce.
Apply Spark programming basics, including parallel programming basics for DataFrames, data sets, and Spark SQL.
Use Spark’s RDDs and data sets, optimize Spark SQL using Catalyst and Tungsten, and use Spark’s development and runtime environment options.

Syllabus

Module 1: What is Big Data?
Module Introduction and Learning Objectives
What is Big Data?
Impact of Big Data
Parallel Processing, Scaling, and ...