Sunday, March 25, 2018

MapR Spark Certification tips


I recently cleared the MapR Spark certification and, since I was asked to, would like to share some tips (here you go, my friends).

I divided this blog into 3 sections. 


  • prerequisite for exam
  • exam topics and must cover material
  • tips (don't ignore the topics at the end of this blog please)



Prerequisite 

First and foremost: work with Spark and Scala for at least a year before attempting the exam. The points below summarize what you need.

  • You should have basic knowledge of distributed functional programming
  • Hands-on experience with Spark
  • Good exposure to Scala programming (you are not expected to be an expert, but you should be able to read the code and answer sensibly).

Exam topics and must cover Material

The exam has lots of programming questions: a code snippet is provided and you are asked to work through it and pick the answer. If I remember correctly, only about 10% of the questions were theoretical (true/false, or which algorithm to use).


I referred to a lot of material (online books, videos, and edX courses over the last 2 years) for my preparation, but if I had to zero in on what is mandatory for the MapR certification, here is the list. Do not skip any of it, and I suggest going over it 4-5 times before taking the exam.
  • Instructor and Virtual Instructor-led Training (training ppt and lab guide)
    • DEV 360 – Developing Spark Applications
    • DEV 361 - Build and Monitor Apache Spark Applications
    • DEV 362 - Spark Streaming, Spark MLlib - Machine Learning, GraphX
  • Book - Learning Spark
  • Spark official documentation
    • Pay more attention to RDDs, closures, accumulators, and broadcast variables.
      • http://spark.apache.org/docs/latest/quick-start.html
      • http://spark.apache.org/docs/latest/rdd-programming-guide.html
    • MLlib - http://spark.apache.org/docs/latest/ml-guide.html
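
To make the closure, accumulator, and broadcast points concrete, here is a minimal sketch of the kind of snippet worth being able to read quickly. It is my own illustration, not an exam question: the object name, method names, and sample data are made up, and it assumes a local Spark 2.x or later setup.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object SharedVariablesSketch {
  // Closure pitfall: `counter` is captured by the closure and serialized to each task,
  // so tasks increment their own copies and the value returned here is not reliable.
  def countWithLocalVar(sc: SparkContext, words: Seq[String]): Int = {
    var counter = 0
    sc.parallelize(words, 2).foreach(_ => counter += 1)
    counter
  }

  // Accumulator: tasks only add to it; the driver reads the merged value.
  def countWithAccumulator(sc: SparkContext, words: Seq[String]): Long = {
    val seen = sc.longAccumulator("seen")
    sc.parallelize(words, 2).foreach(_ => seen.add(1))
    seen.value
  }

  // Broadcast variable: a read-only lookup table shipped once per executor.
  def weightedTotal(sc: SparkContext, words: Seq[String], weights: Map[String, Int]): Int = {
    val lookup = sc.broadcast(weights)
    sc.parallelize(words, 2).map(w => lookup.value.getOrElse(w, 0)).reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the app name is arbitrary.
    val spark = SparkSession.builder().appName("shared-variables-sketch").master("local[2]").getOrCreate()
    val sc = spark.sparkContext
    val words = Seq("spark", "scala", "rdd", "spark")

    println(s"local var (do not rely on this): ${countWithLocalVar(sc, words)}")
    println(s"accumulator: ${countWithAccumulator(sc, words)}")
    println(s"weighted total: ${weightedTotal(sc, words, Map("spark" -> 2, "scala" -> 3))}")
    spark.stop()
  }
}
```

The thing to take away is the contrast between the first two methods: a plain variable modified inside a closure is a per-task copy, while an accumulator is the supported way to send counts back to the driver.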



Topics covered in the exam 

Topic Name                                                     Your Score
Load and Inspect Data in Apache Spark                          xx%
Advanced Spark Programming and Spark Machine Learning MLLib    xxx%
Monitoring Spark Applications                                  xx%
Work with Pair RDD                                             xx.x%
Spark Streaming                                                xx%
Work with DataFrames                                           xx%
Build an Apache Spark Application                              xxx%

Tips

Normally, when anyone starts preparing for the exam, a good place to begin is the study guide at the link below.

https://mapr.com/blog/how-get-started-using-apache-spark-graphx-scala/assets/spark-certification-study-guide.pdf

The questions in this guide are far too basic compared to the real exam; the exam was much, much harder.
  • Lots of questions on the core concepts of RDDs and pair RDDs
  • DataFrames are the next most important
  • About 25% of the questions were on Spark Streaming and Spark MLlib, so prepare well on both

Do not ignore any of the topics below at any cost.

Silent topics that you don't want to be surprised by in the exam


  • Accumulator and Broadcast variables
  • Scala Closures
  • Narrow and Wide Dependencies
  • Partitioning  
  • Formatting questions: saveAsTextFile() output needs to be saved without brackets/parentheses
  • Prepare well for mkString(",") and formatting
  • flatMap functions
  • mapPartitions
  • There was a question on the byKey transformations and another on Hadoop streaming, which I was not sure about.
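
For the formatting, flatMap, and mapPartitions items, the following is a small sketch of what I mean, written for this post rather than taken from the exam. The object name, helper names, sample data, and the output path in the comment are all hypothetical, and it assumes a local Spark 2.x or later setup.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FormatSketch {
  // flatMap turns each line into zero or more words; reduceByKey sums the counts per word.
  def wordCounts(lines: RDD[String]): RDD[(String, Int)] =
    lines.flatMap(_.split("\\s+")).filter(_.nonEmpty).map(w => (w, 1)).reduceByKey(_ + _)

  // Saving a tuple RDD directly writes each tuple's toString, e.g. "(spark,2)".
  // Mapping each pair to a delimited string first removes the parentheses.
  def asCsv(counts: RDD[(String, Int)]): RDD[String] =
    counts.map(_.productIterator.mkString(","))

  // mapPartitions calls the function once per partition with an iterator over its
  // records, which helps when per-record setup would be expensive.
  def partitionSizes(lines: RDD[String]): Array[Int] =
    lines.mapPartitions(it => Iterator(it.size)).collect()

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the app name is arbitrary.
    val spark = SparkSession.builder().appName("format-sketch").master("local[2]").getOrCreate()
    val sc = spark.sparkContext
    val lines = sc.parallelize(Seq("spark scala", "spark rdd"), 2)

    asCsv(wordCounts(lines)).collect().sorted.foreach(println)
    // asCsv(wordCounts(lines)).saveAsTextFile("/tmp/word-counts") // hypothetical path
    println(partitionSizes(lines).mkString(","))
    spark.stop()
  }
}
```

If you call saveAsTextFile() on the pair RDD itself, every line in the output carries the tuple parentheses; converting to a string with mkString(",") before saving is one way to avoid that.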
I hope this blog helps in your preparation. Please let me know or email me if you have any other questions. Happy studying!

At the end  - Here is my certification
