Databricks is a platform that enables data aggregation, data cleansing, machine learning, data sharing and much more. It is available on multiple cloud platforms, including AWS and Microsoft Azure. Data can be ingested from a wide range of sources and used to train machine learning models.


Databricks allows for coding in 3 different languages:

  • SQL
  • Python
  • Scala

Databricks uses notebooks to execute code, and a notebook can be written in any of the above languages, giving developers and data scientists the freedom to code in whichever language they find most comfortable.


Deployment, in my opinion, is a little complex due to the number of tools available to the user.

