- Helping delivery account teams solve big data problems.
- Working closely with the delivery teams to deliver a project.
- Working with the sales / pre-sales team to enable new opportunities by building prototypes and PoCs.
- Working with the core team to build new capabilities within the CoE as well as in accounts.
- Tracking market and technology trends and analysing their impact.
- Getting oriented in big data solutioning.
- Strong skills in Java, Scala / Python, and SQL.
- Hands-on experience with one of the Hadoop distributions: Hortonworks or Cloudera.
- Strong understanding of distributed querying, performance tuning, horizontal scaling / data partitioning concepts.
- Data access: Strong experience with SQL-on-Hadoop frameworks: Hive (Hive-on-Spark / Hive-on-Tez), Spark SQL, Impala, Drill, Presto.
- Data processing: Working knowledge of batch frameworks like MR, Spark, or Flink. Should have worked on these frameworks in standalone or cluster mode (Hadoop).
- Strong data modelling and RDBMS skills.
- Strong understanding of basic distributed computing concepts like data locality.
- Working knowledge of search stores like Elasticsearch or Solr.
- Good exposure to DWH concepts.