Data Engineer at Sorabel
Jakarta, ID / Bandung, ID / Yogyakarta, ID
- Design, develop, and maintain our data infrastructure.
- Optimize and modify data flows and pipelines to handle the 3Vs of big data (Volume, Velocity, Variety).
- Develop custom ETL jobs to cater to specific requirements.
- Coordinate with other departments (Commercial, Marketing, etc.) to fulfil and adapt to their data requirements and requests.
- Make sure the end users of the data (Analysts, Data Scientists, etc.) can query it seamlessly.
- Explore and learn new technologies that could improve our current stack by complementing or replacing parts of it.
- Background in server-side software development in a Linux environment (front-end skills are a plus).
- A degree in Computer Science/Engineering/Mathematics is a good start, but not a must.
- Not scared of reading technical documentation or source code.
- Programming language:
- Relevant experience:
- Google Cloud Platform Data Infrastructure (BigQuery, Dataproc, Dataflow)
- Hadoop (HDFS, MapReduce, YARN)
- Hadoop File Formats & Compression (Parquet, ORCFile, Snappy, gzip)
- SQL on Hadoop (Hive, Spark SQL, Impala)
- NoSQL (Bigtable, HBase, Cassandra)
- RDBMS (MySQL)
- Distributed processing engines (Spark, Flink)
- Data Ingestion & Message Processing (RabbitMQ, ActiveMQ, ZeroMQ, Kafka, Flume)
- Stream processing (Spark Streaming, Storm)