Treasure Data:

At Treasure Data, we’re on a mission to radically simplify how companies use data and AI to create connected customer experiences. Our intelligent customer data platform (CDP) drives revenue growth and operational efficiency across the enterprise to deliver powerful business outcomes.

We are thrilled that Forrester has recognized Treasure Data as a Leader in The Forrester Wave™: Customer Data Platforms For B2C. It's an honor to be acknowledged for our efforts in advancing the CDP industry with cutting-edge AI and real-time capabilities.

Treasure Data employees are enthusiastic, data-driven, and customer-obsessed. We are a team of drivers: self-starters who take initiative, anticipate needs, and proactively jump in to solve problems. Our actions reflect our values of honesty, reliability, openness, and humility.

Your Role:

The Plazma team at Treasure Data is one of the essential elements of our CDP solution and is part of the Core Services group, which supports customer data ingestion and availability at a rate of 70B records per day. You are expected to help the team develop the future of our Hadoop/Hive & Trino query engines and expand from there into our in-house developed storage solution. This includes maintaining technical excellence to address challenges that currently lack industry-wide solutions and delivering the roadmap together with your team. Our team consists of Big Data experts across Japan, Korea and Canada who are passionate about OSS contribution, and we take pride in the quality of service we offer.

Responsibilities & Duties:

Design and develop Hadoop/Hive & Trino solutions, providing technical expertise for modern data architecture assessment and use case development
Establish engineering standards for design, development, tuning, deployment, and maintenance of advanced data access frameworks and distributed systems
Collaborate with your team to define product roadmaps based on operational needs and customer-requested features while mentoring and training new team members
Own version and release management, including baseline evaluation, patch backporting, and deployment of customer-facing features
Coordinate with Support and Product teams on release cycles and feature delivery
Contribute to Hadoop/Hive & Trino OSS through bug fixes, new features, and technical documentation
Partner with SRE to automate cluster operations, reducing operational overhead through automated lifecycle management and load balancing workflows
Design and implement observability solutions, including health metrics, capacity planning tools, and automated failure detection and recovery systems
Provide expert customer support, including on-call responsibilities, escalation handling, and in-depth troubleshooting of performance and defect issues
Develop custom technical solutions, including user-defined functions (UDFs) and specialized tooling for Hadoop/Hive & Trino

Required Qualifications:

5+ years building and operating distributed systems
Strong Java skills and a deep understanding of algorithms, data structures, and distributed systems fundamentals
Solid understanding of cloud architecture and services in public clouds like AWS, GCP, or Microsoft Azure
Strong capability in implementing new and improved data solutions for multi-tenant environments
Experience in developing use cases, functional specs, design specs, etc.
Experience working with distributed, scalable Big Data or NoSQL stores such as HDFS, S3, Cassandra, and Bigtable
Strong analytical and communication skills; able to influence across Product, SRE, and Support

It would be nice if you had:

Understanding of the capabilities of Hadoop/Hive or Trino
Proven experience operating production query engines at petabyte scale
Experience with microservices architecture, data integration patterns, and extending OSS
Experience with Infrastructure as Code, SRE practices, and advanced observability
UDF development and familiarity with data visualization ecosystems
Security and privacy-by-design expertise
Experience with storage patterns and optimizations for massive parallel processing

Physical Requirements:

3 days at Treasure Data Office

About Treasure Data:

Treasure Data is the Intelligent Customer Data Platform (CDP) built for enterprise scale and powered by AI. Recognized as a Leader by Forrester and IDC, Treasure Data empowers the world’s largest and most innovative companies to deliver hyper-personalized customer experiences at scale that increase revenue, reduce costs, and build trust.

Through unique capabilities such as the Diamond Record, AI Agent Foundry, and AI Decisioning with Real-Time Personalization, Treasure Data enables marketing and CX teams to personalize cross-channel engagement in real time, optimize marketing spend while increasing ROI, and drive customer lifetime value through more intelligent retention and loyalty.

Our Dedication to You:

We value and promote diversity, equity, inclusion, and belonging in all aspects of our business and at all levels. Success comes from acknowledging, welcoming, and incorporating diverse perspectives.

Diverse representation alone is not the desired outcome. We also strive to create an inclusive culture that encourages growth, ownership of your role, and achieving innovation in new and unique ways. Your voice will be heard, and we will help amplify it.

Agencies and Recruiters:

We cannot consider your candidate(s) without a contract in place. Any resumes received without an active agreement will be considered gratis referrals to us. Thank you for your understanding and cooperation!

Senior Software Engineer - Query Engines & Storage
