AI/ML - Sr Search Engineer, Siri Data at Apple (Seattle, WA)

Company: Apple
Location: Seattle, WA
Type: Full Time
Created: 2021-05-04 05:00:27


Siri's universal search engine powers search features across a variety of Apple products, including Siri Assistant, Spotlight, Safari, Messages, and News. The Siri Data organization seeks to improve Siri by using data as the voice of our customers. Within this organization, the Search Data Engineering team builds systems that process data reliably at scale to generate scalable, high-quality datasets that support confident, data-driven decision making for Siri Search.

We're looking for exceptional data engineers who are passionate about our product and values, who love working with data at scale, and who are committed to the hard work necessary to continuously improve. As a part of this group, you will work with petabytes of data daily using diverse technologies like Spark, Flink, Kafka, Hadoop, and others. You will be expected to partner effectively with upstream engineering teams and downstream consumers, including analysts and product engineers.

In this role you will build datasets to support analytics, experimentation, and machine learning.
Specifically, you will build out stream processing applications powering real-time metrics, and you will help to drive our self-serve strategy for reporting on behalf of data scientists and product engineers as we collectively make Siri better.

On any given day you might be ...

- Developing data pipelines and/or software libraries to process, transform, and analyze data to identify signals from the billions of events we collect every day
- Designing and building abstractions that hide the complexity of the underlying big data stack (HDFS, Hadoop, Hive, Impala, Spark, Kafka, Parquet, etc.) and that allow partners to focus on their strengths: product, data modeling, data analysis, search, information retrieval, and machine learning
- Defining and implementing the source of truth for our most fundamental data, such as search activity and content, as well as our core metrics across a variety of products
- Optimizing end-to-end workflows of data users (crafting libraries, providing abstractions to define jobs, scheduling data pipelines, managing access to datasets, etc.)
- Building internal services and tools to help in-house partners implement, deploy, and analyze datasets with a high level of autonomy and limited friction
- Surfacing datasets in near-real-time to mission-critical products and business applications throughout the company, providing the signal that feeds our machine learning algorithms as well as our daily product-defining decisions
- Automating and handling the lifecycle of datasets (schema evolution, metadata store, backfill management, deprecation, migration)
- Improving the quality and reliability of our pipelines (monitoring, retry, failure detection)