Ankai Liang

LinkedIn | 917-912-7709-Cell | liangankai123@gmail.com

SUMMARY

5.5 years of industry experience in large-scale data processing platforms and back-end web search architecture. Drove several launches of semi-structured search features for different verticals at Google and multiple offline search indexing projects at TikTok. Experienced in data pipeline development, back-end web development, personalized recommendation infrastructure, and quality evaluation and analysis.

EDUCATION

  • Stevens Institute of Technology, Hoboken, NJ
    Master of Science in Computer Science
    GPA: 3.90/4.00
    Dec 2017

  • Beijing University of Posts and Telecommunications, Beijing, China
    Bachelor of Engineering in Electronic Information Science
    Jun 2014

Work Experience

TikTok Inc. (May 2023 - Present)

Software Engineer, Search Back-end
Languages/Technologies: Go, Java // Flink, ETL, distributed systems, back-end web development

  • Completed the migration of the TikTok Live indexing pipeline from Go + TCE to the Java + Flink framework, reducing latency by 60.8% and output QPS by 33%.
  • Built an offline monitoring dashboard for TikTok search operations that improves visibility into index flows, streamlines traffic statistics, and simplifies issue tracking and resolution.
  • Built a batch indexing life-cycle management system that simplified the feeding process and improved the efficiency, controllability, and transparency of batch indexing tasks.

Google Inc. (Aug 2018 - Mar 2023)

Software Engineer, Events Search/What-to-X/Recipes Search
Languages/Technologies: C++, Python // Recommendation systems, semi-structured search, back-end web development, Knowledge Graphs, ETL, distributed systems, quality evaluation and analysis

  • Implemented the back-end of a personalized recommendation feature for “food” and “dish” recipe queries.
  • Developed and deployed an author book feature that displays an author's books when users search for that author, which brought an 87% increase in book item iterations.
  • Set up live experiments and query satisfaction surveys with real traffic for latency and quality evaluation. Relative to the first launch candidate, three major improvements cut the latency increase by 65%.
  • Developed and maintained an end-to-end distributed and batch-processing data pipeline for ETL processing from Web docs to Knowledge Graphs. Throughput is around 30TB/day.
  • Designed the schema for event-related metadata (source page, ticketing URL, event status, and so on) and completed the data on-boarding on Knowledge Engine.
  • Designed and implemented a monitoring system for counter fluctuations in the events data pipeline.

Marlabs Inc. (Feb 2018 - Jul 2018)

Software Development Engineer, Trainee
Languages/Technologies: Scala // Spark, MapReduce, Hive, Kafka, Hortonworks
Big Data Engineering. Used computation frameworks and applications such as Spark, Hadoop MapReduce, Hive, and Kafka to process data and build data pipelines within the Hortonworks Data Platform.

Academic Projects

Open Source Project Contribution | Alluxio

April 2017

  • Contributed to the open source project Alluxio in a collaborative development environment on GitHub. Alluxio is a virtual distributed storage system that enables any application to interact with any data from any storage system at memory speed.
  • Added sub-command functionality to provide a help message for ‘bin/alluxio fs’.
  • Completed the corresponding remote debugging and unit tests.

Stock Information Web Application

Mar 2017

  • Built a web application that displays real-time price changes for a given stock.
  • Used Kafka as the data ingestion layer, Redis as the data storage layer, and smoothie.js to visualize data.

Course Selection Helper for Stevens | Web Server and WeChat API

June 2016

  • Built a WeChat interface, using Nginx, uWSGI and Flask as the framework to run Python applications.
  • Built a web application with Node.js that manages course information. It provides dashboards reporting grade distributions, course intensity, course evaluations and course enrollments.
  • Project was highlighted on a poster board during the 2016 IEEE symposium.

June 2014

  • Completed a data mining project to report the top ten most popular sites in Beijing to a government department for tourism development.
  • Used the DBSCAN algorithm to cluster points of stay. Filtered out redundant clusters that users visited periodically, such as homes, offices and stations.
  • Generated 352 clusters from 24,876,978 GPS points of 182 users. Used the HITS algorithm to evaluate the users and clusters, then obtained the final recommendation.
  • Generated visual display and reports on results of the analysis using Baidu Map APIs.

Technology Skills

Familiar: C/C++, Java, Go, Flink; ETL, distributed systems, back-end web development, Knowledge Graphs
Base: Python, Scala; Spark, Hadoop MapReduce, Kafka

Courses

Artificial Intelligence, Distributed System & Cloud Computing, Database Management Systems, Systems Administration, Smartphone and Mobile Security, Text Mining, Knowledge Discovery & Data Mining, System Administration, Algorithms