Democratizing AI: Revolutionizing Machine Learning at LinkedIn

Table of Contents

  1. Introduction
  2. Machine Learning at LinkedIn
    • Overview of Products and AI Contribution
    • Scaling Technology and People
  3. Key Products and Their AI Components
    • People You May Know
    • Feed Product
    • Job Recommendations
    • Learning Product
    • Sales Navigator Platform
    • Recruiter Product
  4. Designing AI and Machine Learning for Products
    • Starting with Product Goals
    • Working with Data Science Analytics
    • Joining Labels with Features
    • Training and Modeling
    • Shipping Models to Production
    • Continual Product Refinement
  5. Machine Learning in Various Parts of the Stack
    • Anti-Abuse and Fraud Detection
    • Build System Optimization
  6. Challenges in Standardizing Machine Learning Components
    • Complexity and Second System Syndrome
    • Multiple Modeling Technologies
    • Mission-Critical Services
    • Democratization of Machine Learning
  7. Need for Extensibility in Machine Learning Infrastructure
  8. Balancing Speed, Quality, and Cost
  9. Goals of the Productive ML Project
  10. Feature Marketplace and Assurance
  11. Organizing the Project with a Team of Teams Model
  12. Distributed and Global Team Collaboration
  13. The Model as a Directed Acyclic Graph (DAG)
  14. Feature Engineering and Exploration
  15. Algorithms and Modeling Approaches
    • Deep Learning with TensorFlow
    • Tree Ensembles with XGBoost
    • Generalized Linear Mixed Models
  16. Building and Training Models with Quasar
  17. Online Model Serving and Infrastructure
  18. Keeping Models Healthy with Continuous Monitoring
  19. Training Models and Deployment Cycles
  20. Scaling Up Machine Learning Skills at LinkedIn
    • AI Academy and Training Managers
    • Avoiding the Support Trap
  21. Conclusion

👉 Machine Learning at LinkedIn: Rethinking the Future

LinkedIn has emerged as a leading platform for professionals across the globe, connecting people, fostering meaningful relationships, and providing a wealth of opportunities. To further enhance its services, LinkedIn is engaging in a complete rethinking of its machine learning approach. This article will delve into the journey that led to this transformation, the key products offered by LinkedIn, and how artificial intelligence (AI) is revolutionizing the platform.

Introduction

In today's digital age, machine learning is increasingly shaping the way businesses operate and deliver value to their customers. LinkedIn, the world's largest professional network, is no exception. With a user base of over 500 million members and more than 10 million jobs available, LinkedIn has harnessed the power of AI and machine learning to provide a wide range of products and services that cater to the needs and aspirations of its users.

In this article, we will explore LinkedIn's machine learning journey, the products powered by AI, the challenges faced, and the strategies employed to scale technology and people to meet ever-growing demand. Join us as we uncover the inner workings of LinkedIn's machine learning infrastructure and how it drives the success of the platform.

Machine Learning at LinkedIn

Overview of Products and AI Contribution

LinkedIn boasts a wide range of products, all powered by AI and machine learning. These products provide valuable insights and connections, offering users a comprehensive view into what LinkedIn calls "the economic graph." This graph captures the relationships between professionals, their connections, and beyond, signifying the vast interconnectedness of the professional world.

One such product is "People You May Know," which addresses the challenge of finding missing connections within the economic graph. By suggesting potential connections based on various factors, this product enhances networking capabilities and fosters valuable relationships.

LinkedIn's "Feed" product delivers personalized information and updates from users' connections, ensuring they are always up to date with the latest articles, updates, and even job opportunities. This real-time feed brings value to users' professional lives, enabling them to stay informed and seize opportunities that align with their goals.

Another key product, "Job Recommendations," helps users find relevant job opportunities. Leveraging AI and machine learning, this product recommends jobs that match users' skills and preferences, bridging the gap between job seekers and their dream roles. This recommendation engine factors in specialized courses from LinkedIn Learning, addressing any skills gaps for users looking to make a career transition.

For sales professionals, LinkedIn's "Sales Navigator" platform offers tools to identify influencers within companies and connect with decision-makers. This invaluable resource streamlines the sales process and helps drive revenue growth by enabling meaningful, targeted interactions.

Similarly, the "Recruiter" product empowers companies to find the perfect talent that will drive their mission forward. By leveraging AI and machine learning, LinkedIn helps businesses identify the ideal candidates for their roles, ultimately shaping their success.

Scaling Technology and People

At LinkedIn, AI and machine learning are at the core of product development and strategy. The process of designing AI and machine learning models begins with a focus on product goals. LinkedIn collaborates closely with product managers and customers to identify needs, explore budgets and timelines, and determine relevant metrics of success.

This collaboration extends to the data science and analytics teams, ensuring that the product metrics align with the relevance metrics. To refine these metrics, LinkedIn identifies concrete, measurable signals that serve as labels during the machine learning stage. These labels are then joined with features derived from the economic graph, and the joined data feeds the training and modeling stage.

Once the models are trained, they are deployed to production. Rather than a haphazard deployment, LinkedIn ensures the models undergo extensive A/B testing to evaluate their performance against the identified product and relevance metrics. This continuous cycle of improvement and refinement is crucial to stay ahead in an ever-evolving landscape.
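To make the A/B evaluation step concrete, here is a minimal sketch of one common way to compare a candidate model against a control: a two-proportion z-test on a conversion-style metric. The counts and the function name are hypothetical, and the article does not say which statistical test LinkedIn actually uses.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: control model (A) vs. candidate model (B).
z, p = two_proportion_z_test(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the two arms is unlikely to be noise; in practice the decision would also weigh the product metrics agreed on earlier.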

Machine learning is not limited to specific products at LinkedIn; it permeates the entire stack. User safety and trust are paramount, with anti-abuse, spam detection, and fraudulent account detection powered by machine learning algorithms. In addition, even the build system benefits from AI, enabling efficient deployment and troubleshooting.

Key Products and Their AI Components

LinkedIn's suite of products offers invaluable features, all made possible through AI and machine learning. Let's explore some of these products and how they contribute to users' professional success.

People You May Know 👥

The "People You May Know" feature addresses the challenge of finding connections that users may have missed within the vast LinkedIn network. By leveraging AI algorithms and machine learning models, LinkedIn suggests potential connections based on mutual connections, similar job roles, and other relevant factors.

This product not only enhances networking but also uncovers hidden opportunities for professional growth. By expanding users' networks and introducing them to valuable contacts, LinkedIn aims to bring mutual value to professionals at all levels.

Feed Product 📰

LinkedIn's "Feed" product provides users with personalized updates, articles, and job opportunities from their connections and industries of interest. By leveraging AI recommendations, the feed ensures users stay informed about the latest industry trends, advancements, and opportunities.

The AI algorithms behind the Feed product curate content based on users' professional interests, providing a tailored experience that adds value to their professional lives. With the availability of real-time updates, LinkedIn users are always one step ahead, enabling them to make informed decisions and seize opportunities as they arise.

Job Recommendations 📝

LinkedIn's "Job Recommendations" product revolutionizes the job search process by leveraging AI and machine learning algorithms. With millions of job postings, finding the perfect role can be a daunting task. LinkedIn's job recommendation system helps bridge the gap between job seekers and their dream positions by matching their skills and preferences with suitable job opportunities.

The AI-powered recommendation engine analyzes various factors, including users' profiles, skills, and past experiences, to provide personalized job suggestions. By utilizing AI, LinkedIn aims to facilitate the job search process and connect professionals with opportunities that align with their aspirations.

Learning Product 🎓

As a leading online learning platform, LinkedIn Learning offers a vast library of courses to enhance professional skills and knowledge. Leveraging AI, LinkedIn's learning product identifies knowledge gaps and provides personalized course recommendations to help professionals upskill and stay relevant in a rapidly evolving job market.

By assessing users' profiles, skills, and industry trends, LinkedIn Learning recommends courses tailored to their specific needs. This personalized approach enables professionals to acquire the skills necessary for their desired career paths and achieve their professional aspirations.

Sales Navigator Platform 💼

In the business-to-business (B2B) space, LinkedIn's "Sales Navigator" platform empowers sales professionals by identifying key influencers within organizations and facilitating meaningful connections. By leveraging AI and machine learning, the Sales Navigator platform streamlines the sales process and helps salespeople navigate complex organizations more effectively.

Through AI-powered algorithms, the platform identifies decision-makers, influencers, and potential leads, enabling sales professionals to build relationships and drive revenue growth. By removing the guesswork from sales prospecting, LinkedIn's Sales Navigator empowers sales teams to focus their efforts on high-value opportunities.

Recruiter Product 💼

LinkedIn's "Recruiter" product caters to companies seeking top talent to drive their mission forward. By leveraging AI and machine learning, LinkedIn's Recruiter product revolutionizes the recruitment process, connecting businesses with the most suitable candidates for their open positions.

Powered by intelligent algorithms, Recruiter recommends potential candidates based on their skills, experiences, and compatibility with the company's needs. By leveraging LinkedIn's extensive database of professional profiles, companies can identify and attract the right talent to achieve their business objectives.

Designing AI and Machine Learning for Products

To achieve the success of its machine learning products, LinkedIn adopts a methodology that starts with the product goals. By working closely with product managers, customers, and members of the LinkedIn community, the company identifies the needs, sets the budget, and determines the timeline required for success.

Collaboration with data science and analytics teams is essential during this stage to refine the product metrics to ensure they align with the product goals and relevance. Concrete, measurable signals are selected as labels when moving into the machine learning stage. These labels are then combined with features drawn from the economic graph, forming the foundation for model training.
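The label-feature join described above can be sketched in a few lines. This is only an illustration: the record layout, the feature names, and the choice of an inner join are assumptions, not details from the article.

```python
# Hypothetical labels: (member_id, job_id, clicked) observed in the product.
labels = [
    ("m1", "j1", 1),
    ("m1", "j2", 0),
    ("m2", "j1", 1),
    ("m3", "j9", 0),  # no member features exist for m3
]

# Hypothetical features keyed by entity id.
member_features = {"m1": {"skills_count": 12}, "m2": {"skills_count": 5}}
job_features = {
    "j1": {"seniority": 3},
    "j2": {"seniority": 1},
    "j9": {"seniority": 2},
}

def join_labels_with_features(labels, member_features, job_features):
    """Inner-join each label with its member and job features."""
    rows = []
    for member_id, job_id, label in labels:
        member = member_features.get(member_id)
        job = job_features.get(job_id)
        if member is None or job is None:
            continue  # drop labels that have no matching features
        rows.append({**member, **job, "label": label})
    return rows

training_rows = join_labels_with_features(labels, member_features, job_features)
print(training_rows)
```

Each resulting row pairs one observed outcome with the features available for it, which is the shape a training step typically consumes.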

LinkedIn places great emphasis on combining the insights and expertise of product managers, data scientists, and engineers to ensure the success of its machine learning models. By taking a holistic approach, from ideation to deployment, LinkedIn guarantees that its products provide immense value to users while meeting the company's business goals.

Throughout the design and modeling process, LinkedIn's engineers prioritize speed, efficiency, and iteration. By providing feature marketplaces, experimenting with various algorithms, and continuously refining models, LinkedIn ensures that its machine learning infrastructure is adaptable, scalable, and capable of driving innovation in the professional space.

In the following sections, we will explore the various components of LinkedIn's machine learning infrastructure, highlighting the role they play in delivering cutting-edge products and maintaining a competitive advantage in the ever-evolving technology landscape.

Key Products and Their AI Components

People You May Know 👥

LinkedIn's "People You May Know" builds connections by recommending potential contacts based on mutual connections, similar job roles, and other relevant factors. By leveraging AI algorithms and machine learning models, LinkedIn enhances networking capabilities and enables users to forge valuable professional relationships.

The AI components behind this product analyze vast networks of relationships within the LinkedIn platform to suggest meaningful connections. This feature addresses the challenge of discovering professionals who may not be directly connected, unlocking new opportunities and fostering collaboration.

Feed Product 📰

LinkedIn's "Feed" product delivers personalized content, including articles, updates, and even job opportunities, from a user's connections and industries of interest. By leveraging AI and machine learning algorithms, the feed curates relevant information tailored to each user's professional interests and career trajectory.

The AI components responsible for the feed analyze a user's profile, connections, and industry-related data to identify content that adds value to their professional lives. By staying up to date with the latest industry trends, insights, and opportunities, LinkedIn users can make informed decisions and stay ahead in their respective fields.

Job Recommendations 📝

LinkedIn's "Job Recommendations" product revolutionizes the job search process by leveraging AI and machine learning algorithms. With millions of job postings on the platform, finding the perfect job can be overwhelming. LinkedIn's recommendation engine addresses this challenge by matching job seekers with relevant opportunities based on their skills, experiences, and preferences.

By continuously analyzing user profiles, employment history, and emerging job trends, LinkedIn's algorithms provide personalized job suggestions. This powerful tool empowers job seekers, connecting them with relevant opportunities that align with their career goals.

Learning Product 🎓

LinkedIn is renowned for its extensive online learning platform, LinkedIn Learning. Leveraging AI, this product identifies knowledge gaps and provides personalized course recommendations to help professionals develop their skills and stay competitive in the job market.

By analyzing users' profiles, skill sets, and industry trends, LinkedIn Learning recommends courses tailored to their specific needs. This personalized approach ensures that professionals can continuously upskill themselves in the areas most relevant to their career paths.

Sales Navigator Platform 💼

LinkedIn's "Sales Navigator" platform revolutionizes the way sales professionals approach their markets. By leveraging AI and machine learning, this platform identifies key decision-makers and influencers within organizations, enabling sales teams to navigate complex business landscapes more effectively.

The AI components analyze users' connections, job roles, and past interactions to identify potential leads and decision-makers. By streamlining the process of business-to-business (B2B) sales, LinkedIn's Sales Navigator enhances sales teams' productivity, ensuring they focus on high-value opportunities.

Recruiter Product 💼

LinkedIn's "Recruiter" product helps companies identify and attract top talent to drive their missions forward. By utilizing AI and machine learning algorithms, this product connects hiring companies with the most suitable candidates based on their skill sets, experiences, and potential.

The AI components behind LinkedIn's Recruiter product analyze vast databases of professional profiles to recommend ideal candidates for open positions. This powerful tool empowers recruitment teams to source and engage talent with precision and efficiency.

Designing AI and Machine Learning for Products

LinkedIn understands the significance of designing AI and machine learning models that align with product goals and provide exceptional user experiences. The design process begins by collaborating with product managers, customers, and members of the LinkedIn community to identify needs, set budgets, and determine relevant metrics of success.

Working closely with data science and analytics teams, LinkedIn refines the product metrics to ensure they accurately measure relevance and success. By selecting concrete and measurable signals, LinkedIn creates labeling criteria for the machine learning stage. These labels are combined with features drawn from the economic graph, forming a strong foundation for model training.

LinkedIn's approach focuses on coordination between product managers, data scientists, and engineers to guarantee the success of its machine learning models. Continuous collaboration and iteration contribute to the excellence of the end product, ensuring maximum value for users and aligned business goals for LinkedIn.

Throughout the model design and training stage, LinkedIn emphasizes the importance of speed, efficiency, and iterative development. By providing feature marketplaces, experimenting with diverse algorithms, and continuously refining models, LinkedIn's machine learning infrastructure remains adaptable, scalable, and capable of driving innovation in the professional world.

In the next sections, we will explore the different components of LinkedIn's machine learning infrastructure, unveiling their critical roles in delivering cutting-edge products and maintaining a competitive edge in a rapidly evolving technological landscape.

Challenges in Standardizing Machine Learning Components

LinkedIn's journey towards overhauling its machine learning stack has faced various challenges. One such challenge is the complexity of standardizing machine learning components across a diverse range of products and teams. In the past, standardization efforts often fell short, failing to reduce complexity and instead adding to the existing stack.

LinkedIn's machine learning landscape encompasses an array of modeling technologies, ranging from simple logistic regression to complex tree ensembles and deep learning models. Each modeling approach is tailored to address specific use cases and delivers unique functionalities. Maintaining cohesion and ensuring seamless integration across these diverse technologies has been a key challenge.

Furthermore, LinkedIn's machine learning products are mission-critical, meaning any disruption could have significant ramifications for the company and its users. Ensuring the reliability, robustness, and effectiveness of these products has required meticulous attention to detail.

As machine learning becomes more widely accessible, LinkedIn recognizes the importance of embracing this trend and fostering a culture of collaboration and skill development across its engineering and product teams. The company understands that AI and machine learning skills are no longer confined to a select few but are now present in various teams, including engineering, product management, and user interface design.

This influx of machine learning skills throughout the organization demands a concerted effort to reduce friction, complexity, and the need for teams to build their own models and infrastructure from scratch. To address this challenge, LinkedIn is building an extensible machine learning architecture that allows teams to seamlessly integrate new technologies and techniques without unnecessary technical debt.

Additionally, the rapid advancement of machine learning approaches and the proliferation of new tooling present a challenge for LinkedIn. The company recognizes that choosing a single stack or technology would be risky, as the industry is in a state of flux. Rather than prematurely committing to a single technology, LinkedIn aims to remain flexible, enabling teams to leverage a variety of tools to solve different problems.

By navigating these challenges and embracing a culture of continuous learning and improvement, LinkedIn is poised to advance the state of the art in machine learning and drive its products to new heights.

Need for Extensibility in Machine Learning Infrastructure

LinkedIn's machine learning initiatives encompass a vast and diverse set of products and features, each with unique requirements. As LinkedIn's infrastructure evolves, the company recognizes the need for extensibility to accommodate not only the core infrastructure and machine learning teams but also the end users, including UI teams.

LinkedIn aims to provide a platform that allows end users to easily integrate their own models and technological advancements within the existing architecture. This approach enables teams throughout the organization to leverage shared infrastructure and components, reducing complexities and technical debt.

The extensibility of LinkedIn's machine learning infrastructure empowers engineers to solve their own problems and iterate quickly. By removing barriers and enabling teams to utilize existing technologies, LinkedIn seeks to foster an environment of innovation, all while maintaining stability and scalability.

This extensibility also recognizes the fast-paced nature of the machine learning landscape. With advancements in algorithms, frameworks, and tooling happening daily, LinkedIn understands the importance of adapting to these changes. Rather than locking into a single technology stack, LinkedIn embraces a future where different technologies coexist, each offering unique value for specific models and problems.

By prioritizing extensibility and flexibility in its machine learning infrastructure, LinkedIn empowers its engineering teams to fully utilize their AI and machine learning skills, fostering a culture of innovation and collaboration.

Balancing Speed, Quality, and Cost

In software engineering, the classic trade-off is known as the "iron triangle" or the "project management triangle." This triangle consists of three key factors: speed, quality, and cost. The traditional belief states that you can only choose two out of the three, with the third being sacrificed. However, LinkedIn has devised a strategy to achieve a balance between all three factors.

Historically, LinkedIn has favored fast-paced development and low costs, thus delivering products quickly. While this approach brought immediate value, it also contributed to the accumulation of technical debt. To break free from this cycle and reduce complexity, LinkedIn now prioritizes the quality of its product offerings.

By investing in extensive testing, code reviews, and adopting industry best practices, LinkedIn ensures that its machine learning infrastructure is robust, reliable, and future-proof. This shift in focus from fast and cheap to high-quality builds a strong foundation for scalable growth while minimizing long-term costs.

LinkedIn's goal is to more than double model efficiency and productivity. By improving iteration speed and gathering early feedback, LinkedIn can drive improvements in models, product metrics, and relevance. This iterative process, coupled with continuous monitoring and refinement, guarantees the success of LinkedIn's machine learning infrastructure.

Through comprehensive training, thoughtful resource allocation, and strategic planning, LinkedIn strikes a balance between speed, quality, and cost. This approach enables continuous growth and innovation while providing reliable and valuable products to its users.

Goals of the Productive ML Project

LinkedIn's Productive ML project aims to overhaul the existing machine learning stack and drive transformative change within the organization. The key goals of this initiative include:

  1. Greater Efficiency: LinkedIn seeks to more than double model efficiency and productivity. By optimizing the machine learning pipeline, reducing complexity, and improving iteration speed, LinkedIn aims to enhance the efficiency of its engineering teams.

  2. Democratization of Machine Learning: With the increasing prevalence of machine learning skills across various teams, LinkedIn recognizes the need to enable engineers to solve their own problems. By providing a user-friendly and extensible machine learning infrastructure, LinkedIn fosters collaboration and empowers teams to leverage AI and machine learning techniques effectively.

  3. Extensibility and Adaptability: LinkedIn acknowledges the rapid advancement of machine learning approaches and tooling. The company aims to embrace this evolution by building an architecture that allows for the seamless integration of new technologies and techniques. This extensibility ensures that teams can leverage the most effective tools and frameworks to address specific problems.

  4. Reduction of Technical Debt: Prioritizing quality and code stability, LinkedIn aims to minimize technical debt. By investing in rigorous testing, automated workflows, and continuous refactoring, LinkedIn ensures that its machine learning infrastructure maintains a solid foundation for long-term success.

  5. Collaboration and Cross-Functional Teams: LinkedIn embraces the team of teams model, involving engineers and professionals from various departments and locations. By fostering a culture of collaboration, LinkedIn taps into diverse perspectives, promotes knowledge sharing, and drives innovation across the organization.

By achieving these goals, LinkedIn aims to revolutionize its machine learning infrastructure, deliver exceptional products and services, and provide users with an unparalleled professional experience.

Feature Marketplace and Assurance

LinkedIn places a strong emphasis on the feature marketplace, providing engineers with a platform to easily share, discover, and explore features. This marketplace enhances productivity and accelerates feature engineering, enabling engineers to focus on solving specific problems rather than reinventing the wheel.

By decoupling the data and semantics of features, LinkedIn streamlines the feature-sharing process. Engineers can simply define the name and providers of the feature, eliminating the need for duplicative efforts and promoting collaboration and knowledge sharing.

To ensure the feature marketplace's effectiveness, LinkedIn emphasizes health assurance and continuous monitoring. Engineers are provided with tools to ascertain online feature consistency, validate offline and online feature versions, and ensure proper feature availability during the deployment process. This robust assurance process guarantees that features are reliable, up to date, and consistently available to the modelers using them.
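As a rough illustration of the two ideas above (a feature defined by its name and providers, and an offline/online consistency check), consider the following sketch. The `Feature` class, the provider callables, and the in-memory stores are hypothetical stand-ins, not LinkedIn's actual interfaces.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass(frozen=True)
class Feature:
    """A feature is identified by name; providers say where its values come from."""
    name: str
    offline_provider: Callable[[str], float]
    online_provider: Callable[[str], float]

def find_inconsistencies(
    feature: Feature, entity_ids: Iterable[str], tolerance: float = 1e-6
) -> List[str]:
    """Return the entity ids whose offline and online values disagree."""
    mismatched = []
    for entity_id in entity_ids:
        offline = feature.offline_provider(entity_id)
        online = feature.online_provider(entity_id)
        if not math.isclose(offline, online, abs_tol=tolerance):
            mismatched.append(entity_id)
    return mismatched

# Hypothetical stores standing in for an offline dataset and an online service.
offline_store = {"m1": 12.0, "m2": 5.0}
online_store = {"m1": 12.0, "m2": 6.0}

skills_count = Feature(
    name="member.skills_count",
    offline_provider=offline_store.__getitem__,
    online_provider=online_store.__getitem__,
)
print(find_inconsistencies(skills_count, ["m1", "m2"]))
```

Because modelers refer to the feature only by name, the providers behind it can change without touching model code, and a check like this can run before a model that depends on the feature is deployed.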

LinkedIn's feature marketplace, coupled with health assurance mechanisms, empowers engineers to benefit from shared insights, collaborate efficiently, and drive innovation while maintaining high standards of reliability and consistency.

Organizing the Project with a Team of Teams Model

LinkedIn employs a team of teams model to organize and manage the Productive ML project effectively. This model ensures collaboration, fosters innovation, and aligns efforts across diverse teams and regions.

The team of teams model consists of various engineering teams, each responsible for a specific layer or pillar within the machine learning infrastructure. These teams are led by dedicated engineers, who work collaboratively across organizational boundaries.

The leadership team, comprising experts and visionaries like Joel Young and Beau Long, provides direction and sets the project's vision. Their role is to identify areas of potential friction, facilitate collaboration, and ensure that the project aligns with overall business objectives.

LinkedIn embraces a distributed model, involving active participation from teams across the globe, including Bangalore, Europe, and multiple teams within the United States. This diverse participation fosters a global perspective, harnesses unique expertise, and accelerates innovation.

With the team of teams model, LinkedIn creates an environment where experts from various disciplines can work seamlessly, leveraging their skills and insights to propel the Productive ML project forward. Techniques like stochastic gradient descent optimization ensure that teams stay on track and make informed decisions that benefit the overall project.

The Model as a Directed Acyclic Graph (DAG)

LinkedIn's machine learning models are represented as directed acyclic graphs (DAGs). These DAGs capture the flow of information and transformations within the model, ensuring a clear and structured representation.

The model consists of input features and transformations applied to these features. These transformations can range from simple operations like converting categorical features to more complex operations involving deep embeddings and machine learning models. Some transformations are trainable, allowing the model to learn unknowns and improve its performance.

A notional DAG representation of LinkedIn's job recommendation model reveals its complexity. The model incorporates various input features, including raw member profile text, standardized features, and deep embeddings. These features interact with each other and pass through a variety of machine learning models, including tree ensembles and random forests. The final result is a reliable recommendation model that connects job seekers with their dream roles.

Engineers at LinkedIn build models using the DAG representation, utilizing tools like IntelliJ for DSLs. This approach enhances speed, productivity, and collaboration, ensuring that engineers can focus on their domain expertise and deliver high-quality models.
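A toy version of a model expressed as a DAG might look like the sketch below. Node names, transformations, and weights are invented for illustration, and Python's standard `graphlib` module stands in for whatever DSL and tooling LinkedIn uses internally.

```python
from graphlib import TopologicalSorter

def one_hot(title):
    """Convert a categorical feature into a one-hot vector."""
    vocabulary = ["engineer", "designer"]
    return [1.0 if title == term else 0.0 for term in vocabulary]

def concat(left, right):
    """Concatenate two feature vectors."""
    return left + right

def linear_score(vector):
    """A trainable transformation; these weights are made up, not learned."""
    weights = [0.5, -0.2, 0.1]
    return sum(w * v for w, v in zip(weights, vector))

# Each node maps to (transformation, names of the nodes it depends on).
dag = {
    "title_one_hot": (one_hot, ["title"]),
    "features": (concat, ["title_one_hot", "years_experience"]),
    "score": (linear_score, ["features"]),
}

def evaluate(dag, inputs):
    """Evaluate every node after all of its dependencies."""
    graph = {node: deps for node, (_, deps) in dag.items()}
    values = dict(inputs)
    for node in TopologicalSorter(graph).static_order():
        if node in values:
            continue  # raw input feature, already provided
        transform, deps = dag[node]
        values[node] = transform(*(values[dep] for dep in deps))
    return values

result = evaluate(dag, {"title": "engineer", "years_experience": [4.0]})
print(result["score"])
```

Representing the model this way keeps each transformation small and independently testable, while the topological order guarantees that every node sees fully computed inputs.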

Feature Engineering and Exploration

LinkedIn's machine learning infrastructure places utmost importance on feature engineering. With vast amounts of data available, feature engineering plays a critical role in extracting meaningful insights and driving effective models.

LinkedIn's engineers leverage various tools and techniques for feature engineering and exploration. The Jupyter notebook provides an interactive environment where engineers can analyze data and visualize results using popular Python libraries like Matplotlib. This exploratory phase enables engineers to gain insights and validate hypotheses before proceeding with model development.

In addition to the Jupyter notebook, engineers have access to powerful tools for feature selection, validation, and monitoring. These tools ensure that engineers can efficiently explore data, select relevant features, and validate their impact on model performance. Continuous monitoring guarantees that offline and online versions of features remain consistent, providing assurance and eliminating potential discrepancies during the model deployment process.

Through feature engineering and exploration, LinkedIn enables engineers to leverage the power of its machine learning infrastructure to unlock valuable insights, drive innovation, and develop robust models that meet the needs of users.

Algorithms and Modeling Approaches

LinkedIn employs a wide range of algorithms and modeling approaches to solve complex problems and provide the best possible experiences for its users. The following approaches are commonly used within LinkedIn's machine learning stack:

Deep Learning with TensorFlow

LinkedIn uses deep learning to tackle challenging problems. It employs TensorFlow, a widely used deep learning framework, to train and deploy models built on multi-layer neural networks.

TensorFlow enables LinkedIn's deep learning models to generate powerful features, such as deep embeddings, for user profiles and job descriptions. By incorporating advanced techniques like transfer learning, LinkedIn maximizes the predictive power of its models and enhances recommendation accuracy.

Tree Ensembles with XGBoost

To capture complex relationships and interactions within data, LinkedIn uses ensemble modeling techniques, particularly tree ensembles with XGBoost. Tree ensembles work well for regression and classification tasks, enabling LinkedIn's models to discover intricate patterns and make accurate predictions.

XGBoost is an open-source gradient boosting library. It builds an ensemble by adding decision trees one at a time, each one correcting the errors of the trees before it. LinkedIn applies these ensembles to a variety of use cases.
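The boosting idea behind tree ensembles can be shown in a few lines of plain Python. The sketch below is not XGBoost and does not use its API; it is ordinary gradient boosting with squared-error loss on a single feature, using one-split "stumps" as trees, with no regularization. The data and function names are made up for illustration, and the code assumes at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on xs that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean

def predict(model, x):
    """Sum the base value and the shrunken output of every stump."""
    base, learning_rate, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total

def fit_boosted_stumps(xs, ys, rounds=50, learning_rate=0.3):
    """Each round fits a stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    stumps = []
    model = (base, learning_rate, stumps)
    for _ in range(rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return model

# Toy data: the target jumps from about 1 to about 3 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_boosted_stumps(xs, ys)
```

For squared-error loss the residuals are the negative gradients, which is why fitting each new tree to the residuals counts as gradient boosting. Libraries such as XGBoost add regularization, second-order gradient information, and multi-feature trees on top of this basic loop.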

Generalized Linear Mixed Models

LinkedIn's recommendation systems incorporate generalized linear mixed models (GLMMs), a framework with broad applications. GLMMs are used to generate recommendations based on specific user-job interactions or user-profile characteristics.

By incorporating user-specific features and employing linear mixed models, LinkedIn can capture individual preferences and tailor recommendations accordingly. This approach provides a more personalized experience for users, helping LinkedIn drive engagement and satisfaction.
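As a rough illustration of how a mixed model personalizes a score, the sketch below adds a per-member coefficient vector to a shared global one before applying a logistic link. The feature names, weights, and the choice of a logistic link are assumptions made for the example, not details taken from LinkedIn's models.

```python
import math

def dot(weights, features):
    """Sparse dot product: features without a weight contribute zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def glmm_score(features, global_weights, per_member_weights, member_id):
    """Probability from a global (fixed) effect plus a per-member (random) effect."""
    # Members with no learned coefficients fall back to the global model alone.
    member_weights = per_member_weights.get(member_id, {})
    logit = dot(global_weights, features) + dot(member_weights, features)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients: member_42 has shown a strong preference for remote jobs.
global_weights = {"title_match": 1.2, "remote": 0.1}
per_member_weights = {"member_42": {"remote": 1.5}}
job = {"title_match": 1.0, "remote": 1.0}

p_known = glmm_score(job, global_weights, per_member_weights, "member_42")
p_new = glmm_score(job, global_weights, per_member_weights, "member_7")
```

The same remote job scores higher for `member_42` than for a member without per-member coefficients, which is the personalization effect described above.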

While these are some common algorithms and modeling approaches employed by LinkedIn, the company constantly explores new techniques and tooling to remain at the forefront of the industry.

Building and Training Models with Quasar

LinkedIn uses Quasar, a DSL (domain-specific language) framework, to build and train machine learning models. Quasar lets engineers define models much as they would write regular code, using familiar programming concepts and practices.

With Quasar, engineers can develop complex models by transforming features and applying machine learning algorithms. The framework streamlines the modeling workflow and abstracts away the complexities of data pipelines, feature extraction, and model deployment.

For modelers who prefer a notebook-based workflow, LinkedIn provides a Jupyter-based environment in addition to the Quasar DSL. This interactive environment allows engineers to explore data, experiment with models, and iterate quickly using Python tools and libraries.

By combining the Quasar DSL with the flexibility of Jupyter notebooks, LinkedIn gives engineers two complementary ways to build robust and efficient machine learning models.

Online Model Serving and Infrastructure

Once trained, machine learning models need to be deployed in a production environment to provide real-time recommendations and insights. LinkedIn has built a scalable and efficient infrastructure for serving models to users.

The serving flow begins when a user performs an action on the LinkedIn platform that triggers a REST call to the job recommendations middleware. This middleware draws on several data sources, including standardized data stores, model stores, and the frame system, to gather the information needed to generate recommendations.

The gathered data, including a user's current job role, connections, and industry details, is bundled and sent as a query to the search cluster. The broker component of the search cluster distributes the query to multiple searchers responsible for processing the incoming data.
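The broker and searcher roles follow a common scatter-gather pattern, sketched below in Python. This is an illustration only: the shard layout, the skill-overlap scoring, and the function names are hypothetical stand-ins for the real search cluster and model.

```python
import heapq

def search_shard(shard, query_skills, k):
    """Searcher: score the jobs in one shard and return its local top k."""
    scored = [(len(query_skills & job["skills"]), job["id"]) for job in shard]
    return heapq.nlargest(k, scored)

def broker(shards, query_skills, k):
    """Broker: fan the query out to every shard, then merge the partial results."""
    partial = []
    for shard in shards:
        partial.extend(search_shard(shard, query_skills, k))
    return [job_id for _, job_id in heapq.nlargest(k, partial)]

# Two hypothetical shards of the job index.
shards = [
    [{"id": "job_a", "skills": {"python", "spark"}},
     {"id": "job_b", "skills": {"java"}}],
    [{"id": "job_c", "skills": {"python", "spark", "kafka"}},
     {"id": "job_d", "skills": {"spark"}}],
]

top_jobs = broker(shards, {"python", "spark", "kafka"}, k=2)
```

Each searcher only needs to return its own top k, because no job outside a shard's local top k can appear in the global top k.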

On the searcher side, the deployed model combines per-member components with relevant features from the forward index. The global model, comprising a deep interaction model and tree models like XGBoost, performs inference using the Quasar inference engine. This engine supports optimizations such as lazy evaluation, columnar processing, and batching, which keep model execution fast and efficient.
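Lazy evaluation is the easiest of these optimizations to show in miniature. The sketch below is not the Quasar engine; `LazyFeatures`, `tiny_tree`, and the feature names are invented for the example. It computes a feature only when the model actually asks for it, so features on branches that are never visited cost nothing.

```python
class LazyFeatures:
    """Compute each feature at most once, and only if something asks for it."""

    def __init__(self, producers):
        self._producers = producers  # feature name -> zero-argument function
        self._cache = {}
        self.computed = []           # names actually computed, kept for inspection

    def __getitem__(self, name):
        if name not in self._cache:
            self._cache[name] = self._producers[name]()
            self.computed.append(name)
        return self._cache[name]

def tiny_tree(features):
    # Only the features on the visited branch are ever requested.
    if features["title_match"] > 0.5:
        return 0.9 if features["skill_overlap"] > 2 else 0.6
    return 0.2 if features["embedding_similarity"] > 0.7 else 0.1

features = LazyFeatures({
    "title_match": lambda: 0.8,
    "skill_overlap": lambda: 3,
    "embedding_similarity": lambda: 0.4,  # stands in for an expensive computation
})

score = tiny_tree(features)
```

Here `embedding_similarity` is never computed, because the tree takes the other branch.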

Finally, the results are sent back from the search cluster to the user in real time, delivering personalized recommendations and relevant content.

Keeping Models Healthy with Continuous Monitoring

Ensuring the health and reliability of machine learning models is of utmost importance to LinkedIn. The platform employs various monitoring and validation techniques to maintain the performance and accuracy of its models.

LinkedIn treats degradation of model performance as a key signal for retraining. Instead of relying on business metrics to detect performance decay, LinkedIn continuously tracks metrics that indicate model health. By monitoring performance over time, it can trigger retraining cycles proactively, keeping models accurate and up to date.
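One simple way to turn this into a retraining trigger is a rolling average compared against a baseline. The sketch below is an assumption about how such a rule could look, not a description of LinkedIn's system; the class name, the AUC values, the window size, and the tolerance are all illustrative.

```python
from collections import deque

class ModelHealthMonitor:
    """Flag a model for retraining when its rolling metric drops below baseline."""

    def __init__(self, baseline, tolerance=0.02, window=3):
        self.baseline = baseline
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)  # keeps only the last `window` values

    def record(self, metric_value):
        """Record a new evaluation; return True when retraining should be triggered."""
        self.recent.append(metric_value)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        rolling_mean = sum(self.recent) / len(self.recent)
        return rolling_mean < self.baseline - self.tolerance

# Hypothetical daily AUC values for a model whose baseline AUC was 0.80.
monitor = ModelHealthMonitor(baseline=0.80)
daily_auc = [0.80, 0.79, 0.80, 0.77, 0.76, 0.75]
decisions = [monitor.record(value) for value in daily_auc]
```

Averaging over a window keeps a single noisy day from triggering a retrain, at the cost of reacting a little later to a real decline.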

LinkedIn's online anomaly detection system plays a vital role in maintaining models' health. This system utilizes machine learning models specifically trained to detect anomalies in the model deployment infrastructure. When an anomaly is detected, the system alerts the relevant teams, promoting swift mitigation and resolution.

Additionally, LinkedIn encourages feedback from alert recipients as part of its improvement process. This iterative feedback loop fosters continuous learning, enabling LinkedIn to refine its models, improve monitoring systems, and enhance production pipelines.

By actively monitoring and validating models, LinkedIn maintains a healthy and robust machine learning infrastructure, enriching user experiences and driving successful outcomes.

Training Models and Deployment Cycles

Training machine learning models is an intricate process that requires careful planning, iterative development, and continuous evaluation. LinkedIn has established a training and deployment cycle that ensures models are robust, accurate, and aligned with users' needs.

The training process starts with data preparation, including feature generation and input optimization. LinkedIn draws on its large dataset and a range of modeling techniques to generate features that capture users' profiles, behaviors, and preferences.

To train the models, LinkedIn employs algorithms like deep learning, tree ensembles, and generalized linear mixed models. Each suits particular use cases, giving modelers flexibility in how they trade off accuracy and complexity.

Once the models are trained, they are deployed to production with the help of the Quasar framework. LinkedIn places a strong emphasis on A/B testing to scientifically validate the performance of new models against existing ones. By comparing product metrics, relevance metrics, and other key performance indicators, LinkedIn ensures that the new models positively impact users' experiences.
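A basic statistical check behind an A/B comparison is the two-proportion z-test, sketched below. The source does not say which tests LinkedIn uses, so this is only one standard option, and the conversion counts are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: control model A versus candidate model B.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=580, n_b=10_000)
```

A positive `z` with a small `p_value` suggests the candidate model's higher conversion rate is unlikely to be due to chance alone. In practice a team would also fix the sample size in advance and look at several metrics, not just one.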

The training and deployment cycles are continuous processes, allowing models to evolve and adapt to changing user needs. LinkedIn leverages user feedback, anomaly detection, and data-driven insights to drive continual improvements in model performance, relevance, and overall user satisfaction.

Through careful training, comprehensive testing, and iterative deployment, LinkedIn aims to ensure that its machine learning models deliver value to users, driving engagement and fostering professional growth.

Scaling Up Machine Learning Skills at LinkedIn

LinkedIn recognizes the importance of fostering a culture of machine learning expertise across the organization. To ensure that its engineering teams can leverage AI and machine learning techniques effectively, LinkedIn has taken initiatives to develop and scale up these skills.

The AI Academy is a comprehensive training program offered by LinkedIn to provide employees with the skills and knowledge required to excel in machine learning. The training is not limited to engineering teams; it extends to managers who oversee machine learning projects. By offering a course specifically tailored for managers, LinkedIn equips them with a deeper understanding of machine learning principles, application feasibility, and success metrics.

The AI Academy covers critical topics, including the solvability of problems, managing machine learning projects, evaluating success metrics, and conducting effective A/B tests. Through this training, LinkedIn aims to ensure that machine learning projects align with overall business goals and deliver measurable value.

LinkedIn is keenly aware of the challenges associated with integrating managers into the machine learning process. However, the company believes that managers who have the necessary knowledge and skills become enablers of machine learning initiatives rather than obstacles to them.

By scaling up machine learning skills and knowledge across the organization, LinkedIn empowers its teams to leverage AI and machine learning effectively, fostering a culture of innovation and collaboration.

Avoiding the Support Trap

LinkedIn is committed to avoiding the "support trap" that can hinder the success of machine learning projects. The support trap occurs when adoption of a new infrastructure component outpaces the support capacity of a team.

LinkedIn recognizes the importance of handling customer growth in a way that supports both product adoption and the ability to deliver value. By working with specific partners who accept some of the risks associated with early adoption, LinkedIn ensures that support capacity aligns with customer acquisition.

The benefit of this approach is that early adopters gain a competitive advantage and contribute to the success of the project. However, LinkedIn acknowledges that challenges may arise, particularly if support capacity lags behind customer acquisition. In such cases, LinkedIn must balance the need for rapid growth with the need to maintain appropriate levels of support and stability.

LinkedIn's commitment to rapid iteration, customer feedback, and efficient collaboration enables the company to proactively address and navigate the support trap. By continuously evaluating support capacity and expanding teams when necessary, LinkedIn provides a stable and reliable machine learning infrastructure that drives customer success.

Conclusion

LinkedIn's complete rethink of its machine learning approach represents a significant step forward in the quest for innovation and excellence. By leveraging AI and machine learning, LinkedIn has developed a range of products that empower professionals around the world, connecting them, enabling growth, and driving success.

The journey to overhaul LinkedIn's machine learning stack has not been without its challenges. However, through a team of teams model, an emphasis on extensibility, and a commitment to balancing speed, quality, and cost, LinkedIn has navigated these challenges and emerged with a scalable and robust infrastructure.

LinkedIn's focus on feature engineering and exploration ensures that its models remain accurate, insightful, and valuable to users. By continually monitoring and validating models, LinkedIn maintains a high level of performance, delivering accurate recommendations and up-to-date insights.

Through the Productive ML project, LinkedIn has fostered a culture of collaboration, skill development, and innovation across the organization. By scaling up machine learning skills and providing comprehensive training, LinkedIn empowers its teams to leverage AI and machine learning effectively, driving the success of its platform.

LinkedIn's machine learning journey continues, fueled by a commitment to excellence and a desire to push the boundaries of what is possible. Through continuous refinement, adoption of emerging technologies, and a dedication to supporting its customers, LinkedIn remains at the forefront of the machine learning landscape.

With a deep understanding of the importance of AI and machine learning, LinkedIn is poised to reshape the future of professional networking, epitomizing the transformative potential of technology in the professional world.

Highlights

  • LinkedIn is rethinking its machine learning approach and infrastructure to drive innovation and enhance user experiences.
  • Key products like "People You May Know" and "Feed" are powered by AI algorithms