随着组织正在转移到数据驱动的文化,他们也正在寻找新的方法来利用数据来进行更智能的决策和更好的业务成果。许多公司的主要重点是人工智能(AI)和machine learning(ML)以及这些技术如何解锁其数据深处的见解。在高峰期,AI和ML提供了预测智能,以优化操作并根据实时趋势调整策略。

Download Now: Free HTML & CSS Hacks

但是,AI和ML并不简单地发生。他们需要采用仔细和测量的方法来实现为预测分析提供动力并有效地将其推向组织所需的算法。

为了启动这一过程,组织正在转向进行战斗的软件开发方法,bob电竞官方下载DevOps,重组其模型创建的人工智能nd ML applications. The result is commonly referred to as MLOps. This post will cover the MLOps lifecycle, how it compares to DevOps, and the best practices and challenges for teams implementing MLOps.

MLOps vs. DevOps

DevOps是agile approachto development that combines the development and operations teams into one unit: the DevOps team. Where before the development team would hand off the application for the operations team to run, nowengineers从这两个学科起见,从软件计划和创建到部署和操作都可以顺利进行。bob电竞官方下载

MLOPS与DevOps图,在正常DevOps生命周期中未看到机器学习实验阶段

Image Source

使用MLOP,基本的工作流程和目标是相同的。但是,对机器学习项目的重视确实引入了新的要求和细微差别,而DevOps对一般软件应用程序的关注不纳入。bob电竞官方下载

MLOPS和DEVOPS之间的主要区别在于,MLOP为DevOps生命周期增加了一个额外的阶段。此阶段着重于机器学习要求,并涉及找到相关数据并培训这些数据集的算法以返回准确的预测。

ML DevOps diagram showing the phases of the MLOps lifecycle between machine learning, development, and operations

Image Source

否则,如果找不到合适的数据集或无法培训算法以提供所需的结果,那么继续开发和操作阶段就没有意义。

Other differences between MLOps and DevOps center on the fact that data is the main focus of the application, so data scientists take the place of software developers in MLOps. They are responsible for locating relevant data, writing the code that builds the ML model, and training the model to produce the expected results. Once the model is validated and the application is ready to deploy, it is handed off to ML engineers to launch and monitor.

In addition, version control now extends not only to the code but to the data sets used for analysis and the model's findings. All these components are required to answer any questions on how the model returned a result for auditing purposes.

Finally, monitoring the live application is important not only to ensure availability and performance like in the DevOps model. Under MLOps, engineers also need to watch for model drift, which is when new data no longer fits the model's expectations and skews results. To combat this, ML models need to be retrained regularly.

This video from Kineto Klub reviews the definition of MLOps and how it compares to DevOps:

Now that you understand the differences between DevOps and MLOps, let's examine some best practices for teams transitioning to an MLOps model.

MLOP最佳实践

The following best practices will help your team be more effective in the MLOps lifecycle.

Reusability

DevOps places a heavy emphasis on repeatable processes, and MLOps follows the same workflow. Following a common framework between projects improves consistency and helps teams move faster because they are starting from familiar ground. Project templates provide this structure while still allowing for customization to meet each use case's unique requirements.

In addition, central data management speeds the discovery and training phases of MLOps by consolidating organizational data. Common strategies for achieving this centralization includedata warehousingand thesingle source of truthapproach.

Ethical Considerations

A consistent challenge in AI and ML models is the perpetuation of bias. If ethical principles aren't applied from the outset of the project, these models can return results with the same bias as contained in the data sets they are trained on.

保持对在某些情况下如何存在偏见以及如何反映在数据中的认识,将有助于团队纠正训练和操作模型时的任何有偏见的结果。

Resource Sharing and Collaboration

高度协作的d integrated pipelines like MLOps cannot function effectively with siloes. This makes it critical to foster a culture of resource sharing and collaboration in your team. Lessons learned from each project cycle should be captured and disseminated so the entire team can adjust their strategies for the next sprint.

为了促进这种知识共享,应在Wiki或其他集中式存储库中进行标准化并使文档访问,以便当前和未来的队友可以学习团队的最佳实践。这些记录还提供了有关您组织的MLOP策略如何发展的参考。

专业角色

Successful MLOps pipelines rely heavily on data scientists and machine learning engineers to build, deploy, and operate machine learning applications. The data scientist must bring deep expertise in practical applications of data and the organization's data sets. The machine learning engineer must have both data and IT operations skills, includingsecurity和体系结构注意事项。

Given the broad skills and experience necessary, it is more effective to onboard or transition full-time employees to ensure the many duties of these roles are supported versus trying to add these responsibilities to another data professional's job description.

MLOP实施的挑战

Though MLOps offers a repeatable and efficient pathway to achieve predictive intelligence for your business, it also comes with challenges to consider when implementing an MLOps model.

1.可行性由数据决定。

数据收集和培训阶段已添加到传统的主要原因DevOps pipelineis that these are a prerequisite before building the application. You may find that the questions you are looking to answer with the ML model can't be answered with the data available to your organization. Or, the model can't be trained to return reliable results.

In either case, there is no point in moving forward when the building blocks of the ML model are not present. Always keep in mind that feasibility is dictated by the data and that not every project will reach the finish line. It's better to have fewer but more trusted models than to produce unreliable insights for your organization.

2.监测对于确保预测保持可靠性更为重要。

As discussed before, model drift is a serious concern in ML applications. Data trends can change over time, and with many organizations buildingdata pipelineswith the ability tostream datain real-time, that change can happen in a matter of seconds.

Strong monitoring strategies will help ML engineers initiate retraining to prevent model drift before predictions are too heavily skewed. Monitoring also mitigates the more traditional concerns of outages and performance loss that are the focus of the DevOps model.

3. Deep data expertise is needed to achieve the best results.

尽管两者在MLOP中都起着至关重要的作用,但数据科学家在某些方面超过了机器学习工程师。为什么?因为数据收集和模型培训的初始阶段将成为项目。

Deep data expertise goes beyond knowing data types and ML algorithms, though these are certainly important. The data scientist needs to understand the catalog of data available in the organization and which data sets are better suited to certain questions than others.

他们还将确定最佳使用模型设计以及模型如何解释数据中的不同趋势。这不仅决定了MLOPS应用程序是否可以向前推进,而且还会直接影响模型提供的洞察力的可靠性。

MLOps provides the path to advanced analytics.

As organizations look for new ways to leverage the vast amounts of data produced and collected every day, they are turning to advanced use cases like AI and ML. To achieve the predictive intelligence these technologies offer, they have repurposed their existing DevOps workflows into new MLOps models. Though this transition poses challenges, with established best practices and a focus on quality data, MLOps offers a viable pathway to produce advanced insights for your organization at scale.

New Call-to-action

CSS简介

最初发布于2022年4月1日,上午7:00:2022年4月1日更新

Topics:

开发人员操作