Scale AI: Do large models still need data annotation?

We published an article about Scale AI in July 2021. Since then, the AI industry has felt as if a magnitude-10 earthquake were striking every day, and the industry value chain has changed as well, so we think it is necessary to re-examine the important companies we studied earlier. Scale AI is one of them, so we are revisiting it here.

Scale AI was founded in 2016 by Alexandr Wang and Lucy Guo; Guo has since left the company. It became a unicorn in 2019 and is currently valued at $7.3 billion, with ARR close to $300 million. Its core business is data labeling. It started with autonomous driving and then moved into government, e-commerce, robotics, large models, and other scenarios, tracking the major opportunities that have emerged in the AI industry over the past few years. Thanks to Alex's exceptional personal ability and the team's strong execution, Scale AI has been able to seize each major trend as it arrives, launch corresponding products, and quickly win a very high market share in each segment.

Scale AI is now moving aggressively into MLOps and LLMs, offering a range of tools, platforms, and services. These include Scale Catalog, an image generation tool for e-commerce; Scale Spellbook, a developer tool platform for large language models; and Scale Synthetic, a synthetic data product. According to our research, however, these emerging businesses are only attempts by Scale AI to find a second growth curve, and their sales have been underwhelming. In the end, it is the data labeling business that enjoys stable demand and contributes most of the revenue.

Besides updating the company's business situation, we also discuss several important questions: what role data annotation plays in large models, the business model of data annotation, Scale AI's corporate governance issues, and Scale AI's future development.

In addition, we think Scale AI is an excellent vantage point for observing opportunities in the AI industry. New industry trends show up in Scale AI's product line, where they are publicly visible, so its product updates are well worth watching.

Contents of this article

We recommend using the key points above to read selectively.

01 Industry

02 Products

03 Team

04 Competition

05 Current conclusions and judgments

01. Industry

Industry introduction

Data Labeling is the core business of Scale AI. An upstream stage of model development, data labeling involves identifying raw data and then adding one or more labels to that data. Data types include structured data and unstructured data, the latter including images, video, 3D (LiDAR, radar, etc.), text, and audio, among others.

Source: Scale AI official website

The core of data labeling is quality and efficiency. For the client companies, data labeling is not a core business, so they are strongly inclined to outsource it. Customers mainly get their data labeled in three ways: in-house teams, crowdsourcing platforms, and third-party data labeling startups. Besides Scale AI, players in this space include Dataloop, SuperAnnotate, Labelbox, Snorkel, V7, Appen, and others. Different departments within the same client company may choose different data labeling vendors depending on their needs and scenarios.

In the early days, all data was labeled manually to build up training data sets for machine learning models. Manual labeling is time-consuming and costly, but it does have advantages, accuracy among them. Data labeling companies often look for labelers in countries or regions with relatively low labor costs, such as the Philippines, Kenya, and Venezuela.

As machine learning models have developed, automated data labeling has become more accurate, and models can now assist manual labeling. For example, a model can preprocess the data before it is sent to labelers, or humans can act as auditors who review and correct the labels the model produces. Compared with purely manual labeling, AI-assisted labeling is faster. Data labeling companies such as Scale AI are currently working hard to reduce the share of manual work in the labeling process.
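The two AI-assisted modes described above (model pre-labels sent on to labelers, and humans auditing model output) can be sketched as a simple routing step. This is a minimal illustration, not Scale AI's actual pipeline: the `PreLabel` record, the `route` function, and the 0.9 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreLabel:
    item_id: str
    label: str          # label proposed by the model
    confidence: float   # model score in [0, 1]


def route(prelabels, audit_threshold=0.9):
    """Split model pre-labels into two queues.

    High-confidence items go to a lighter human audit; the rest go to
    full manual labeling. The threshold is an assumed tuning knob.
    """
    audit, manual = [], []
    for p in prelabels:
        (audit if p.confidence >= audit_threshold else manual).append(p)
    return audit, manual


batch = [
    PreLabel("img-1", "car", 0.97),
    PreLabel("img-2", "truck", 0.55),
    PreLabel("img-3", "pedestrian", 0.91),
]
audit_queue, manual_queue = route(batch)
print([p.item_id for p in audit_queue])   # items a human only reviews
print([p.item_id for p in manual_queue])  # items a human labels from scratch
```

Raising the threshold sends more items to manual labeling, which trades speed for accuracy. That is the same quality-versus-efficiency tension the text describes.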

These two approaches are the main forms of data labeling today. As for whether data labeling can be done entirely by models in the future, our current judgment is no. It may be possible in mature scenarios, but new scenarios will keep appearing, and new scenarios usually need data and examples accumulated through manual labeling before a model can be trained to label them automatically.

Do large models still need data labeling?

Previously, machine learning relied on supervised learning, which requires labeling large amounts of data. As models grow, so does the demand for data; the time and cost of labeling become unmanageable, and high-quality labeled data cannot be produced fast enough to meet the needs of large models. Unsupervised learning, by contrast, has no predefined training target and its results cannot be known in advance, so it needs no labeled data. Reinforcement learning does not need labeled data either: its feedback comes not from labels or target values but from a reward mechanism, through which the model learns a sequence of behaviors.

Pre-trained models made the leap from supervised to unsupervised learning, and OpenAI's GPT-1 through GPT-3 followed this route. As a result, many people have worried in recent years about the value of data annotation in the era of large models. The arrival of ChatGPT eased this concern. ChatGPT uses reinforcement learning from human feedback (RLHF) to align the model more closely with human instructions, and RLHF involves a great deal of data annotation work.

Data labeling for RLHF also differs from the earlier, simpler labeling work done by low-cost labor. It requires highly skilled people to write entries and to give high-quality answers, consistent with human logic and expression, to the corresponding questions and instructions. OpenAI is said to have recruited dozens of PhDs for RLHF labeling. Scale, as an upstream supplier to OpenAI, also recruited dozens of PhDs to provide this service. In the division of labor, Scale does more of the labeling while OpenAI focuses more on quality checks. Annotated data is one of the reasons ChatGPT performs differently from its competitors. A technical expert at Google also said that since ChatGPT came out, Google has been rethinking the issue of data labeling.
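To make the annotation work concrete, here is a hedged sketch of the two kinds of records that public descriptions of RLHF commonly mention: answers written by annotators, and annotator rankings of model outputs. The class and field names are hypothetical and do not reflect OpenAI's or Scale's actual schema.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Demonstration:
    prompt: str
    answer: str  # high-quality answer written by an expert annotator


@dataclass
class Ranking:
    prompt: str
    responses_best_first: list  # model outputs, ordered by the annotator


def to_preference_pairs(ranking):
    """Expand one ranking into (prompt, preferred, rejected) triples.

    combinations() keeps input order, so the first element of each pair
    is always the response the annotator ranked higher.
    """
    return [
        (ranking.prompt, better, worse)
        for better, worse in combinations(ranking.responses_best_first, 2)
    ]


demo = Demonstration("Explain LiDAR briefly.", "LiDAR measures distance with laser pulses.")
ranking = Ranking("Explain LiDAR briefly.", ["answer A", "answer B", "answer C"])
pairs = to_preference_pairs(ranking)
print(len(pairs))
```

One ranking of k responses yields k(k-1)/2 comparison pairs, which is one reason a single careful ranking can be worth more than one simple label.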

02. Products

Product Update

Scale AI's core business is data annotation, and it also has a very rich product line. Its products fall into four categories: data annotation (Annotate), management and evaluation (Manage & Evaluate), automation (Automate), and synthesis (Generate).

Scale started with labeling for autonomous driving and performed well in industries such as autonomous driving and maps. Two years ago, 80-90% of the company's orders came from autonomous driving (2D, 3D, lidar, etc.), and that share has declined in recent years. The development and sales of Scale AI's labeling products are closely tied to underlying industry trends and to how individual industries evolve. After autonomous driving, Scale's data labeling orders have also come from government, e-commerce (retail catalogs), robotics, large models (RLHF), and other fields, matching the major trends and opportunities in the AI industry over the past few years. As each major trend approaches, Scale picks up the signal early, quickly recruits the right talent, launches corresponding products, and quickly wins a very high market share in the segment.

In addition to data annotation, noteworthy products include: Scale Catalog, Scale Spellbook, Scale Synthetic.

• Scale Catalog is aimed mainly at e-commerce and retail companies. Besides providing labeling services, it can automatically generate product images. It is a core product in Scale's move into Generative AI applications.

• Scale Spellbook is a business Scale has recently invested in heavily. It brings together Scale's core talent to build a tool platform for developers working with large language models.

• Scale Synthetic is a tool for synthesizing data. As model parameter counts keep growing and modalities keep multiplying, data volume requirements keep rising. Real data alone can no longer meet the demand, and synthetic data has begun to attract attention.

Judging by its product expansion, Scale is moving very aggressively into MLOps and LLMs, offering a range of tools, platforms, and services. These are, however, only attempts to find a second growth curve, and product sales have been underwhelming. In the end, it is data labeling that enjoys stable demand and contributes most of the revenue.

Customers and Business Models

Scale's labeling workers are recruited mainly from countries with relatively low wages, such as Venezuela, Kenya, and the Philippines, while its customers are mainly American enterprises. The business model resembles global labor arbitrage, with high gross margins.

Source: Scale AI official website

The list of major customers is as follows:

In terms of business model, Scale's official website lists standardized pricing for each product, using a consumption-based model. For example, Scale Image starts at 2 cents per image and 6 cents per annotation; Scale Video starts at 13 cents per video frame and 3 cents per annotation; Scale Text starts at 5 cents per task and 3 cents per annotation; and Scale Document AI starts at 2 cents per task and 7 cents per annotation. There is also enterprise pricing, where charges are based on the data volume and services of each enterprise-level project.
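As a rough illustration of how consumption-based pricing adds up, the sketch below estimates a starting cost from the list prices quoted above. It assumes the second figure in each pair is a per-annotation charge added on top of the per-unit charge. The table name, function name, and example volumes are hypothetical. Because these are starting prices, the result is a lower bound.

```python
# Starting list prices in US cents: (per unit, per annotation).
# The "per annotation" reading of the second figure is an assumption.
STARTING_PRICES_CENTS = {
    "image": (2, 6),      # per image
    "video": (13, 3),     # per video frame
    "text": (5, 3),       # per task
    "document": (2, 7),   # per task
}


def estimate_cost_cents(product, units, annotations_per_unit):
    """Lower-bound cost: each unit pays its base price plus its annotations."""
    per_unit, per_annotation = STARTING_PRICES_CENTS[product]
    return units * (per_unit + per_annotation * annotations_per_unit)


# Hypothetical job: 10,000 images with 5 annotations each.
image_job = estimate_cost_cents("image", 10_000, 5)
print(f"${image_job / 100:,.2f}")
```

Working in integer cents avoids floating-point rounding until the final display step.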

Since most of Scale's customers are enterprises, most of its revenue is actually project-based, with contract values per customer ranging from hundreds of thousands to tens of millions of dollars. Scale's 2022 revenue is expected to be $290 million, with a gross margin of about 70%. In April 2021, the company completed a $325M Series E round at a valuation of $7.3B. Investors include Dragoneer, Greenoaks, and Tiger Global, among others.
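The figures above imply a gross profit and a valuation-to-revenue multiple, worked out in the short calculation below. It uses only the numbers quoted in the text. The valuation dates from April 2021 while the revenue is the 2022 estimate, so the multiple is only a rough indication.

```python
# Figures quoted in the text, in millions of US dollars.
revenue_2022_musd = 290   # expected 2022 revenue
gross_margin_pct = 70     # "about 70%"
valuation_musd = 7300     # Series E valuation, April 2021

# Integer inputs keep the percentage arithmetic exact.
gross_profit_musd = revenue_2022_musd * gross_margin_pct / 100
valuation_to_revenue = valuation_musd / revenue_2022_musd

print(f"Implied gross profit: ${gross_profit_musd:.0f}M")
print(f"Valuation / revenue: {valuation_to_revenue:.1f}x")
```

On these inputs the implied gross profit is about $203 million and the valuation is roughly 25 times expected 2022 revenue.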

03. Team

Scale AI came out of Y Combinator in 2016. Its founders are Alexandr Wang and Lucy Guo (Lucy left Scale AI in 2018 and retained 6% of the shares). Both founders have strong technical backgrounds.

Alexandr Wang was born in 1997. He joined Quora in 2014, where he met Lucy Guo. He received offers from many Silicon Valley technology companies while still in high school and later studied machine learning at MIT, where all of his electives were graduate-level computer science courses. A year later he dropped out of MIT. In 2016, Alexandr Wang and Lucy Guo founded Scale during YC.

Alexandr Wang won a Bronze Medal in the USA Mathematical Talent Search (USAMTS) in 2011 and a Gold Medal in 2012; he placed in the top 30 nationally in the USA Mathematical Olympiad in 2013 and third in the Who Wants to Be a Mathematician competition; he reached the semi-finals of the US Physics Olympiad (USAPhO) in 2014; and in 2018 he was named to the "30 under 30" list.

Alexandr Wang's resume is very impressive, but opinions of him are mixed. He is very smart, confident, and capable, and he is good at managing external relationships, spending a lot of time building ties with key people in Silicon Valley. He is also very good at branding and marketing and has shaped a strong personal and corporate image. Some believe that what sets Scale apart from its competitors is mainly Alex's publicity and hype, which have brought the company a large number of orders. But perhaps because he is young, Alex has relatively little management experience, and the company's internal management is poor. Many talented people have left or are unwilling to join Scale, and there are various conflicts within the company. We heard very negative comments in interviews with several departed executives, but in many employee interviews we also sensed genuine appreciation for Alex.

As for the team as a whole, Scale's execution is very strong, and its work pace and corporate culture are very aggressive. It prefers to recruit fresh graduates from top universities: they are smart, hardworking, strong executors, and willing to work overtime. Scale's grind culture is well known in Silicon Valley.

04. Competition

Scale's competitors fall into three types: customers' own in-house data labeling teams; data labeling services from technology giants such as Google, Microsoft, and Amazon; and data labeling startups.

Type one:

In-house data labeling teams

Because some data is sensitive, some companies build their own data labeling teams as a supplement to outsourced solutions such as Scale. For example, Airbnb uses internal data labeling products to label private data for use in its internal machine learning models, but it usually outsources non-sensitive data to third-party suppliers for labeling.

There are three reasons:

•  Data annotation by third-party suppliers can be cheaper than Airbnb's internal self-built team;

•  Third-party suppliers are flexible and can scale up or down according to Airbnb's needs;

•  Data labeling is not the key business of Airbnb, and tools from third-party suppliers can complete labeling more accurately and efficiently.

Type two:

Tech giants such as Google, Microsoft and Amazon

For Scale, these tech giants are both customers and competitors. Google, Amazon, and Microsoft have an advantage over any other provider because of their economies of scale and broad product portfolios. For example, Scale processes and labels data on AWS. If customers want the data labeled by Scale to be stored in S3, they need to grant Scale access permission, and Scale then writes the labeled data into the customer's S3 storage. This series of steps incurs additional cost. If, however, the customer's data is already stored on the cloud platforms of Google, Amazon, or Microsoft and the customer uses those platforms' own data labeling products and services, steps such as granting access and moving data are unnecessary.

In addition, major technology companies such as Microsoft, Amazon, and Google want customers to solve all their problems and buy all their products and services on one platform, so they discount individual products within a bundle or even offer tools for free, which puts competitive pressure on Scale. However, most of these companies provide only software and tools, not human labeling services, so customers have to do the manual work themselves. Scale provides manually labeled data and other human services, which gives it a distinct advantage when competing with the major technology companies.

Type three:

Data annotation startups

Such as Dataloop, SuperAnnotate, Labelbox, Snorkel, V7, Appen, etc.

Snorkel

Snorkel provides a large number of templates for creating annotation tasks and also offers hosting services. It integrates well with TensorFlow, Kubernetes, and DAS. Both Snorkel and Scale are relatively large suppliers in data labeling. Some experts believe that Snorkel and Scale will not end up in the same market segment in the future, but that both will grow well. Compared with Scale, Snorkel's advantage is that it is more focused on text and NLP and costs less, so users who only process text data generally choose Snorkel over Scale. Its disadvantage is that its ability to handle video, images, maps, and similar data is very limited.

SuperAnnotate

SuperAnnotate is one of the important suppliers in the data annotation industry. It is feature-rich, allowing users to extract labels in different formats with tools such as Python, run bulk searches over images using SQL, and integrate SQL with databases. SuperAnnotate's advantages over Scale lie in the medical industry and in workflows. On the medical side, SuperAnnotate is HIPAA compliant, while Scale is not. SuperAnnotate is also better at creating workflows, such as providing instructions; Scale is catching up here but has not yet reached SuperAnnotate's level. Overall, though, SuperAnnotate's disadvantage is that its annotation quality is not as good as Scale's.

Labelbox

Labelbox's business model is slightly different from Scale's. Labelbox provides a platform: users can label their own data or use other services, but they need to adopt the Labelbox platform as their internal data labeling tool. Labelbox has passed a US Department of Defense security review and has partnered with various organizations. For example, Labelbox is a GCP partner and helps promote Google Cloud.

05. Current conclusions and judgments

Why we are optimistic

1. Demand for data labeling outsourcing is certain

The demand for outsourced data labeling is clear, which gives startups plenty of room to operate.

On the one hand, from the customers' perspective, data labeling is grunt work for employees of AI companies: it takes up a lot of their time and distracts them from core work such as algorithms. Subjectively, they are unwilling to spend their time on labeling.

On the other hand, from an ROI perspective, most data labeling work does not demand much of the labelers. Work that American workers can do can also be done by Kenyan workers, with little difference in quality. So if the data is not particularly private and the task does not require additional capabilities, such as the semantic understanding needed in RLHF scenarios, having the labeling done by a third-party workforce in low-cost countries and regions yields a higher ROI. The demand for outsourced data labeling is therefore very clear, and startups have long-term opportunities.

2. The top player in the data labeling space, with strong leader and brand effects

Scale is the clear leader in data labeling. If we assume that manual labeling and "automated + manual" labeling will persist over the next 5-10 years, then Scale should be able to keep its current lead. Judging by the most concrete evidence, customers and orders, most enterprise customers in the United States recognize only Scale as their third-party data labeling provider, which gives it a solid customer base.

When Scale's sales team pitches enterprise customers, the only competitor it runs into is the in-house team at a large tech company, and almost never another startup. Other startups show up only in the SMB market or in sales to less prominent customers. The leader effect and brand effect are very clear. One more point on the brand effect: a customer put it this way: "Scale and other data labeling companies are like the relationship between iPhone and Android." Scale's brand effect is also inseparable from the strong PR and marketing capabilities of Alex and his team.

3. Economies of scale have emerged

Data labeling has economies of scale. Customers care about "quality" and "efficiency" in data labeling. Since data labeling is not high-tech work, experience plays a key role in improving both. Experience here includes the workers' experience in labeling data as well as Scale's experience in managing the entire process and system. To a large extent, experience grows with scale and volume: the larger the scale and the more data labeled, the richer and more mature the experience, and the higher the quality and efficiency of the labeled data.

As the leading player working with enterprise customers, Scale has far larger order and data volumes than its competitors. In addition, Scale can enter emerging markets quickly just as each new trend emerges. Because it gains "experience" earlier, later competitors find it hard to catch up.

Scale has also distilled its manual labeling experience into automated solutions. In the early stage of an industry it labels manually; once the industry matures, it can train automatic labeling models adapted to data in that specific field, producing an "automation + manual" solution that greatly improves efficiency. Sufficiently large order and data volumes also let it optimize the labeling models faster and more efficiently. Scale's economies of scale are therefore very clear.

4. The comprehensive strength and execution ability of the founder and the team

As described in the team section, Alex is a very smart, aggressive, and competitive young man. He has not only strong technical talent but also strong business abilities, including operations, branding, marketing, sales, and social skills, which add up to strong all-round capability. The Scale team's overall ability is also very good, especially in operations and in managing the entire data labeling process and system. Scale's process and management system, and the effectiveness and efficiency of its management, are significantly better than those of its competitors. This covers how to manage data workers, how to assign them work, how to reward or penalize them, how to check quality, how to hand data over to customers, how to serve customers, and how to relabel data based on customer feedback to improve quality. The whole chain is very complicated. Scale's high-achieving staff handle the whole process well, and every step is efficient, smooth, and accurate. Alex is also hands-on, or personally supervises, on many matters. The overall quality and execution ability of the team are very strong.

Why we are not optimistic

1. Enterprise management risk

The founder and the team are both a bright spot and a risk. As mentioned in the team section, when we ran reference checks on the founder and the team, we found that opinions of Alex were extreme and polarized. Those who admire Alex think he is an all-round prodigy, while those who do not think he has serious problems in managing the company. This may be the first project in our research over the past two years with such polarized reference results.

In terms of management and culture, Scale gives young people ample opportunities and room to develop, fast and clear paths to promotion, and sufficient incentives. At the same time, it has many problems in handling relationships with longer-serving employees, which has led to major conflicts. In addition, Scale's high-intensity work and aggressive management style have caused a serious brain drain and discouraged many talented people from joining the company. We believe that corporate management and corporate culture are Scale's biggest risks.

2. Demand and growth risks

The demand for data annotation is strongly affected by industry cycles. When each wave of AI breaks out, growth is extremely steep, but once the industry stabilizes or matures, the growth curve flattens until the next AI trend arrives. Demand and growth therefore fluctuate greatly with these major trends. In addition, Scale's business is mostly project-based, and the number, duration, stability, and order value of projects are highly uncertain and hard to predict. The data labeling business is itself labor-intensive and needs many people to complete the work. It resembles a "construction crew" business: it is hard to improve per-person efficiency in the short term, and hard to achieve compounding growth.

On the other hand, the Scale team has been working hard to find a second growth curve, venturing into MLOps, LLM tools, Generative AI, and more, but the results so far are unsatisfactory, and a stable second growth curve has not yet been found. If the company relies on data labeling over the long term, its ceiling will be limited. Without room for imagination and stable growth, the company risks being undervalued when it reaches the public market.

3. Supply-side risks

The countries and regions where Scale built its supply side have seen labor costs rise rapidly in recent years. The most typical cases are the Philippines and other parts of Southeast Asia. After labor prices rose in the Philippines, Scale largely stopped recruiting there. As supply-side costs rise, Scale's gross margin is squeezed. Whether gross margin rises steadily is one of the most important criteria for investors after a company lists, and a declining gross margin is a very unfavorable signal. In addition, the standardization and stability of the supply-side recruitment process are also concerns for us.

Finally, we would like to add one point. We believe that Scale is an excellent vantage point for observing opportunities in the AI industry. Once a new trend appears in the industry, Scale quickly picks up the signal and launches corresponding data annotation products, which are publicly visible. Scale's product innovations deserve continued attention.

Origin blog.csdn.net/weixin_48827824/article/details/130058847